Token pricing

The pricing mechanism used by large-language-models (LLMs) to charge for inference, typically calculated based on the volume of text processed.

Core Components

  • Input Tokens: The cost applied to the prompt and any provided context sent to the model.
  • Output Tokens: The cost applied to the text generated by the model (usually priced higher than input tokens).
  • Model Tiering: Cost variance based on model complexity, ranging from high-reasoning models (e.g., gemini-3-pro) to efficiency-optimized models.

Current Market Benchmarks

Source Notes