# Understanding LLM API pricing
Large Language Model APIs charge based on token usage, not time or requests. Every API call has two costs: the input cost (processing your prompt) and the output cost (generating the response). Understanding this split is key to estimating your monthly bill accurately.

# Input tokens vs output tokens
## Input tokens
Input tokens represent everything sent to the model: your system prompt, conversation history, and user message. They are cheaper because the model processes them in parallel. A typical system prompt of 200 words costs roughly 267 input tokens.
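The 200-words-to-267-tokens figure reflects a common rule of thumb for English text: tokens ≈ words × 4/3. A minimal sketch of that heuristic; it is an estimate only, not the exact output of any provider's tokenizer:

```python
def estimate_tokens(word_count: int) -> int:
    """Rough heuristic: English text averages ~4/3 tokens per word."""
    return round(word_count * 4 / 3)

print(estimate_tokens(200))  # roughly 267 tokens
```

For precise counts, use the provider's own tokenizer; this shortcut is only for back-of-the-envelope budgeting.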
## Output tokens
Output tokens are generated one by one in sequence, making them more computationally expensive. Most providers charge 3–5× more for output tokens. A 300-word response generates roughly 400 output tokens. Keeping responses concise is one of the most effective cost-saving strategies.
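Because input and output are billed at different rates, a call's cost is the sum of two line items. A sketch using hypothetical per-million-token prices (the $0.50/$2.00 figures below are illustrative assumptions with a 4× output multiplier, not any provider's list prices):

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars for one API call, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical pricing: $0.50/M input, $2.00/M output.
# 267 input tokens (the 200-word prompt) + 400 output tokens (300-word reply).
cost = call_cost(input_tokens=267, output_tokens=400,
                 input_price_per_m=0.50, output_price_per_m=2.00)
print(f"${cost:.6f} per call")
```

Note that even though the call has fewer output tokens than input tokens, the output side dominates the cost, which is why trimming response length pays off.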
# Choosing the right model for your budget
Start with a small, inexpensive model such as GPT-4o mini or Gemini 1.5 Flash and only upgrade if quality falls short. The cost difference between a small and large model can be 10–100×. Reserve premium models such as claude-3-opus or o1 for complex reasoning tasks where quality directly affects outcomes.
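To see how model choice dominates the monthly bill, compare a steady workload at two price points. All prices and volumes below are illustrative assumptions chosen to show a 100× gap, not current list prices:

```python
def monthly_cost(calls_per_day: int, in_tok: int, out_tok: int,
                 in_price_m: float, out_price_m: float, days: int = 30) -> float:
    """Monthly spend in dollars for a steady workload, given per-million-token prices."""
    per_call = (in_tok * in_price_m + out_tok * out_price_m) / 1_000_000
    return calls_per_day * days * per_call

# Hypothetical workload: 10,000 calls/day, 500 input + 300 output tokens each.
workload = dict(calls_per_day=10_000, in_tok=500, out_tok=300)
small = monthly_cost(**workload, in_price_m=0.15, out_price_m=0.60)    # small model
large = monthly_cost(**workload, in_price_m=15.00, out_price_m=60.00)  # large model
print(f"small: ${small:,.2f}/mo  large: ${large:,.2f}/mo  ({large / small:.0f}x)")
```

Running the numbers for your own workload before committing to a model is usually a few minutes of arithmetic that can save an order of magnitude in spend.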