Token Cost and Pricing

Tokens are the billing unit

API pricing is almost always per token, with input and output tokens counted separately. Providers rarely bill per word or per character, so understanding tokens is the key to understanding your bill.

Input versus output

Input tokens cover your prompt, system message, and any context you send.
Output tokens cover what the model generates.
Output is usually priced higher per token than input.
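
As a rough illustration of how the two counts combine into one bill, the sketch below prices a single request. The function name and the per-million-token rates are hypothetical and chosen only for the arithmetic; real prices come from your provider's price list.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_mtok, output_price_per_mtok):
    """Return the cost of one request, in the currency the prices use.

    Prices are assumed to be quoted per million tokens (an assumption;
    check the unit your provider uses).
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = output_tokens / 1_000_000 * output_price_per_mtok
    return input_cost + output_cost


# Hypothetical rates: 3.00 per million input tokens, 15.00 per million output tokens.
cost = request_cost(2_000, 500, 3.00, 15.00)
print(f"{cost:.4f}")  # 0.0135
```

With these made-up rates, the 500 output tokens cost more than the 2,000 input tokens, which is why output length is worth watching.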

Rules of thumb

For English a token is often around three to four characters, which works out to roughly three quarters of a word. Other languages and code often need more tokens for the same amount of text, so always measure rather than assume.
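
The rule of thumb can be turned into a quick estimator. This is a minimal sketch for English prose only; the function name and the default of four characters per token are assumptions for illustration, and the result is a ballpark figure, not a real token count.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Estimate the token count from the character count.

    A rough heuristic for English text; use the provider's tokenizer
    when the number actually matters.
    """
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


sample = "a" * 400                   # stand-in for 400 characters of English text
print(estimate_tokens(sample))       # 100 at four characters per token
print(estimate_tokens(sample, 3.0))  # 133 at three characters per token
```

The gap between the two outputs shows how far the estimate moves within the rule's own range.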

Controlling spend

Trim redundant context and long system prompts.
Cap output length when you can.
Cache reusable prefixes if the provider supports it.
Measure with the real tokenizer, since estimates drift across models.
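
One way to combine trimming with measuring is to keep context under a fixed token budget. The sketch below is one possible policy, not a standard one: it keeps the most recent chunks that fit and drops the older ones. The names trim_to_budget and rough_count are hypothetical, and rough_count is only a placeholder for the real tokenizer of the model you call.

```python
from typing import Callable, List


def trim_to_budget(chunks: List[str], budget: int,
                   count_tokens: Callable[[str], int]) -> List[str]:
    """Keep the most recent chunks whose total token count fits the budget."""
    kept = []
    used = 0
    # Walk from newest to oldest and stop at the first chunk that does not fit.
    for chunk in reversed(chunks):
        n = count_tokens(chunk)
        if used + n > budget:
            break
        kept.append(chunk)
        used += n
    kept.reverse()  # restore the original order
    return kept


def rough_count(text: str) -> int:
    # Placeholder counter: about four characters per token.
    # Swap in the real tokenizer for accurate budgets.
    return max(1, len(text) // 4)


history = ["a" * 40, "b" * 40, "c" * 40]          # about 10 tokens each
trimmed = trim_to_budget(history, 20, rough_count)
print(len(trimmed))  # 2: only the two newest chunks fit
```

Passing the counter in as a parameter keeps the trimming logic unchanged when you switch models or tokenizers.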

Key idea

You are billed per token with input and output counted separately, so measuring and trimming tokens directly controls cost.
