Token
Definition
A token is the basic unit of text an LLM reads and produces. In English it averages roughly 4 characters or 0.75 words; the ratio shifts for code and for languages that split into many subword pieces. APIs bill by token.
What it means
Tokenization is the first step of every LLM request: the text is split into subword pieces using byte-pair encoding (BPE) or a similar algorithm. 'Hello world' is 2 tokens; 'antidisestablishmentarianism' is 4-5. Different model families use different tokenizers, so GPT, Claude, Gemini, and Llama counts can differ by 10-30% on the same text. OpenAI's tiktoken library is the most widely used reference implementation.
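To check counts for a specific tokenizer, tiktoken can be used directly. A minimal sketch, assuming the cl100k_base encoding (one of several encodings tiktoken ships; other encodings and other model families' tokenizers will give different counts):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is one encoding tiktoken ships; counts from other
# encodings will differ for the same text.
enc = tiktoken.get_encoding("cl100k_base")

for text in ["Hello world", "antidisestablishmentarianism"]:
    token_ids = enc.encode(text)  # returns a list of integer token IDs
    print(f"{text!r}: {len(token_ids)} tokens {token_ids}")
```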
Why it matters
Token count drives cost, context window utilization, and (sometimes) speed. Reducing tokens by 30% via prompt compression or caching saves real money on large-scale workloads. Most users undercount their token usage because they ignore tool definitions, system prompts, and reasoning traces.
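To make the undercounting concrete, here is a back-of-envelope tally of one request. Every number and field name below is a made-up illustration, not data from any provider; real counts come from the usage metadata in API responses:

```python
# Hypothetical per-request accounting; the figures are illustrative only.
request_tokens = {
    "system_prompt": 850,       # often forgotten in back-of-envelope math
    "tool_definitions": 1200,   # JSON schemas for tools count as input
    "conversation_history": 3400,
    "user_message": 120,
}
output_tokens = {
    "reasoning_trace": 900,     # billed as output on reasoning models
    "visible_answer": 400,
}

input_total = sum(request_tokens.values())    # 5,570
output_total = sum(output_tokens.values())    # 1,300
print(f"input: {input_total} tokens, output: {output_total} tokens")
# Counting only user_message + visible_answer (520 of 6,870 tokens)
# would miss over 90% of the actual billed usage.
```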
Frequently asked questions
How many tokens in 1000 words?
About 1,300-1,400 tokens for English (1,000 words ÷ 0.75 words per token ≈ 1,333). The exact count varies by tokenizer.
Are output tokens billed differently?
Yes — output is typically 4-5x more expensive than input across major providers.
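A quick sketch of what that asymmetry does to a bill. Both prices below are assumed placeholders chosen only to match the 5x end of the ratio, not any provider's actual rates:

```python
# Hypothetical prices in dollars per million tokens; check your
# provider's pricing page for real numbers.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00  # 5x input, the high end of the typical ratio

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under the assumed prices."""
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000

# A request that is mostly input still pays heavily for its output:
print(f"${request_cost(5_000, 1_500):.4f}")  # $0.0150 in + $0.0225 out = $0.0375
```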
Related terms
- Context window: The context window is the maximum amount of text (in tokens) an AI model can process in a single request, combining your system prompt, conversation history, and output. Past the limit, the model can't 'see' earlier content.
- Prompt caching: Prompt caching is a feature where the AI provider stores frequently reused prompt prefixes (system messages, RAG context, few-shot examples) and bills cached reads at ~10% of normal input cost.