Free Tool Arena

Glossary · Definition

Context window

The context window is the maximum amount of text (in tokens) an AI model can process in a single request — combining your system prompt, conversation history, and output. Past the limit, the model can't 'see' earlier content.

Updated May 2026 · 4 min read

What it means

Context windows are measured in tokens (roughly 4 characters or 0.75 words each). Claude Sonnet 4.6 and Opus 4.7 offer 1M tokens; Gemini 2.5/3 Pro offer 2M; GPT-5 offers 400k; DeepSeek V3.2 offers 128k. The window counts EVERY token: system prompt + chat history + user message + tool definitions + the model's response. Quality also degrades near the maximum, so most pros run at 50-70% of rated context for production reliability.
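The budgeting above can be sketched in a few lines. This is only a rough estimate using the ~4-characters-per-token rule of thumb; real providers count with their own tokenizers, and all function names here are illustrative:

```python
def estimate_tokens(text: str) -> int:
    """Heuristic: ~4 characters per token for typical English text."""
    return max(1, len(text) // 4)

def fits_in_window(system_prompt: str, history: list[str], user_message: str,
                   max_output_tokens: int, window: int) -> bool:
    """Every component counts against the window, including the reply."""
    used = (estimate_tokens(system_prompt)
            + sum(estimate_tokens(m) for m in history)
            + estimate_tokens(user_message)
            + max_output_tokens)  # reserve room for the model's response
    return used <= window

# A short chat easily fits a 128k window once the reply is reserved.
print(fits_in_window("You are helpful.", ["Hi!", "Hello!"], "Summarize this.",
                     max_output_tokens=1024, window=128_000))
```

For exact counts, use the provider's tokenizer rather than a character heuristic.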

Why it matters

Picking a model with too small a context window forces you to chunk documents, lose RAG context, or break agent loops. Conversely, paying for a 2M context model when you use 50k is wasted spend. Right-sizing the window to your actual workload is one of the bigger AI-cost levers.
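The right-sizing logic can be sketched as follows: measure your peak token load, then pick the smallest window that still leaves headroom (using the 50-70% guidance above). The window sizes mirror the figures on this page and should be treated as a snapshot, not ground truth:

```python
WINDOWS = {                       # model -> rated context window (tokens)
    "DeepSeek V3.2": 128_000,
    "GPT-5": 400_000,
    "Claude Sonnet 4.6": 1_000_000,
    "Gemini 2.5 Pro": 2_000_000,
}

def right_size(peak_tokens: int, headroom: float = 0.6):
    """Smallest model whose usable window (rated * headroom) covers the load."""
    for model, window in sorted(WINDOWS.items(), key=lambda kv: kv[1]):
        if peak_tokens <= window * headroom:
            return model
    return None  # workload needs chunking or a larger-context model

# A 50k-token workload comfortably fits the smallest window here.
print(right_size(50_000))
```

The point is the comparison, not the catalog: plug in your own measured usage and current model lineup.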

Frequently asked questions

How big is 1M tokens?

About 750,000 words — roughly 7-8 average books, or a full medium-sized codebase.

What happens when I exceed the window?

Depending on the provider, the request is rejected with an error, or the oldest content is silently truncated (common in chat UIs that manage history for you). Either way, content past the limit is invisible to the model.
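The oldest-first truncation described above can be sketched as a sliding window over the message history. This assumes the same character-count heuristic as before; real clients would count with the provider's tokenizer:

```python
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)   # ~4 characters per token

def truncate_history(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages whose combined estimate fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break                           # oldest content falls out of view
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = ["old " * 100, "recent question?", "recent answer."]
print(truncate_history(history, budget=20))  # the long old message is dropped
```

Note the trade-off: truncation keeps requests valid but loses early context, which is why summarizing old turns is a common alternative.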
