Skip to content
Free Tool Arena

Glossary · Definition

Context window

The context window is the maximum amount of text (in tokens) an AI model can process in a single request — combining your system prompt, conversation history, and output. Past the limit, the model can't 'see' earlier content.

Updated June 2026 · 4 min read
100% in-browserNo downloadsNo sign-upMalware-freeHow we keep this safe →

Definition

The context window is the maximum amount of text (in tokens) an AI model can process in a single request — combining your system prompt, conversation history, and output. Past the limit, the model can't 'see' earlier content.

What it means

Context windows are measured in tokens (~4 characters or ~0.75 words each). Claude Sonnet 4.6 and Opus 4.7 have 1M tokens; Gemini 2.5/3 Pro have 2M; GPT-5 has 400k; DeepSeek V3.2 has 128k. The window includes EVERY token: system prompt + chat history + user message + tool definitions + the model's response. Models also degrade in quality near the max — most pros operate at 50-70% of rated context for production reliability.

Advertisement

Why it matters

Picking a model with too small a context window forces you to chunk documents, lose RAG context, or break agent loops. Conversely, paying for a 2M context model when you use 50k is wasted spend. Right-sizing the window to your actual workload is one of the bigger AI-cost levers.

Related free tools

Frequently asked questions

How big is 1M tokens?

About 750,000 words — roughly 7-8 average books, or a full medium-sized codebase.

What happens when I exceed the window?

The provider truncates the oldest content (most APIs) or refuses the request. Either way, content past the limit is invisible to the model.

Related terms

Found this useful?

The tools stay free thanks to readers who chip in or spread the word.

Buy Me a Coffee