Context engineering
Definition
Context engineering is the practice of designing everything an AI model sees on a single request — the system prompt, retrieved documents (RAG), tool definitions, chat history, and the user message. It is the 2026 evolution beyond 'prompt engineering', which focused on the user message alone.
What it means
The term emerged in 2024-2025 as agent and RAG systems matured. Its core concerns: how much context to pass, how to order it for caching, when to compress versus prune, what to fetch via RAG versus include statically, how tool definitions consume tokens, and how chat history accumulates. Modern AI products live or die on context engineering — same model, different context, dramatically different output quality.
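The pieces listed above come together at request-assembly time. A minimal sketch of that assembly step, with illustrative function and field names (not any specific provider's API):

```python
def build_context(system_prompt, tool_defs, history, retrieved_docs, user_message):
    """Assemble everything the model sees on one request."""
    messages = [{"role": "system", "content": system_prompt}]
    # Retrieved documents (RAG) are injected as reference context,
    # kept separate from the user's own words.
    for doc in retrieved_docs:
        messages.append({"role": "system", "content": f"Reference:\n{doc}"})
    messages.extend(history)  # accumulated chat turns
    messages.append({"role": "user", "content": user_message})
    return {"messages": messages, "tools": tool_defs}
```

Every argument here is a context-engineering decision: which docs to retrieve, which tools to expose, and how much history to carry forward.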
Why it matters
By 2026, prompt engineering as a job title is fading because the prompt is just one input among many. Context engineering — managing the full picture an AI sees on a request — is the more durable skill. Most production AI failures trace back to context errors: stale RAG indexes, irrelevant retrieved documents, bloated history, and badly defined tools.
Frequently asked questions
Best practices?
Put stable parts (system prompt, examples) at the start so they can be cached, and dynamic per-request content at the end. Filter RAG results aggressively for relevance, don't pass tool definitions you won't use, and compress or prune chat history once it grows past ~30k tokens.
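The "prune past ~30k tokens" practice can be sketched as a simple budget check that drops the oldest turns first. The token estimate below (roughly 4 characters per token) and the function name are illustrative assumptions:

```python
def prune_history(history, max_tokens=30_000):
    """Drop the oldest turns until the history fits a token budget."""
    def est_tokens(msg):
        # Rough heuristic: ~4 characters per token for English text.
        return len(msg["content"]) // 4

    kept, total = [], 0
    for msg in reversed(history):  # walk newest-to-oldest, keep recent turns
        cost = est_tokens(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

A real system would use the provider's tokenizer rather than a character heuristic, and might summarize dropped turns instead of discarding them outright.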
Tools to help?
LangSmith (visualizes what your agent sees), Helicone, and Phoenix (Arize). All log the full request context for debugging.
Related terms
- Context window — The context window is the maximum amount of text (in tokens) an AI model can process in a single request — combining your system prompt, conversation history, and output. Past the limit, the model can't 'see' earlier content.
- Prompt caching — Prompt caching is a feature where the AI provider stores frequently reused prompt prefixes (system messages, RAG context, few-shot examples) and bills cached reads at ~10% of normal input cost.
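The ~10% cached-read figure mentioned above translates directly into a cost estimate. A sketch with an illustrative price (the $3 per million input tokens and the function name are assumptions, not any provider's actual rate card):

```python
def request_input_cost(cached_tokens, fresh_tokens,
                       price_per_mtok=3.00, cached_discount=0.10):
    """Estimate input cost when a stable prefix is served from cache
    at ~10% of the normal per-token price."""
    cached_cost = cached_tokens / 1e6 * price_per_mtok * cached_discount
    fresh_cost = fresh_tokens / 1e6 * price_per_mtok
    return cached_cost + fresh_cost
```

This is why the caching-oriented ordering above matters: the larger the stable prefix, the more of each request is billed at the discounted rate.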