Hallucination (AI)
Definition
An AI hallucination occurs when an LLM generates confident-sounding but factually wrong content: invented citations, fake quotes, made-up APIs. The model doesn't 'know' it's wrong.
What it means
Hallucinations stem from how LLMs work: they predict the next token from statistical patterns, not from facts. When a fact was weakly represented in training, the model fills the gap with whatever continuation looks most plausible. Common types: invented citations (the source of several legal-AI horror stories), wrong dates and numbers, fake API endpoints, plausible-looking but wrong code, and made-up historical events. Frontier models hallucinate less than smaller ones, but none are immune.
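To see why statistical next-token prediction produces confident nonsense, here is a toy sketch. It is a bigram counter, nothing like a real transformer, but the failure mode is the same: the model answers with whatever continuation was most frequent in its training text, whether or not that is true for the case at hand.

```python
from collections import Counter, defaultdict

# Toy "language model": picks the statistically most common next word.
# It has no notion of truth, only of frequency in its training text.
corpus = (
    "the api returns json the api returns xml the api returns json "
    "the server returns json"
).split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the continuation seen most often in training."""
    return follows[word].most_common(1)[0][0]

# "returns" was followed by "json" 3 times and "xml" once, so the model
# confidently says "json" even when the API you asked about returns XML.
print(predict_next("returns"))  # -> json
```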
Why it matters
Production AI systems must mitigate hallucinations or they fail at scale. Common fixes: ground outputs in retrieved sources (RAG), use structured outputs (JSON mode), require citations and verify them, prefer extraction over generation when possible, and build in an 'I don't know' fallback. Never trust uncited dates, numbers, or quotes from any LLM.
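A minimal sketch of the 'require citations and verify them' pattern, assuming a hypothetical `ask_llm` helper in place of a real provider client: demand verbatim quotes in structured output, then reject any answer whose quotes don't actually appear in the sources.

```python
import json

def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in your provider's client here."""
    raise NotImplementedError

def answer_with_verified_citations(question: str, sources: list[str]) -> str:
    # Structured output: the model must copy each supporting quote verbatim.
    prompt = (
        "Answer using ONLY the sources below. Reply as JSON: "
        '{"answer": "...", "quotes": ["..."]}. Each quote must be copied '
        "verbatim from a source. If the sources are insufficient, answer "
        "\"I don't know\" with an empty quotes list.\n\n"
        + "\n---\n".join(sources)
        + "\n\nQuestion: " + question
    )
    reply = json.loads(ask_llm(prompt))
    # Verification: every quote must literally appear in some source.
    for quote in reply["quotes"]:
        if not any(quote in src for src in sources):
            return "I don't know"  # unverifiable claim -> refuse
    return reply["answer"]
```

The verbatim-substring check is deliberately strict: a paraphrased 'quote' fails it, which is exactly the case you want to catch.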
Frequently asked questions
Which models hallucinate least?
As of 2026, Claude Opus 4.7 and Sonnet 4.6 lead on faithfulness benchmarks; GPT-5 is competitive, and Gemini 2.5/3 Pro is solid. All hallucinate to some degree.
Does RAG fix it?
It dramatically reduces hallucinations on factual questions but doesn't eliminate them. The model can still misinterpret retrieved context or fall back to training data when retrieval fails.
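One common guard for the retrieval-failure case is a similarity threshold: if nothing retrieved scores high enough, refuse rather than let the model improvise from training data. A sketch, with `retrieve` and `ask_llm` as hypothetical stand-ins for a real retriever and LLM client:

```python
def retrieve(query: str) -> list[tuple[str, float]]:
    """Hypothetical retriever; returns (document, similarity) pairs."""
    raise NotImplementedError

def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call, as in the sketch above."""
    raise NotImplementedError

def grounded_answer(question: str, min_score: float = 0.75) -> str:
    hits = [doc for doc, score in retrieve(question) if score >= min_score]
    if not hits:
        # Retrieval failed: refusing beats an ungrounded guess.
        return "I don't know"
    context = "\n---\n".join(hits)
    prompt = (
        "Answer strictly from the context below. If the context does not "
        "contain the answer, say \"I don't know\".\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)
```

Note that even with this guard, the model can still misread a genuinely relevant document, so citation verification remains useful on top.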
Related terms
- RAG (Retrieval-Augmented Generation): RAG augments an LLM with documents retrieved at query time, typically from a vector database. The LLM grounds its answer in the retrieved text instead of relying purely on training data.
- AI agent: An AI agent is an LLM running in a loop: think → call a tool → observe the result → think again. The loop continues until the task is done or a stopping condition is hit.