How to Use LlamaIndex
Ingest documents into a VectorStoreIndex, create custom workflows, and parse complex PDFs with LlamaParse. Start building your RAG stack online for free.
LlamaIndex is the data framework for LLMs, purpose-built for ingesting documents and powering RAG over your private knowledge.
Where LangChain tries to be everything, LlamaIndex stays narrower and deeper: loaders for 150+ data sources, chunking and metadata extraction pipelines, a VectorStoreIndex abstraction over the major vector databases (Qdrant, Pinecone, pgvector, and others), and query engines that combine retrieval with re-ranking and response synthesis. A newer Workflows API adds event-driven orchestration for when you outgrow simple query pipelines.
What it is
LlamaIndex is MIT-licensed and maintained by LlamaIndex Inc. (Jerry Liu and team). The Python package llama-index-core is the base; integrations live in separate packages like llama-index-vector-stores-qdrant. A TypeScript port (llamaindex on npm) covers the essentials. LlamaParse, a paid managed service, handles complex PDFs and tables the OSS parser struggles with.
Install
pip install llama-index
# or the modular install
pip install llama-index-core llama-index-llms-openai llama-index-embeddings-openai
First run
Index a folder of documents and ask a question about them:
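from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# The defaults call OpenAI for embeddings and answers, so set OPENAI_API_KEY first.
docs = SimpleDirectoryReader("./data").load_data()  # read every file under ./data
index = VectorStoreIndex.from_documents(docs)  # chunk, embed, and index in memory
engine = index.as_query_engine()
print(engine.query("What is our refund policy?"))
Everyday workflows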
- Build an IngestionPipeline with SentenceSplitter + TitleExtractor + embeddings and cache it to disk.
- Swap VectorStoreIndex’s backend for Qdrant, Pinecone, or pgvector with a few lines of config.
- Use Workflows and AgentWorkflow to combine RAG with tool-using agents for multi-step answers.
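The first workflow above can be sketched roughly as follows. This is a minimal sketch, not a recommended configuration: the function name run_ingestion, the ./pipeline_cache directory, and the chunk_size and chunk_overlap values are illustrative assumptions, and it assumes the llama-index-embeddings-openai package is installed with OPENAI_API_KEY set.

```python
from pathlib import Path


def run_ingestion(data_dir: str = "./data", cache_dir: str = "./pipeline_cache"):
    """Split, title, and embed documents, reusing a disk cache when one exists.

    run_ingestion and both default paths are hypothetical names for this sketch.
    """
    # Imported inside the function so this module still loads on machines
    # where the optional llama-index packages are not installed yet.
    from llama_index.core import SimpleDirectoryReader
    from llama_index.core.extractors import TitleExtractor
    from llama_index.core.ingestion import IngestionPipeline
    from llama_index.core.node_parser import SentenceSplitter
    from llama_index.embeddings.openai import OpenAIEmbedding

    pipeline = IngestionPipeline(
        transformations=[
            # Illustrative values only; tune them to your content.
            SentenceSplitter(chunk_size=512, chunk_overlap=64),
            TitleExtractor(),   # calls the configured LLM (OpenAI by default)
            OpenAIEmbedding(),  # requires OPENAI_API_KEY
        ]
    )

    # Restore cached transformation results from an earlier run, if any.
    if Path(cache_dir).exists():
        pipeline.load(cache_dir)

    docs = SimpleDirectoryReader(data_dir).load_data()
    nodes = pipeline.run(documents=docs)

    # Write the cache back so unchanged inputs are not reprocessed next time.
    pipeline.persist(cache_dir)
    return nodes
```

The returned nodes already carry embeddings, so they can be passed straight to VectorStoreIndex(nodes) instead of going through from_documents.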
Gotchas and tips
RAG quality lives and dies by chunking. Default settings are generic; tune chunk_size and chunk_overlap to your content — contracts, forum posts, and code all want different values. Measure recall with a small labeled set before declaring victory.
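One way to measure recall on a small labeled set is sketched below. It is plain Python with no LlamaIndex dependency: recall_at_k, the sample questions, and the file names are all hypothetical, and the retrieve callable stands in for whatever retriever you are evaluating.

```python
from typing import Callable


def recall_at_k(
    labeled: list[tuple[str, set[str]]],
    retrieve: Callable[[str], list[str]],
    k: int = 5,
) -> float:
    """Mean share of each question's relevant IDs found in its top-k results."""
    if not labeled:
        raise ValueError("labeled set is empty")
    total = 0.0
    for question, relevant_ids in labeled:
        if not relevant_ids:
            raise ValueError(f"no relevant IDs labeled for: {question!r}")
        top_k = set(retrieve(question)[:k])
        total += len(relevant_ids & top_k) / len(relevant_ids)
    return total / len(labeled)


# Hypothetical labeled set: each question maps to the files that answer it.
labeled = [
    ("What is our refund policy?", {"refunds.md"}),
    ("How do I reset my password?", {"account.md", "security.md"}),
]

# Stand-in retriever that returns canned rankings, for demonstration only.
fake_results = {
    "What is our refund policy?": ["refunds.md", "shipping.md", "faq.md"],
    "How do I reset my password?": ["billing.md", "faq.md", "account.md"],
}

score = recall_at_k(labeled, lambda q: fake_results[q], k=3)
print(f"recall@3 = {score:.2f}")
```

To evaluate a real index, retrieve could wrap index.as_retriever(similarity_top_k=k).retrieve(question) and map each result to an identifier such as the source file name in its metadata. Rerun the same labeled set after every chunk_size or chunk_overlap change so the numbers stay comparable.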
Persistence is a common foot-gun. Calling from_documents on every run re-embeds the whole corpus and bills you for the same embeddings again. Call index.storage_context.persist() once, then reload with StorageContext.from_defaults and load_index_from_storage, or push to a real vector DB that keeps state for you.
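A minimal sketch of the load-or-build pattern, assuming the default local file storage. The function name load_or_build_index and the ./storage directory are illustrative choices, not part of the library.

```python
from pathlib import Path


def load_or_build_index(data_dir: str = "./data", persist_dir: str = "./storage"):
    """Reload a persisted index if one exists; otherwise build and persist it.

    load_or_build_index and both default paths are hypothetical names.
    """
    # Imported inside the function so this module still loads on machines
    # where llama-index is not installed yet.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    persist_path = Path(persist_dir)
    if persist_path.is_dir() and any(persist_path.iterdir()):
        # Reuse stored embeddings: no re-embedding, no extra API cost.
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)

    # First run: embed everything once, then write the index to disk.
    docs = SimpleDirectoryReader(data_dir).load_data()
    index = VectorStoreIndex.from_documents(docs)
    index.storage_context.persist(persist_dir=persist_dir)
    return index
```

This sketch does not detect changed or newly added files. When the corpus changes, delete the persist directory and rebuild, or move to a vector DB with incremental upserts.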
Who it’s for
Teams whose product is basically “chat with our documents” — legal, support, internal search, research. Tip: reach for LlamaParse the first time a client-provided PDF has merged cells or scanned tables — hand-rolling a parser for those is a month you will not get back.