How to Use Haystack
Installing haystack-ai, Pipelines, DocumentStore, retrievers, LLM generators, and running in production.
Haystack is deepset’s open-source Python framework for building production-grade LLM pipelines — RAG, agents, and search — with a clear component model.
Haystack has been around since before the ChatGPT era, when it focused on neural search. Haystack 2.0 (released 2024) modernised the API around typed components and pipelines, and it’s now one of the most production-focused alternatives to LangChain or LlamaIndex.
What it is
Pipelines are directed graphs of Components (retrievers, generators, rankers, converters) with typed input/output sockets. Document Stores (Elasticsearch, Weaviate, Qdrant, pgvector, OpenSearch, in-memory) hold the indexed content. Haystack ships first-party integrations for every major model provider and vector DB, plus a serverless option via deepset Cloud.
Install / sign up
# Core
pip install haystack-ai

# Integrations are separate packages
pip install qdrant-haystack anthropic-haystack

# Optional managed UI: https://cloud.deepset.ai
First session
A minimal RAG pipeline has three components: an embedding retriever, a prompt builder, and a generator. Wire them together and call run().
$ python
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

store = InMemoryDocumentStore()  # assume documents were indexed here with embeddings
tmpl = "Answer using:\n{% for doc in documents %}{{ doc.content }}\n{% endfor %}"

p = Pipeline()
p.add_component("retriever", InMemoryEmbeddingRetriever(document_store=store))
p.add_component("prompt", PromptBuilder(template=tmpl))
p.add_component("llm", OpenAIGenerator(model="gpt-4o"))
p.connect("retriever", "prompt.documents")
p.connect("prompt", "llm")

# emb is the query embedding, produced by an embedder such as OpenAITextEmbedder
print(p.run({"retriever": {"query_embedding": emb}}))
Everyday workflows
1. Build a document-grounded Q&A service over your company’s wiki and Confluence exports.
2. Add a Ranker component after retrieval to boost precision before hitting the LLM.
3. Deploy pipelines behind Hayhooks (FastAPI wrapper) for a REST endpoint you can scale with Kubernetes.
Gotchas and tips
Haystack’s strength is that pipelines serialise to YAML, which makes diffs and CI review easy. Keep prompts in templates, not hard-coded strings, so you can iterate without redeploying. Evaluation components (AnswerExactMatch, SASEvaluator) slot into the same pipeline graph, so you can test in CI.
For very large corpora, favour Elasticsearch or OpenSearch document stores over in-memory — the InMemoryDocumentStore is great for tutorials but not production. Streaming responses require the streaming_callback parameter on generators; it’s easy to miss and it changes how you consume output.
Who it’s for
Teams shipping RAG or search-centric LLM products who want a typed, observable, deployable framework rather than a notebook-style toolkit.