How to Build a Multi-Agent System with CrewAI
Create a three-agent Python crew (Researcher, Writer, Editor) using CrewAI’s four primitives, plus cost-control tips.
CrewAI is one of the more approachable frameworks for role-based multi-agent systems. Instead of one giant prompt juggling a dozen responsibilities, you describe a small team of agents (Researcher, Writer, Editor), give each one a goal and any tools it needs, and let each agent hand its output to the next.
This guide builds a working three-agent crew in Python, explains the four primitives (Agent, Task, Crew, Tool), and flags the failure modes you’ll hit once you put one on a schedule. Written April 2026.
When to reach for CrewAI
CrewAI is worth it when the task naturally decomposes into specialist roles that talk to each other. Good fits: content pipelines, research reports, customer-onboarding flows, multi-step analyses. Bad fits: a single API-call task (too heavy), tight low-latency work (too much overhead), deeply branching logic (use LangGraph — see our LangGraph guide).
Step 1 — Install
python -m venv .venv && source .venv/bin/activate
pip install crewai crewai-tools
Set OPENAI_API_KEY or ANTHROPIC_API_KEY depending on the model you’ll use. CrewAI is model-agnostic; you configure the model per agent.
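A missing key usually surfaces only once the first agent makes a call, so it can help to check up front. The following is a minimal sketch that assumes only the two environment variable names mentioned above; `configured_providers` is a hypothetical helper, not part of CrewAI.

```python
import os

# Environment variable names taken from the install step above.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key is present and non-empty."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_KEYS.items()
        if env.get(var, "").strip()
    ]
```

Calling `configured_providers()` at the top of crew.py and stopping when it returns an empty list gives a clearer error than a failed model call halfway through a run.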
Step 2 — The four primitives
- Agent — a role with a goal, a backstory (short!), and an optional list of tools.
- Task — a specific unit of work given to an agent, with an expected output format.
- Crew — the orchestrator. Knows the agents, the tasks, and the process (sequential or hierarchical).
- Tool — a function any agent can call. Built-ins for web search, file I/O, etc.
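To see how the primitives relate, here is a toy model in plain Python. It is a mental-model sketch only, not CrewAI's implementation: the `ToyAgent`, `ToyTask`, `build_prompt`, and `run_sequential` names are hypothetical, tools are attached but never invoked, and `call_model` stands in for whatever LLM call the framework makes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ToyAgent:
    role: str
    goal: str
    # Tools are plain callables keyed by name (attached, not invoked here).
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)


@dataclass
class ToyTask:
    description: str
    agent: ToyAgent
    expected_output: str
    # Earlier tasks whose outputs should be visible to this task.
    context: List["ToyTask"] = field(default_factory=list)
    output: Optional[str] = None


def build_prompt(task: ToyTask) -> str:
    """Combine the agent, the task, and any upstream outputs into one prompt."""
    upstream = "\n".join(t.output or "" for t in task.context)
    return (
        f"Role: {task.agent.role}\nGoal: {task.agent.goal}\n"
        f"Task: {task.description}\nExpected output: {task.expected_output}\n"
        f"Context:\n{upstream}"
    )


def run_sequential(tasks: List[ToyTask], call_model: Callable[[str], str]) -> Optional[str]:
    """Toy 'crew': run tasks in order, storing each output for later tasks."""
    for task in tasks:
        task.output = call_model(build_prompt(task))
    return tasks[-1].output
```

The point of the sketch is the data flow: an agent is configuration, a task binds work to an agent, and the crew is the loop that moves outputs from one task into the next.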
Step 3 — A three-agent crew
The canonical example: research, write, edit. Paste into crew.py:
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool

# Web search tool; SerperDevTool reads SERPER_API_KEY from the environment.
search_tool = SerperDevTool()

# Agents: one role each, with a short backstory.
researcher = Agent(
    role="Researcher",
    goal="Find 3 recent, credible sources on the topic.",
    backstory="You value citations and primary sources.",
    tools=[search_tool],
    verbose=True,
)
writer = Agent(
    role="Writer",
    goal="Produce a 400-word briefing from the research.",
    backstory="You write like a tired editor: short, specific, no fluff.",
    verbose=True,
)
editor = Agent(
    role="Editor",
    goal="Cut fluff, check claims, return a tightened version.",
    backstory="You hate adverbs.",
    verbose=True,
)

# Tasks: context=[...] names the earlier task whose output this task builds on.
research_task = Task(
    description="Research: AI agent frameworks used in production in 2026.",
    agent=researcher,
    expected_output="3 source URLs with one-line summaries.",
)
write_task = Task(
    description="Draft a 400-word briefing using the research.",
    agent=writer,
    expected_output="A 400-word markdown briefing.",
    context=[research_task],
)
edit_task = Task(
    description="Edit the draft. Keep it under 400 words, no adverbs.",
    agent=editor,
    expected_output="Final briefing text.",
    context=[write_task],
)

# Crew: run the three tasks in the order listed.
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, write_task, edit_task],
    process=Process.sequential,
)

if __name__ == "__main__":
    result = crew.kickoff()
    print(result)

Step 4 — Run it
python crew.py
You’ll see the agents talk to each other in the console. Each task’s output flows into the next task’s context. Setting context=[previous_task] makes that dependency explicit, so the next agent works from the previous output rather than from its task description alone.
Step 5 — Tighten the prompts
CrewAI’s biggest trap is verbose backstories. Keep them to one or two sentences. A long backstory eats context and distracts the model. If your “Writer” agent is drifting, cut the backstory, not the goal.
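A quick lint can catch backstories that have grown past the one-or-two-sentence guideline. This is a rough sketch: `backstory_warnings` is a hypothetical helper, it takes a plain mapping of role to backstory text rather than CrewAI objects, and the sentence split is deliberately naive.

```python
import re


def backstory_warnings(backstories, max_sentences=2):
    """Return {role: sentence_count} for backstories longer than max_sentences."""
    warnings = {}
    for role, backstory in backstories.items():
        # Naive split: ., ! or ? followed by whitespace or end of text.
        parts = re.split(r"[.!?]+(?:\s+|$)", backstory.strip())
        sentences = [p for p in parts if p]
        if len(sentences) > max_sentences:
            warnings[role] = len(sentences)
    return warnings
```

For example, passing `{"Editor": "You hate adverbs. You hate jargon. You hate filler."}` flags the Editor with a count of 3, while the one-sentence backstories from Step 3 pass cleanly.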
Step 6 — Sequential vs. hierarchical
- Sequential (default) — tasks run in order; each feeds the next. Deterministic, easy to debug.
- Hierarchical — a manager agent delegates tasks dynamically. More flexible, harder to debug. Use it only once sequential stops expressing the workflow.
Step 7 — Cost control
Each agent makes at least one model call per task, and tool calls and retries add more. A 3-agent sequential crew can easily hit 15–20 turns. Before you put one on a schedule:
- Run once with verbose=True and count the turns.
- Estimate per-run cost with our token counter.
- Cap max_rpm on the crew so a runaway loop can’t burn through requests unchecked.
- Set max_iter on each agent to bound how many iterations it can spend on a task.
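Once you have a turn count, the per-run estimate is simple arithmetic. The sketch below assumes every turn uses roughly the same number of tokens, which is a simplification; the function names are hypothetical and the prices in the usage note are placeholders, not real rates for any model.

```python
def estimate_run_cost(turns, input_tokens_per_turn, output_tokens_per_turn,
                      input_price_per_mtok, output_price_per_mtok):
    """Approximate dollar cost of one crew run.

    Prices are dollars per million tokens; look up current rates for your model.
    """
    input_cost = turns * input_tokens_per_turn * input_price_per_mtok / 1_000_000
    output_cost = turns * output_tokens_per_turn * output_price_per_mtok / 1_000_000
    return input_cost + output_cost


def monthly_cost(cost_per_run, runs_per_day, days=30):
    """Scale a per-run cost to a schedule."""
    return cost_per_run * runs_per_day * days
```

With placeholder numbers of 20 turns, 2,000 input and 500 output tokens per turn, and prices of $1 and $4 per million tokens, one run comes to about $0.08. Run hourly, that is roughly $57.60 over 30 days, which is why counting turns before scheduling matters.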
Step 8 — Schedule it
Once the crew runs cleanly on your laptop, wrap it in a FastAPI handler, trigger it from a cron, or drop it into a Modal / Temporal worker. The agents stay the same; only the trigger changes. For long-running or stateful work, compare against LangGraph — it gives you retries and state persistence that CrewAI doesn’t.
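One way to keep the trigger separate from the crew is a thin entry point that any scheduler can call. This is a minimal sketch using only the standard library: `run_job` is a hypothetical wrapper that accepts any zero-argument callable (for example `crew.kickoff` from Step 3) and returns a status record instead of raising.

```python
import logging
import time

logger = logging.getLogger("crew_job")


def run_job(kickoff, job_name="briefing"):
    """Call kickoff() once and report the outcome as a plain dict."""
    started = time.monotonic()
    try:
        output = kickoff()
    except Exception as exc:
        # Log the traceback, then hand the caller a failure record.
        logger.exception("%s failed", job_name)
        return {
            "job": job_name,
            "ok": False,
            "error": str(exc),
            "seconds": time.monotonic() - started,
        }
    return {
        "job": job_name,
        "ok": True,
        "output": str(output),
        "seconds": time.monotonic() - started,
    }
```

A cron script, an HTTP handler, or a worker can all call `run_job(crew.kickoff)` and decide what to do with the record. The wrapper deliberately does not retry, since a retry repeats every model call in the run.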
Common mistakes
- Too many agents. Three or four is the sweet spot. Ten agents = ten prompt drift sources.
- Overlapping roles. If “Researcher” and “Analyst” could be the same person, they should be.
- No expected_output. Without it, agents freestyle the format and the next agent struggles to parse.
- Using CrewAI as a chat wrapper. If the task is one turn, just call the model directly.
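A stated expected_output also gives you something to check. As a rough sketch, the two formats promised in Step 3 (three source URLs, a briefing under 400 words) can be verified with a few lines of plain Python before the text moves on. Both function names are hypothetical and the URL pattern is intentionally loose.

```python
import re

# Loose match: anything starting with http:// or https:// up to whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def has_enough_urls(text, expected_urls=3):
    """True if the research output contains at least the promised URL count."""
    return len(URL_PATTERN.findall(text)) >= expected_urls


def within_word_limit(text, max_words=400):
    """True if the briefing stays within the word limit."""
    return len(text.split()) <= max_words
```

Checks like these will not judge quality, but they catch the format drift that makes a downstream agent struggle.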