Comparisons / CrewAI vs LlamaIndex
CrewAI vs LlamaIndex: Which Agent Framework to Use?
CrewAI crewai organizes work into agents, tasks, and crews. LlamaIndex llamaindex started as a rag framework — connect your data, query it with an llm. Here is how they compare — and what the same patterns look like in plain Python.
By the numbers
CrewAI
48.0k
6.5k
Python
MIT
2023-10-27
João Moura
LlamaIndex
48.3k
7.2k
Python
MIT
2022-11-02
Jerry Liu
GitHub stats as of April 2026. Stars indicate community interest, not necessarily quality or fit for your use case.
| Concept | CrewAI | LlamaIndex | Plain Python |
|---|---|---|---|
| Agent | Agent(role, goal, backstory, tools, llm) | AgentRunner with AgentWorker, or ReActAgent for tool-calling agents | A function with a system prompt and a tools dict |
| Tools | Tool registration with @tool decorator, custom Tool classes | FunctionTool for custom tools, QueryEngineTool to query an index as a tool | A dict: tools[name](**args) |
| Agent Loop | Internal to Agent execution, hidden from user | AgentRunner.chat() manages step-by-step execution via AgentWorker tasks | A while loop over messages with tool_calls check |
| Task Delegation | Crew(agents, tasks, process=sequential/hierarchical) | — | A task queue processed in a while loop with a budget cap |
| Memory | ShortTermMemory, LongTermMemory, EntityMemory | ChatMemoryBuffer with token limit, or custom memory modules | A dict injected into the system prompt |
| State | Task output passed between agents via Crew orchestration | — | A dict tracking tool calls and results |
| RAG Integration | — | VectorStoreIndex + QueryEngineTool — the agent can query your data as a tool call | A tool function that embeds the query, searches a vector store, and returns top-k results |
| Orchestration | — | AgentRunner step API for custom control flow, or multi-agent pipelines | Sequential function calls with results passed between them |
What both do in plain Python
Every concept in the table above — agent, tools, loop, memory, state — maps to a handful of Python primitives: a function, a dict, a list, and a while loop. Both CrewAI and LlamaIndex wrap these primitives in their own class hierarchies and APIs. The underlying pattern is the same ~60 lines of code. The difference is how much ceremony each framework adds on top.
When to use CrewAI
CrewAI shines for multi-agent setups where you want named roles ("researcher", "writer"). But the core mechanics — tool dispatch, the agent loop, task scheduling — are the same patterns you can build in plain Python.
What CrewAI does
CrewAI models multi-agent systems as a crew of specialists. Each Agent has a role ("Senior Researcher"), a goal ("Find the best data sources"), a backstory that shapes its behavior, and a set of tools it can use. Tasks define discrete units of work with expected outputs. The Crew orchestrates execution — sequentially, hierarchically, or with a custom process. CrewAI also provides memory systems (short-term, long-term, entity) and delegation, where one agent can hand off subtasks to another. The mental model is a team of people collaborating on a project. For prototyping multi-agent workflows where you want to reason about roles and responsibilities, it provides a clean vocabulary.
The plain Python equivalent
An Agent in CrewAI is a function with a system prompt that includes the role, goal, and backstory. The tools dict maps names to callables. Task delegation is a list of tasks processed in order — each task calls the assigned agent function with the task description appended to the messages. Hierarchical execution is a manager agent that decides which sub-agent to call next (just another tool choice). Memory is a dict injected into the system prompt. The entire crew pattern — multiple agents, task queue, delegation — is a for-loop over tasks, where each iteration calls the right agent function. No Crew class, no process kwarg. Just functions calling functions with a shared state dict passed between them.
When to use LlamaIndex
LlamaIndex adds genuine value when your agent needs to query structured or unstructured data as part of its reasoning — that's the index-as-tool pattern, and it's well-executed. But if you're building a general-purpose agent that doesn't need RAG, the agent framework is overhead. The plain Python version of the agent loop is the same 60 lines either way.
What LlamaIndex agents do
LlamaIndex's agent system builds on its core strength: data indexing. You create a VectorStoreIndex over your documents, wrap it in a QueryEngineTool, and hand it to a ReActAgent. The agent can then query your data as a tool call — the same way it might call a calculator or web search. AgentRunner manages the execution loop: it sends messages to the LLM, parses tool calls, dispatches them (including index queries), and accumulates results. FunctionTool lets you wrap any Python function as a tool. The unique value over other frameworks is the tight integration between retrieval and agent reasoning — your data becomes a first-class tool, not an afterthought bolted onto a generic agent loop.
The plain Python equivalent
The agent loop is the same pattern as every other framework: a while loop that calls the LLM, checks for tool_calls, dispatches from a dict, and repeats. What LlamaIndex adds is the retrieval tool. In plain Python, that's a function: embed the query with an API call, search your vector store (Pinecone, pgvector, FAISS — all have simple clients), return the top-k chunks as a string. You put that function in your tools dict alongside everything else. The agent doesn't know or care that one tool queries an index — it's just another callable. The total code is about 60 lines for the agent loop plus 15-20 lines for the retrieval function. No AgentRunner, no AgentWorker, no QueryEngineTool.
Or build your own in 60 lines
Both CrewAI and LlamaIndex implement the same 8 patterns. An agent is a function. Tools are a dict. The loop is a while loop. The whole thing composes in ~60 lines of Python.
No framework. No dependencies. No opinions. Just the code.
Build it from scratch →