LangChain vs LlamaIndex: Which Agent Framework to Use?

LangChain is the most popular agent framework. LlamaIndex started as a RAG framework: connect your data, query it with an LLM. Here is how they compare, and what the same patterns look like in plain Python.

By the numbers

LangChain

GitHub Stars: 132.3k
Forks: 21.8k
Language: Python
License: MIT
Created: 2022-10-17
Created by: Harrison Chase
Backed by: Sequoia Capital, Benchmark
Funding: $25M Series A (2023), $25M Series B (2024)
Weekly downloads: 3.5M
Cloud/SaaS: LangSmith (observability), LangServe (deployment)
Production ready: Yes
Used by: Notion, Elastic, Instacart

github.com/langchain-ai/langchain

LlamaIndex

GitHub Stars: 48.3k
Forks: 7.2k
Language: Python
License: MIT
Created: 2022-11-02
Created by: Jerry Liu

github.com/run-llama/llama_index

GitHub stats as of April 2026. Stars indicate community interest, not necessarily quality or fit for your use case.

Agent
  LangChain: AgentExecutor with LLMChain, PromptTemplate, OutputParser
  LlamaIndex: AgentRunner with AgentWorker, or ReActAgent for tool-calling agents
  Plain Python: a function that POSTs to /chat/completions and returns the response

Tools
  LangChain: @tool decorator, StructuredTool, BaseTool class hierarchy
  LlamaIndex: FunctionTool for custom tools, QueryEngineTool to query an index as a tool
  Plain Python: a dict of callables: tools = {"add": lambda a, b: a + b}

Agent Loop
  LangChain: AgentExecutor.invoke() with internal iteration
  LlamaIndex: AgentRunner.chat() manages step-by-step execution via AgentWorker tasks
  Plain Python: a while loop: call the LLM, check for tool_calls, execute, repeat

Conversation
  LangChain: ConversationBufferMemory, ConversationSummaryMemory
  Plain Python: a messages list that persists outside the function

State
  LangChain: LangGraph state channels with typed reducers
  Plain Python: a dict updated inside the loop: state["turns"] += 1

Memory
  LangChain: VectorStoreRetrieverMemory, ConversationEntityMemory
  LlamaIndex: ChatMemoryBuffer with token limit, or custom memory modules
  Plain Python: a dict injected into the system prompt, saved via a remember() tool

Guardrails
  LangChain: OutputParser, PydanticOutputParser, custom validators
  Plain Python: two lists of lambda rules checked before and after the LLM call

RAG Integration
  LlamaIndex: VectorStoreIndex + QueryEngineTool; the agent can query your data as a tool call
  Plain Python: a tool function that embeds the query, searches a vector store, and returns top-k results

Orchestration
  LlamaIndex: AgentRunner step API for custom control flow, or multi-agent pipelines
  Plain Python: sequential function calls with results passed between them

What both do in plain Python

Every concept in the table above — agent, tools, loop, memory, state — maps to a handful of Python primitives: a function, a dict, a list, and a while loop. Both LangChain and LlamaIndex wrap these primitives in their own class hierarchies and APIs. The underlying pattern is the same ~60 lines of code. The difference is how much ceremony each framework adds on top.
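Here is that pattern end to end, as a minimal sketch against the OpenAI chat completions API. The model name and the single add tool are placeholder assumptions; everything else is the pattern itself: a tools dict, a messages list, and a while loop.

```python
import json
import os

import requests

API_URL = "https://api.openai.com/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}

# Tools: a dict of callables, plus JSON-schema descriptions for the API.
tools = {"add": lambda a, b: a + b}
tool_specs = [{
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers.",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    },
}]

def agent(user_input: str, messages: list) -> str:
    """An agent: a function wrapping a while loop over the LLM API."""
    messages.append({"role": "user", "content": user_input})
    while True:  # the agent loop
        resp = requests.post(API_URL, headers=HEADERS, json={
            "model": "gpt-4o-mini",  # placeholder model name
            "messages": messages,
            "tools": tool_specs,
        }).json()
        msg = resp["choices"][0]["message"]
        messages.append(msg)
        if not msg.get("tool_calls"):
            return msg["content"]  # no tool requested: final answer
        for call in msg["tool_calls"]:  # dispatch from the tools dict
            args = json.loads(call["function"]["arguments"])
            result = tools[call["function"]["name"]](**args)
            messages.append({"role": "tool",
                             "tool_call_id": call["id"],
                             "content": str(result)})

messages = []  # conversation: a list that persists outside the function
print(agent("What is 2 + 3?", messages))
```

The conversation is the messages list; state is whatever dict you mutate inside the loop.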

When to use LangChain

LangChain adds value when you need production integrations (vector stores, specific LLM providers, deployment tooling). But if you want to understand what's happening — or your use case is straightforward — the plain Python version is easier to debug, modify, and reason about.

What LangChain does

LangChain provides a unifying interface across LLM providers, a class hierarchy for tools and memory, and orchestration via AgentExecutor and LangGraph. The core value proposition is interchangeable components: swap OpenAI for Anthropic by changing one class, plug in a vector store for retrieval, add memory without rewriting your loop. It also ships with dozens of integrations — document loaders, text splitters, embedding models, vector stores — that save you from writing boilerplate HTTP calls. For teams that need to compose many integrations quickly, this catalog is genuinely useful. The tradeoff is that you inherit a large dependency tree and a set of abstractions that sit between you and the actual API calls.
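A sketch of the interchangeability claim, assuming the split-out provider packages (langchain-openai, langchain-anthropic) used in recent LangChain releases; the model names are illustrative:

```python
# pip install langchain-openai langchain-anthropic
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

llm = ChatOpenAI(model="gpt-4o-mini")  # illustrative model name
# Swapping providers is one class change; .invoke() stays the same:
# llm = ChatAnthropic(model="claude-3-5-sonnet-latest")

print(llm.invoke("Summarize LangChain in one sentence.").content)
```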

The plain Python equivalent

Every LangChain abstraction maps to a small piece of plain Python. AgentExecutor is a while loop that calls the LLM, checks for tool_calls in the response, executes the matching function from a tools dict, appends the result to a messages array, and repeats. Memory is a dict you inject into the system prompt. Output parsing is a function that validates the LLM's response before returning it. The entire agent — tool dispatch, conversation history, state tracking, guardrails — fits in about 60 lines of Python. No base classes, no decorators, no chain composition. Just a function, a dict, a list, and a loop. When something breaks, you read your 60 lines instead of navigating a class hierarchy.
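For example, memory and a guardrail need no classes at all. A sketch, with the field names and the length rule invented for illustration:

```python
# Memory: a plain dict rendered into the system prompt on every turn.
memory = {"user_name": "Ada", "units": "metric"}  # illustrative fields

def build_system_prompt(memory: dict) -> str:
    facts = "\n".join(f"- {key}: {value}" for key, value in memory.items())
    return f"You are a helpful assistant. Known facts:\n{facts}"

# Guardrail: a plain function that validates a reply before it is returned.
def validate_reply(reply: str) -> str:
    if not reply.strip():
        raise ValueError("empty reply")
    if len(reply) > 2000:  # arbitrary limit for illustration
        raise ValueError("reply too long")
    return reply
```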

Full LangChain comparison →

When to use LlamaIndex

LlamaIndex adds genuine value when your agent needs to query structured or unstructured data as part of its reasoning — that's the index-as-tool pattern, and it's well-executed. But if you're building a general-purpose agent that doesn't need RAG, the agent framework is overhead. The plain Python version of the agent loop is the same 60 lines either way.

What LlamaIndex agents do

LlamaIndex's agent system builds on its core strength: data indexing. You create a VectorStoreIndex over your documents, wrap it in a QueryEngineTool, and hand it to a ReActAgent. The agent can then query your data as a tool call — the same way it might call a calculator or web search. AgentRunner manages the execution loop: it sends messages to the LLM, parses tool calls, dispatches them (including index queries), and accumulates results. FunctionTool lets you wrap any Python function as a tool. The unique value over other frameworks is the tight integration between retrieval and agent reasoning — your data becomes a first-class tool, not an afterthought bolted onto a generic agent loop.
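Put together, the index-as-tool pattern looks roughly like this. A sketch assuming the llama_index.core module layout from the 0.10-era releases; LlamaIndex reorganizes modules between versions, so import paths may differ in yours:

```python
# pip install llama-index
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

docs = SimpleDirectoryReader("./data").load_data()  # your documents
index = VectorStoreIndex.from_documents(docs)       # embed and index them

# Wrap the index as a tool: retrieval becomes just another tool call.
doc_tool = QueryEngineTool.from_defaults(
    index.as_query_engine(),
    name="docs",
    description="Answers questions about the local documents.",
)

agent = ReActAgent.from_tools([doc_tool], verbose=True)
print(agent.chat("What do the documents say about pricing?"))
```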

The plain Python equivalent

The agent loop is the same pattern as every other framework: a while loop that calls the LLM, checks for tool_calls, dispatches from a dict, and repeats. What LlamaIndex adds is the retrieval tool. In plain Python, that's a function: embed the query with an API call, search your vector store (Pinecone, pgvector, FAISS — all have simple clients), return the top-k chunks as a string. You put that function in your tools dict alongside everything else. The agent doesn't know or care that one tool queries an index — it's just another callable. The total code is about 60 lines for the agent loop plus 15-20 lines for the retrieval function. No AgentRunner, no AgentWorker, no QueryEngineTool.
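A sketch of that retrieval function against a toy in-memory store, assuming the OpenAI embeddings endpoint and an illustrative model name; in practice the search step would be a Pinecone, pgvector, or FAISS client call:

```python
import os

import requests

EMBED_URL = "https://api.openai.com/v1/embeddings"
HEADERS = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}

store = []  # toy in-memory store of (text, vector) pairs

def embed(text: str) -> list:
    resp = requests.post(EMBED_URL, headers=HEADERS, json={
        "model": "text-embedding-3-small",  # illustrative model name
        "input": text,
    }).json()
    return resp["data"][0]["embedding"]

def add_document(text: str) -> None:
    """Index time: embed each chunk once, keep it alongside the text."""
    store.append((text, embed(text)))

def search_docs(query: str, k: int = 3) -> str:
    """Retrieval as a plain tool: embed, rank by cosine similarity, return top-k."""
    q = embed(query)
    q_norm = sum(a * a for a in q) ** 0.5
    def score(vec):
        dot = sum(a * b for a, b in zip(q, vec))
        return dot / (q_norm * sum(b * b for b in vec) ** 0.5)
    ranked = sorted(store, key=lambda item: score(item[1]), reverse=True)
    return "\n\n".join(text for text, _ in ranked[:k])

tools = {"search_docs": search_docs}  # just another callable in the dict
```

The agent doesn't know one of its tools queries an index; dispatch is identical for every entry in the dict.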

Full LlamaIndex comparison →

Or build your own in 60 lines

Both LangChain and LlamaIndex implement the same patterns from the table above. An agent is a function. Tools are a dict. The loop is a while loop. The whole thing composes in ~60 lines of Python.

No framework. No dependencies. No opinions. Just the code.

Build it from scratch →