What is Microsoft AutoGen?

AutoGen is Microsoft's multi-agent framework that models AI agents as ConversableAgents that chat with each other. It supports two-agent conversations, GroupChat with multiple agents, and nested chats for sub-tasks. The core mechanic is a messages array passed between agent functions.

How does AutoGen compare to LangChain?

AutoGen focuses on multi-agent conversations where agents debate and collaborate. LangChain focuses on single-agent tool use with broad integrations. AutoGen excels at complex multi-turn agent interactions; LangChain excels at RAG pipelines and provider-agnostic tooling.

Can I build multi-agent systems without AutoGen?

Yes. Multi-agent systems in plain Python are multiple agent functions called in sequence on shared messages. A GroupChat is a for-loop over agent functions. Nested chats are a task queue. AutoGen's value is in dynamic speaker selection and conversation management — patterns you rarely need for straightforward workflows.

What is AutoGPT and how does it work?

AutoGPT is an autonomous AI agent that takes a goal, breaks it into subtasks, and executes them in a loop using LLM calls, web browsing, file operations, and code execution. The core is a think-plan-act-observe cycle that repeats until the goal is met or the agent gets stuck.

Can I build an autonomous agent without AutoGPT?

Yes. The autonomous agent pattern is a while loop that calls an LLM, parses an action from the response, executes it from a tools dict, and appends the result to message history. AutoGPT wraps this in a plugin system and vector DB, but the core logic is about 60 lines of Python.

Why does AutoGPT use so many API tokens?

AutoGPT runs an unbounded autonomous loop — each iteration makes at least one LLM call, plus optional self-critique calls. For simple tasks, it can make 20-50 calls that a bounded agent would handle in 3-5. The token cost comes from the loop, not the agent logic itself.

Comparisons / AutoGen vs AutoGPT

AutoGen vs AutoGPT: Which Agent Framework to Use?

AutoGen autogen by microsoft models agents as conversableagents that chat with each other. AutoGPT autogpt was one of the first autonomous agent projects, spawning 165k+ github stars. Here is how they compare — and what the same patterns look like in plain Python.

By the numbers

AutoGen

GitHub Stars

56.7k

Forks

8.5k

Language

Python

License

CC-BY-4.0

Created

2023-08-18

Created by

Microsoft Research

github.com/microsoft/autogen →

AutoGPT

GitHub Stars

183.1k

Forks

46.2k

Language

Python

License

MIT

Created

2023-03-16

Created by

Toran Bruce Richards

github.com/Significant-Gravitas/AutoGPT →

GitHub stats as of April 2026. Stars indicate community interest, not necessarily quality or fit for your use case.

Concept	AutoGen	AutoGPT	Plain Python
Agent	`ConversableAgent` with `system_message`, `llm_config`	AutoGPT `Agent` class with goal decomposition and self-prompting loop	A function with a system prompt that POSTs to the LLM API
Tools	`register_for_llm()` and `register_for_execution()`	Plugin system with web browsing, file I/O, code execution, Google search	A dict of callables + JSON schema descriptions
Conversation	Two-agent chat with `initiate_chat()`, message history	—	A `messages` array that grows with each turn
Multi-Agent	`GroupChat` with `GroupChatManager`, speaker selection	—	Multiple agent functions called in sequence on shared `messages`
Nested Chats	`register_nested_chats()` for sub-task handling	—	A task queue (BFS) — agent schedules follow-ups via a tool
Termination	`is_termination_msg` callback, `max_consecutive_auto_reply`	—	The `while` loop exits when no `tool_calls` or `max_turns` reached
Agent Loop	—	Autonomous loop: think → plan → act → observe → repeat until goal met	A `while` loop: call LLM, parse action, execute tool, append result, repeat
Memory	—	Vector DB (Pinecone/local) for long-term memory, message history for short-term	A list for recent messages, a dict for facts injected into the system prompt
Planning	—	GPT-4 generates multi-step plans, stores in task queue, revises on failure	Ask the LLM to return a JSON list of steps, iterate through them
Self-Critique	—	Built-in self-evaluation prompt that critiques each action before executing	A second LLM call: `'Review this plan and list problems'` before acting

What both do in plain Python

Every concept in the table above — agent, tools, loop, memory, state — maps to a handful of Python primitives: a function, a dict, a list, and a while loop. Both AutoGen and AutoGPT wrap these primitives in their own class hierarchies and APIs. The underlying pattern is the same ~60 lines of code. The difference is how much ceremony each framework adds on top.

When to use AutoGen

AutoGen excels at complex multi-agent workflows where agents need to debate or collaborate. For single-agent use cases or simple tool-calling agents, the plain Python version is significantly simpler.

What AutoGen does

AutoGen's core abstraction is the `ConversableAgent` — an agent that can send and receive messages. Two agents chat by alternating turns on a shared message history. `GroupChat` extends this to N agents, with a `GroupChatManager` that selects the next speaker (round-robin, random, or LLM-based selection). **Nested chats** allow an agent to spin up a sub-conversation to handle a complex subtask before returning to the main thread. AutoGen also provides code execution sandboxes, letting agents write and run code as part of their conversation. The framework thinks in terms of **conversations, not chains or graphs**. This makes it natural for workflows where agents need to debate, critique, or iteratively refine outputs together.

The plain Python equivalent

A `ConversableAgent` is a function that takes a `messages` array, calls the LLM with a system prompt, and returns the assistant message. Two-agent chat is a `while` loop where you alternate between calling `agent_a(messages)` and `agent_b(messages)`, appending each response. `GroupChat` is the same loop but with a **speaker selection step** — either rotate through a list or ask the LLM "who should speak next?" and call that agent function. Nested chats are a function call within the loop: pause the main conversation, run a sub-loop with different agents, and inject the result back. Tool registration is adding functions to a `tools` dict with their JSON schemas. The conversation-as-primitive model is **just `messages` arrays passed between functions**.

Full AutoGen comparison →

When to use AutoGPT

AutoGPT pioneered the autonomous agent pattern, but most of its complexity comes from managing an unbounded loop — not from the core agent logic. For bounded tasks, a plain while loop with tool dispatch gives you the same capability with full control over when to stop.

What AutoGPT does

AutoGPT takes a high-level goal and **autonomously breaks it into subtasks**, executes them, and evaluates progress. The agent runs in a continuous loop: it thinks about what to do next, creates a plan, executes an action (web search, file write, code execution), observes the result, and decides whether to continue or revise. It stores results in a vector database for long-term memory and uses message history for short-term context. The plugin system lets you add capabilities like web browsing, Google search, and file management. With **165k+ GitHub stars**, it proved that LLMs could drive autonomous workflows — but it also revealed the fundamental challenge: **unbounded loops that burn tokens** without clear stopping criteria.

The plain Python equivalent

The core AutoGPT pattern is a `while` loop that calls an LLM with a goal-oriented system prompt, parses the response for an action to take, executes that action from a `tools` dict, appends the result to the message history, and repeats. Planning is just asking the LLM to return a JSON list of steps. Self-critique is a second LLM call that reviews the plan. Memory is a list of messages plus a dict of facts you inject into the prompt. The entire autonomous agent fits in about **60 lines** — the hard part was never the code, it was **designing prompts that keep the agent focused** and knowing when to stop. You get the same loop, minus the plugin system overhead.

Full AutoGPT comparison →

Or build your own in 60 lines

Both AutoGen and AutoGPT implement the same 8 patterns. An agent is a function. Tools are a dict. The loop is a while loop. The whole thing composes in ~60 lines of Python.

No framework. No dependencies. No opinions. Just the code.

Build it from scratch →