Comparisons / AutoGen vs OpenAI Agents SDK
AutoGen vs OpenAI Agents SDK: Which Agent Framework to Use?
AutoGen autogen by microsoft models agents as conversableagents that chat with each other. OpenAI Agents SDK openai's agents sdk (evolved from swarm) provides agent, runner, handoffs, and guardrails. Here is how they compare — and what the same patterns look like in plain Python.
By the numbers
AutoGen
56.7k
8.5k
Python
CC-BY-4.0
2023-08-18
Microsoft Research
OpenAI Agents SDK
20.6k
3.4k
Python
MIT
2025-03-11
OpenAI
GitHub stats as of April 2026. Stars indicate community interest, not necessarily quality or fit for your use case.
| Concept | AutoGen | OpenAI Agents SDK | Plain Python |
|---|---|---|---|
| Agent | ConversableAgent with system_message, llm_config | Agent(name, instructions, model, tools) | A function with a system prompt that POSTs to the LLM API |
| Tools | register_for_llm() and register_for_execution() | Python functions with type hints, auto-converted to schemas | A dict of callables + JSON schema descriptions |
| Conversation | Two-agent chat with initiate_chat(), message history | — | A messages array that grows with each turn |
| Multi-Agent | GroupChat with GroupChatManager, speaker selection | — | Multiple agent functions called in sequence on shared messages |
| Nested Chats | register_nested_chats() for sub-task handling | — | A task queue (BFS) — agent schedules follow-ups via a tool |
| Termination | is_termination_msg callback, max_consecutive_auto_reply | — | The while loop exits when no tool_calls or max_turns reached |
| Agent Loop | — | Runner.run() handles the loop internally | A while loop: call LLM, execute tool_calls, repeat |
| Handoffs | — | Handoff between Agent objects for multi-agent routing | Call a different agent function based on the LLM's tool choice |
| Guardrails | — | InputGuardrail and OutputGuardrail with tripwire pattern | Two lists of rule functions checked before and after the LLM |
| Context | — | Typed context object passed through the agent lifecycle | A state dict updated inside the loop |
What both do in plain Python
Every concept in the table above — agent, tools, loop, memory, state — maps to a handful of Python primitives: a function, a dict, a list, and a while loop. Both AutoGen and OpenAI Agents SDK wrap these primitives in their own class hierarchies and APIs. The underlying pattern is the same ~60 lines of code. The difference is how much ceremony each framework adds on top.
When to use AutoGen
AutoGen excels at complex multi-agent workflows where agents need to debate or collaborate. For single-agent use cases or simple tool-calling agents, the plain Python version is significantly simpler.
What AutoGen does
AutoGen's core abstraction is the ConversableAgent — an agent that can send and receive messages. Two agents chat by alternating turns on a shared message history. GroupChat extends this to N agents, with a GroupChatManager that selects the next speaker (round-robin, random, or LLM-based selection). Nested chats allow an agent to spin up a sub-conversation to handle a complex subtask before returning to the main thread. AutoGen also provides code execution sandboxes, letting agents write and run code as part of their conversation. The framework thinks in terms of conversations, not chains or graphs. This makes it natural for workflows where agents need to debate, critique, or iteratively refine outputs together.
The plain Python equivalent
A ConversableAgent is a function that takes a messages array, calls the LLM with a system prompt, and returns the assistant message. Two-agent chat is a while loop where you alternate between calling agent_a(messages) and agent_b(messages), appending each response. GroupChat is the same loop but with a speaker selection step — either rotate through a list or ask the LLM "who should speak next?" and call that agent function. Nested chats are a function call within the loop: pause the main conversation, run a sub-loop with different agents, and inject the result back. Tool registration is adding functions to a tools dict with their JSON schemas. The conversation-as-primitive model is just messages arrays passed between functions.
When to use OpenAI Agents SDK
The Agents SDK is the thinnest framework on this list — it barely abstracts beyond what you'd write yourself. Use it when you want OpenAI's conventions and auto-schema generation. Skip it when you want full control or use non-OpenAI models.
What the OpenAI Agents SDK does
The Agents SDK (formerly Swarm) is OpenAI's opinionated take on agent architecture. It provides four primitives: Agent (system prompt + tools + model), Runner (the agent loop), handoffs (routing between agents), and guardrails (input/output validation). The key feature is auto-schema generation — write a Python function with type hints and the SDK converts it to a JSON tool schema automatically. Runner.run() handles the loop: call the model, check for tool calls, execute them, repeat. Handoffs let one agent transfer control to another by returning a special tool call. It's deliberately thin. OpenAI designed it as a reference implementation showing how agents should work with their API, not as a batteries-included framework.
The plain Python equivalent
The Agents SDK is already close to plain Python, which says something. Agent is a function that takes messages and returns a completion — the system prompt is the first message, tools are a dict. Runner.run() is a while loop: call openai.chat.completions.create(), check if the response has tool_calls, execute the matching functions from your tools dict, append results to messages, repeat until the model responds without tool_calls. Handoffs are an if-statement: if the model calls a "transfer_to_research" tool, call the research agent function instead. Guardrails are two lists of validation functions — run the input rules before calling the LLM, run the output rules after. The auto-schema generation is the only piece that takes more than a few lines to replicate.
Or build your own in 60 lines
Both AutoGen and OpenAI Agents SDK implement the same 8 patterns. An agent is a function. Tools are a dict. The loop is a while loop. The whole thing composes in ~60 lines of Python.
No framework. No dependencies. No opinions. Just the code.
Build it from scratch →