A Tour of Agents / Lesson 3 of 9

The Agent Loop

Claude searching files, then reading, then searching again. That's this loop.

Tags: agent loop · multi-turn · tool protocol · convergence

Framework parallel: LangChain AgentExecutor, OpenAI Agents SDK, AutoGen — a while loop over messages.

This is the most important lesson. Everything after this builds on this loop.

You've seen this happen in Claude: you ask it to analyze a codebase, and it searches files, reads them, searches again, reads more — multiple steps before giving you an answer. Or ChatGPT with Code Interpreter: it writes code, runs it, sees an error, fixes it, runs again. That's this loop.

Lesson 2's agent made one tool call and stopped. Real agents loop: call a tool → see the result → decide what's next → repeat until done. The LLM decides when to stop: a response with no tool_calls means it's done.
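That stop rule is just a check on the assistant message. A minimal sketch, with hypothetical message values in the OpenAI-style shape:

```python
# Two shapes an assistant message can take (hypothetical values):
final = {"role": "assistant", "content": "15"}
calling = {"role": "assistant", "content": None,
           "tool_calls": [{"id": "call_1", "type": "function",
                           "function": {"name": "add", "arguments": '{"a": 10, "b": 5}'}}]}

def is_done(msg):
    # The stop rule: an assistant message with no tool_calls is the final answer.
    return not msg.get("tool_calls")

print(is_done(calling), is_done(final))  # False True
```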

This is the entire runtime of LangChain's AgentExecutor.

Step 1: Tools + ask_llm

Same tools as Lesson 2. But now ask_llm takes the full messages array and returns the raw message object — we need tool_calls and tool_call_id for the multi-turn protocol.

tools = {"add": lambda a, b: a + b, "upper": lambda text: text.upper()}
TOOL_DEFS = [
    {"type": "function", "function": {"name": "add", "description": "Add two numbers",
        "parameters": {"type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"]}}},
    {"type": "function", "function": {"name": "upper", "description": "Uppercase text",
        "parameters": {"type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"]}}},
]

async def ask_llm(messages):
    resp = await pyfetch(f"{LLM_BASE_URL}/chat/completions",
        method="POST",
        headers={"Authorization": f"Bearer {LLM_API_KEY}",
                 "Content-Type": "application/json"},
        body=json.dumps({"model": LLM_MODEL, "messages": messages, "tools": TOOL_DEFS}))
    return json.loads(await resp.string())["choices"][0]["message"]

Step 2: The loop

When Claude runs a multi-step task — say, searching your codebase, then reading files, then writing code — each step is one iteration of this loop:

  • Call LLM with the full messages array (everything so far)
  • No tool_calls? Return the answer — the LLM is done thinking
  • Has tool_calls? Execute each one, append results with tool_call_id, loop back
  • The tool_call_id links each result to its request — without it, when the LLM asks for two tools at once, there's no way to tell which result belongs to which call. This is the tool calling protocol — the wire format that makes multi-step work.
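To see the linkage concretely, here's a turn where the LLM requests both tools at once (the ids and arguments are hypothetical). Each tool message echoes the id of the call it answers, so the model can match result to request:

```python
import json

# Hypothetical assistant turn requesting two tools in one response.
assistant_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {"id": "call_a", "type": "function",
         "function": {"name": "add", "arguments": '{"a": 3, "b": 4}'}},
        {"id": "call_b", "type": "function",
         "function": {"name": "upper", "arguments": '{"text": "hello"}'}},
    ],
}

tools = {"add": lambda a, b: a + b, "upper": lambda text: text.upper()}

# One tool message per call, each carrying the id it answers.
results = []
for tc in assistant_msg["tool_calls"]:
    args = json.loads(tc["function"]["arguments"])
    results.append({"role": "tool",
                    "tool_call_id": tc["id"],
                    "content": str(tools[tc["function"]["name"]](**args))})

print(results)
# [{'role': 'tool', 'tool_call_id': 'call_a', 'content': '7'},
#  {'role': 'tool', 'tool_call_id': 'call_b', 'content': 'HELLO'}]
```

Without the ids, two results of "7" and "HELLO" would be ambiguous the moment two calls share a turn.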

    async def agent(task, max_turns=5):
        messages = [
            {"role": "system", "content": "Use tools to answer. Be concise."},
            {"role": "user", "content": task},
        ]
        for turn in range(max_turns):
            trace("llm_call", f"Turn {turn + 1}")
            msg = await ask_llm(messages)
    
            if not msg.get("tool_calls"):
                trace("agent_end", msg.get("content", ""))
                return msg.get("content", "")
    
            messages.append(msg)
            for tc in msg["tool_calls"]:
                name = tc["function"]["name"]
                args = json.loads(tc["function"]["arguments"])
                result = tools[name](**args)
                trace("tool_result", f"{name}({args}) → {result}")
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc["id"],
                    "content": str(result),
                })
        return "Max turns reached"

    Try it

  • *"add 10 and 5"* — one tool call, one turn
  • *"add 3 and 4, then uppercase hello"* — two tool calls, the LLM chains them
  • Watch the diagram: each turn cycles through the loop. The messages array grows with each tool call.

    print(f">> {await agent(USER_INPUT)}")
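To watch the messages array grow without a live model, here's a dry run of the same loop with a scripted stand-in for ask_llm (the scripted replies and the run_agent name are illustrative, not part of the lesson's runtime):

```python
import json

tools = {"add": lambda a, b: a + b, "upper": lambda text: text.upper()}

# Scripted stand-in for ask_llm: turn 1 calls add, turn 2 calls upper,
# turn 3 returns a plain answer (no tool_calls), which ends the loop.
script = iter([
    {"role": "assistant", "tool_calls": [{"id": "c1", "function":
        {"name": "add", "arguments": '{"a": 3, "b": 4}'}}]},
    {"role": "assistant", "tool_calls": [{"id": "c2", "function":
        {"name": "upper", "arguments": '{"text": "hello"}'}}]},
    {"role": "assistant", "content": "7 and HELLO"},
])

def run_agent(task, max_turns=5):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        msg = next(script)  # in the real loop: await ask_llm(messages)
        if not msg.get("tool_calls"):
            return msg["content"], messages
        messages.append(msg)
        for tc in msg["tool_calls"]:
            args = json.loads(tc["function"]["arguments"])
            messages.append({"role": "tool", "tool_call_id": tc["id"],
                             "content": str(tools[tc["function"]["name"]](**args))})
    return "Max turns reached", messages

answer, messages = run_agent("add 3 and 4, then uppercase hello")
print(answer)         # 7 and HELLO
print(len(messages))  # 5: user + 2 assistant tool-call turns + 2 tool results
```

Each iteration adds an assistant message and its tool results, so the context the LLM sees keeps expanding until it answers in plain text.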