Published on June 9, 2026

How to Build Your First AI Agent in 2026: A Beginner's Guide

Build a working AI agent in pure Python — no frameworks, no magic. You write the LLM call, tool use, memory, and the agent loop yourself.

Tags: ai-agents, python, tutorial, beginner, llm, claude

An AI agent is a loop: you send a message to a large language model (LLM), it either answers or asks to use a tool, you run the tool and send back the result, and you repeat until it's done. That's the whole idea — no framework required. Here's the three-line skeleton:

while response.stop_reason == "tool_use":   # the model wants a tool
    result = run_tool(block.name, block.input)   # you execute it
    messages.append(tool_result(result))         # feed it back, loop again

By the end of this guide you'll have a real agent in pure Python that can tell the time, do exact math, and hold a conversation — built from four primitives you'll understand line by line: the LLM call, tool use, memory, and the control loop. We'll use the Anthropic Python SDK and a cheap model so a full chat costs a fraction of a cent. Then I'll show you which production SDK to graduate to once the concepts click.
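As a rough check on the "fraction of a cent" claim, the arithmetic can be sketched as below. The prices are the approximate Haiku 4.5 rates listed in the Sources ($1 input / $5 output per 1M tokens). The token counts are assumed for illustration, not measured, and the helper name chat_cost_usd is hypothetical.

```python
# Back-of-the-envelope chat cost. Prices are the approximate Haiku 4.5 rates
# cited in the Sources; the token counts are assumptions, not measurements.
INPUT_USD_PER_MTOK = 1.00
OUTPUT_USD_PER_MTOK = 5.00

def chat_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in US dollars for one request at the rates above."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000

# Hypothetical short chat: 2,000 tokens in, 500 tokens out.
cost = chat_cost_usd(2_000, 500)
print(f"${cost:.4f}")   # $0.0045, i.e. under half a cent
```

Because every pass through the loop resends the whole messages list, input tokens grow with the length of the conversation, so long sessions cost more per turn than this single-request figure suggests.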

Why build it without a framework

In 2026 the loudest advice is "just use a framework." But developers keep getting burned by the magic: one viral thread this month described a freelancer who was paid to rip the AI out of a tool because nobody on the team understood what it did. Frameworks hide the loop, and a loop you don't understand is a loop you can't debug at 3 AM.

So we'll build the loop ourselves. It's about 60 lines. Once you've written it, every agent framework — Claude Agent SDK, OpenAI Agents SDK, Google ADK — stops being magic and becomes "oh, it's doing the thing I already did, with nicer types." That's the fastest way to actually understand agents.

Step 1: the LLM call

Install the SDK and set your key:

pip install anthropic
export ANTHROPIC_API_KEY=sk-ant-...      # get one at console.anthropic.com

Now the smallest possible call — text in, text out:

from anthropic import Anthropic
 
client = Anthropic()   # reads ANTHROPIC_API_KEY from the environment
 
response = client.messages.create(
    model="claude-haiku-4-5",   # cheap + fast; swap for sonnet/opus later
    max_tokens=1024,
    messages=[{"role": "user", "content": "In one sentence: what is an AI agent?"}],
)
 
print(response.content[0].text)

That's a chatbot, not an agent. The difference is that an agent can act — and for that it needs tools.

Step 2: give it a tool

An LLM can't read a clock or multiply large numbers reliably — it predicts text, it doesn't compute. So we hand it tools: plain Python functions, plus a JSON-Schema description so the model knows when to call them.

TOOLS = [
    {
        "name": "get_current_time",
        "description": "Return the current date and time. Call when the user asks the time or date.",
        "input_schema": {"type": "object", "properties": {}},
    },
]
 
response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    tools=TOOLS,
    messages=[{"role": "user", "content": "What time is it right now?"}],
)
 
print(response.stop_reason)   # -> "tool_use"

The model doesn't run anything itself. It stops with stop_reason == "tool_use" and a tool_use block that says "please call get_current_time." Running it is your job — that's the security boundary, and it's a feature: you decide what actually executes.

from datetime import datetime
 
def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

You then send the result back as a tool_result block carrying the same tool_use_id, and the model turns it into a normal sentence. Do that in a loop and you have an agent.
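The round trip can be sketched with plain dicts, so it runs without an API key. The tool_use block here is a hand-written stand-in shaped like the one described above (its id value is made up), and the helper name make_tool_result is hypothetical.

```python
from datetime import datetime

def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def make_tool_result(tool_use_block: dict, output: str) -> dict:
    """Wrap a tool's output in a user message the model can read."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_block["id"],   # copied verbatim from the request
            "content": output,
        }],
    }

# Stand-in for the block the model sends back (made-up id, for illustration).
tool_use = {"type": "tool_use", "id": "toolu_example_123",
            "name": "get_current_time", "input": {}}

messages = [
    {"role": "user", "content": "What time is it right now?"},
    {"role": "assistant", "content": [tool_use]},      # the model's request
    make_tool_result(tool_use, get_current_time()),    # your answer to it
]
print(messages[-1]["content"][0]["tool_use_id"])   # toolu_example_123
```

Sending this three-message list back with the same tools is what lets the model turn the raw timestamp into a sentence.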

Step 3: the agent loop (the whole thing)

Here's the complete agent. Two tools, the loop that runs them, and a chat REPL. The messages list is the memory — every turn appends to it, so the model remembers the conversation.

# agent.py — a tiny AI agent in pure Python. No frameworks.
# Setup:  pip install anthropic  &&  export ANTHROPIC_API_KEY=sk-ant-...
import ast
import operator
from datetime import datetime
 
from anthropic import Anthropic
 
client = Anthropic()
MODEL = "claude-haiku-4-5"   # cheap; swap for "claude-sonnet-4-6" or "claude-opus-4-8"
 
# --- 1. The tools: plain Python functions ----------------------------------
 
def get_current_time() -> str:
    """The model has no clock — this gives it one."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")
 
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}
 
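def _eval(node):
    # A tiny arithmetic evaluator: only numbers and the operators in _OPS.
    # Never eval() untrusted input. Note that ** on huge operands can still
    # be slow, so cap input size if strangers can reach this tool.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")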
 
def calculate(expression: str) -> str:
    """Deterministic math — don't trust an LLM to multiply big numbers."""
    return str(_eval(ast.parse(expression, mode="eval").body))
 
# --- 2. Describe the tools to the model (JSON Schema) ----------------------
 
TOOLS = [
    {
        "name": "get_current_time",
        "description": "Return the current date and time. Call when the user asks the time or date.",
        "input_schema": {"type": "object", "properties": {}},
    },
    {
        "name": "calculate",
        "description": "Evaluate a basic arithmetic expression like '23 * 47 + 10'. Call for any math.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "e.g. '2 ** 10 / 4'"},
            },
            "required": ["expression"],
        },
    },
]
 
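def run_tool(name: str, tool_input: dict) -> str:
    # Return errors as text so the model can recover instead of the agent crashing.
    try:
        if name == "get_current_time":
            return get_current_time()
        if name == "calculate":
            return calculate(tool_input["expression"])
        return f"Unknown tool: {name}"
    except Exception as exc:   # e.g. bad syntax, division by zero, missing key
        return f"Tool error: {exc}"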
 
# --- 3. The agent loop -----------------------------------------------------
 
def agent_turn(messages: list) -> str:
    """Run one user turn to completion, executing tools until the model is done."""
    for _ in range(10):   # hard cap so a misbehaving model can't loop forever
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )
        # Keep the assistant turn (incl. any tool_use blocks) in memory.
        messages.append({"role": "assistant", "content": response.content})
 
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
 
        # The model asked for tools. Run them, feed the results back, loop.
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = run_tool(block.name, block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,   # must match the tool_use block
                    "content": output,
                })
        messages.append({"role": "user", "content": results})
    return "Stopped: too many tool calls."
 
# --- 4. A chat loop — the conversation IS the memory -----------------------
 
if __name__ == "__main__":
    messages = []   # the whole conversation lives here
    print("Agent ready. Ctrl-C to quit.\n")
    try:
        while True:
            messages.append({"role": "user", "content": input("you> ")})
            print(f"agent> {agent_turn(messages)}\n")
    except (KeyboardInterrupt, EOFError):
        print("\nBye.")   # exit cleanly instead of printing a traceback

Run python agent.py and try "what's 4871 times 209, and what time is it?" — the model calls both tools, gets exact answers back, and replies in one sentence. You just built an agent.
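To check the loop logic without spending tokens, one option is to replay canned responses through a stand-in client. The sketch below is not part of the Anthropic SDK: ScriptedMessages, the toy add tool, and the trimmed agent_turn (which takes the client as a parameter) are all hypothetical, and the fake responses only mimic the attributes the loop reads.

```python
from types import SimpleNamespace as NS

def run_tool(name: str, tool_input: dict) -> str:
    # One toy tool is enough to exercise the loop.
    if name == "add":
        return str(tool_input["a"] + tool_input["b"])
    return f"Unknown tool: {name}"

class ScriptedMessages:
    """Hypothetical stand-in for client.messages: replays canned responses."""
    def __init__(self, responses):
        self._responses = iter(responses)
        self.calls = 0

    def create(self, **kwargs):
        self.calls += 1
        return next(self._responses)

def agent_turn(client, messages: list, max_steps: int = 10) -> str:
    for _ in range(max_steps):
        response = client.messages.create(messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": run_tool(b.name, b.input)}
            for b in response.content if b.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    return "Stopped: too many tool calls."

# Script: first the model asks for a tool, then it answers in text.
script = [
    NS(stop_reason="tool_use",
       content=[NS(type="tool_use", id="t1", name="add", input={"a": 2, "b": 3})]),
    NS(stop_reason="end_turn", content=[NS(type="text", text="2 + 3 = 5")]),
]
fake = NS(messages=ScriptedMessages(script))
history = [{"role": "user", "content": "What is 2 + 3?"}]
reply = agent_turn(fake, history)
print(reply)                          # 2 + 3 = 5
print([m["role"] for m in history])   # ['user', 'assistant', 'user', 'assistant']
```

The role sequence in the last line is the shape every tool-using turn leaves behind: the request, the model's tool call, your tool result, and the final answer.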

Making memory last between runs

Right now memory dies when you close the script. To make it persist, save messages to a file on exit and load it on start. The SDK's content blocks are Pydantic models, so model_dump() converts them into JSON-ready dicts:

import json, pathlib
 
STORE = pathlib.Path("memory.json")
messages = json.loads(STORE.read_text()) if STORE.exists() else []
# ... after the chat loop, or on exit:
STORE.write_text(json.dumps(messages, default=lambda o: o.model_dump()))

That's memory, the last of the four primitives. Real agents swap this flat file for a database or a vector store, but the idea is identical: state you carry across calls.
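A slightly fuller save/load round trip might look like the sketch below. It assumes content blocks expose model_dump(), as Pydantic models do. FakeTextBlock is a stand-in so the example runs without the SDK, and the helper names and the temporary directory are illustrative choices.

```python
import json
import pathlib
import tempfile

def to_plain(obj):
    """json.dumps fallback for objects that expose model_dump()."""
    if hasattr(obj, "model_dump"):
        return obj.model_dump()
    raise TypeError(f"cannot serialize {type(obj).__name__}")

def save_messages(path: pathlib.Path, messages: list) -> None:
    path.write_text(json.dumps(messages, default=to_plain, indent=2))

def load_messages(path: pathlib.Path) -> list:
    return json.loads(path.read_text()) if path.exists() else []

class FakeTextBlock:
    """Stand-in for an SDK text block, so the sketch runs without the SDK."""
    def __init__(self, text):
        self.text = text

    def model_dump(self):
        return {"type": "text", "text": self.text}

store = pathlib.Path(tempfile.mkdtemp()) / "memory.json"
history = load_messages(store)                       # [] on the first run
history.append({"role": "user", "content": "hi"})
history.append({"role": "assistant", "content": [FakeTextBlock("Hello!")]})
save_messages(store, history)

restored = load_messages(store)
print(restored[1]["content"][0]["text"])   # Hello!
```

After a reload, stored assistant turns are plain dicts rather than SDK objects, so any code that inspects old history should use key access (block["type"]) instead of attribute access (block.type).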

Streaming, for a nicer feel

For chat UIs you usually want tokens to appear as they're generated. Same call, streamed:

with client.messages.stream(
    model=MODEL,
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain AI agents to a 5-year-old."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    print()

What you'd graduate to

You now understand every moving part. A production SDK just packages these primitives with types, retries, and a built-in tool runner so you don't hand-write the loop:

You wrote	An SDK gives you
`messages.create` call	the same call, fully typed
the `tool_use` loop	an automatic tool runner
the `messages` list	session / state helpers
`run_tool` dispatch	tools defined from function signatures

When you're ready, the natural next steps are the Claude Agent SDK, the OpenAI Agents SDK, or Google ADK. Reach for one when you need production plumbing — not before. For learning, the 60 lines above beat any framework.
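To make the last table row concrete, here is a minimal sketch of deriving a tool description from a function signature. It does not reproduce how any particular SDK does it: tool_from_function and the type mapping are hypothetical and deliberately simplified, and unannotated parameters fall back to "string".

```python
import inspect
from typing import get_type_hints

# Simplified mapping from Python annotations to JSON Schema types.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_from_function(fn) -> dict:
    """Build a tool description from a function's name, docstring, and parameters."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)   # no default means the model must supply it
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "input_schema": {"type": "object", "properties": properties,
                         "required": required},
    }

def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression like '23 * 47 + 10'."""
    return expression   # placeholder body; only the signature matters here

tool = tool_from_function(calculate)
print(tool["name"])                          # calculate
print(tool["input_schema"]["properties"])    # {'expression': {'type': 'string'}}
print(tool["input_schema"]["required"])      # ['expression']
```

The result has the same shape as the hand-written TOOLS entries, which is the point: the schema you typed by hand is something a library can generate for you.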

Common pitfalls (and the fixes)

Pitfall: sending the tool_result without first appending the assistant's tool_use turn. Fix: always append response.content to messages before the results — the API needs to see the call the result answers, or it returns a 400.
Pitfall: a tool_result whose tool_use_id doesn't match the tool_use block. Fix: copy block.id verbatim into the result.
Pitfall: trusting the model to compute or to know the current time. Fix: that's exactly what tools are for — keep deterministic work in code.
Pitfall: running model-provided input through eval(). Fix: validate and sandbox every tool; never execute raw expressions (note the safe ast evaluator above).
Pitfall: no cap on the loop. Fix: the range(10) ceiling stops a runaway agent from billing you forever.
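The first two pitfalls can be caught locally before a request is sent. The checker below is a hypothetical helper, not an SDK feature. It assumes messages hold plain dicts (for example after model_dump()) and mirrors the pairing rule only as described above, so the API remains the final judge.

```python
def check_tool_pairs(messages: list) -> list:
    """Return a list of tool_use / tool_result pairing problems (empty if none)."""
    problems = []
    for i, msg in enumerate(messages):
        content = msg["content"]
        if msg["role"] != "user" or not isinstance(content, list):
            continue
        result_ids = [b["tool_use_id"] for b in content
                      if b.get("type") == "tool_result"]
        if not result_ids:
            continue
        # The tool calls being answered must sit in the assistant turn just before.
        prev = messages[i - 1] if i > 0 else None
        requested = set()
        if prev and prev["role"] == "assistant" and isinstance(prev["content"], list):
            requested = {b["id"] for b in prev["content"] if b.get("type") == "tool_use"}
        for rid in result_ids:
            if rid not in requested:
                problems.append(f"message {i}: no matching tool_use for {rid}")
    return problems

call = {"type": "tool_use", "id": "t1", "name": "get_current_time", "input": {}}
answer = {"type": "tool_result", "tool_use_id": "t1", "content": "2026-06-09 12:00:00"}

good = [{"role": "user", "content": "time?"},
        {"role": "assistant", "content": [call]},
        {"role": "user", "content": [answer]}]
missing_turn = [good[0], good[2]]                       # assistant turn was dropped
wrong_id = [good[0], good[1],
            {"role": "user", "content": [{**answer, "tool_use_id": "t2"}]}]

print(check_tool_pairs(good))           # []
print(check_tool_pairs(missing_turn))   # ['message 1: no matching tool_use for t1']
print(check_tool_pairs(wrong_id))       # ['message 2: no matching tool_use for t2']
```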

What's next

Add a tool that does something useful for you — read a file, hit an API, query a database — and you've got a real assistant. The pattern never changes: describe the tool, run it, feed the result back. When you outgrow the flat-file memory, that's your cue to reach for a framework.
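As one example of that pattern, a file-reading tool could be sketched as follows. The tool name, the WORKSPACE directory, and the 4,000-character cap are all assumptions made for illustration; the path check requires Python 3.9 or newer for Path.is_relative_to.

```python
import pathlib
import tempfile

# Assumed sandbox directory; a real agent would point this at a project folder.
WORKSPACE = pathlib.Path(tempfile.mkdtemp()).resolve()

READ_FILE_TOOL = {
    "name": "read_file",
    "description": "Read a UTF-8 text file from the workspace. "
                   "Call when the user asks about a file's contents.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path relative to the workspace"},
        },
        "required": ["path"],
    },
}

def read_file(path: str, max_chars: int = 4000) -> str:
    """Return the file's text, refusing paths that escape the workspace."""
    target = (WORKSPACE / path).resolve()
    if not target.is_relative_to(WORKSPACE):      # blocks '../' and absolute paths
        return "Error: path is outside the workspace."
    if not target.is_file():
        return "Error: no such file."
    # Cap the text so one large file cannot flood the conversation.
    return target.read_text(encoding="utf-8")[:max_chars]

(WORKSPACE / "notes.txt").write_text("buy milk", encoding="utf-8")
print(read_file("notes.txt"))         # buy milk
print(read_file("../etc/passwd"))     # Error: path is outside the workspace.
```

Wiring it in takes two changes to agent.py: append READ_FILE_TOOL to TOOLS, and add a branch in run_tool that calls read_file(tool_input["path"]).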

Sources

Anthropic Python SDK — the messages.create call, streaming, and tool-use blocks used throughout. https://github.com/anthropics/anthropic-sdk-python
Claude Tool Use (overview) — tool_use / tool_result schema, stop_reason handling, and the agentic loop. https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview Retrieved June 9, 2026.
Claude Models Overview — current model IDs (claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-8) and context windows. https://platform.claude.com/docs/en/about-claude/models/overview Retrieved June 9, 2026.
Claude Pricing — Haiku 4.5 at ~$1/$5 per 1M input/output tokens (used for the cost estimate). Prices change — check the live page. https://platform.claude.com/docs/en/pricing Retrieved June 9, 2026.
