Published on June 9, 2026
How to Build Your First AI Agent in 2026: a Beginner's Guide
Build a working AI agent in pure Python — no frameworks, no magic. You write the LLM call, tool use, memory, and the agent loop yourself.
An AI agent is a loop: you send a message to a large language model (LLM), it either answers or asks to use a tool, you run the tool and send back the result, and you repeat until it's done. That's the whole idea — no framework required. Here's the three-line skeleton:
while response.stop_reason == "tool_use":        # the model wants a tool
    result = run_tool(block.name, block.input)   # you execute it
    messages.append(tool_result(result))         # feed it back, loop again

By the end of this guide you'll have a real agent in pure Python that can tell the time, do exact math, and hold a conversation, built from four primitives you'll understand line by line: the LLM call, tool use, memory, and the control loop. We'll use the Anthropic Python SDK and a cheap model so a full chat costs a fraction of a cent. Then I'll show you which production SDK to graduate to once the concepts click.
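To watch the loop run before any API key is involved, here is a minimal offline sketch. Block, Response, FakeModel, and the scripted "add" tool are hypothetical stand-ins invented for this illustration. They only mimic the shape of the real SDK objects (a stop_reason plus a list of content blocks) and are not part of the Anthropic SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical stand-in for an SDK content block."""
    type: str                      # "text" or "tool_use"
    text: str = ""
    name: str = ""
    input: dict = field(default_factory=dict)
    id: str = ""


@dataclass
class Response:
    """Hypothetical stand-in for an SDK response."""
    stop_reason: str               # "tool_use" or "end_turn"
    content: list


class FakeModel:
    """Scripted model: asks for a tool once, then answers using the result."""

    def create(self, messages: list) -> Response:
        last = messages[-1]["content"]
        if isinstance(last, list) and last and last[0].get("type") == "tool_result":
            # A tool result just came back, so produce the final answer.
            answer = f"The answer is {last[0]['content']}."
            return Response("end_turn", [Block("text", text=answer)])
        # Otherwise request the (made-up) "add" tool.
        call = Block("tool_use", name="add", input={"a": 2, "b": 3}, id="call_1")
        return Response("tool_use", [call])


def run_tool(name: str, tool_input: dict) -> str:
    if name == "add":
        return str(tool_input["a"] + tool_input["b"])
    return f"Unknown tool: {name}"


def agent_turn(model: FakeModel, messages: list, max_steps: int = 10) -> str:
    for _ in range(max_steps):     # hard cap on tool round-trips
        response = model.create(messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": run_tool(b.name, b.input)}
            for b in response.content if b.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    return "Stopped: too many tool calls."


history = [{"role": "user", "content": "What is 2 + 3?"}]
print(agent_turn(FakeModel(), history))
```

The control flow is the same one the real agent uses later in this guide. Only the model is scripted, which also makes the loop easy to unit-test.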
Why build it without a framework
In 2026 the loudest advice is "just use a framework." But developers keep getting burned by the magic: one viral thread this month came from a freelancer who was paid to rip the AI out of a tool because nobody on the team understood what it did. Frameworks hide the loop, and a loop you don't understand is a loop you can't debug at 3 AM.
So we'll build the loop ourselves. It's under 100 lines, comments included. Once you've written it, every agent framework (Claude Agent SDK, OpenAI Agents SDK, Google ADK) stops being magic and becomes "oh, it's doing the thing I already did, with nicer types." That's the fastest way to actually understand agents.
Step 1: the LLM call
Install the SDK and set your key:
pip install anthropic
export ANTHROPIC_API_KEY=sk-ant-...   # get one at console.anthropic.com

Now the smallest possible call, text in and text out:
from anthropic import Anthropic
client = Anthropic() # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-haiku-4-5",  # cheap + fast; swap for sonnet/opus later
    max_tokens=1024,
    messages=[{"role": "user", "content": "In one sentence: what is an AI agent?"}],
)
print(response.content[0].text)

That's a chatbot, not an agent. The difference is that an agent can act, and for that it needs tools.
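The "fraction of a cent" claim is easy to sanity-check with a little arithmetic. The sketch below assumes the approximate Haiku 4.5 prices quoted in the Sources section (about $1 per million input tokens and $5 per million output tokens), and the token counts are made-up example figures. In a real call you would read them from response.usage.input_tokens and response.usage.output_tokens.

```python
# Approximate prices taken from the Sources section (USD per 1M tokens).
# These are assumptions that will drift; check the live pricing page.
INPUT_USD_PER_MTOK = 1.00
OUTPUT_USD_PER_MTOK = 5.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one API call in US dollars."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


# Hypothetical usage figures for one short exchange.
cost = estimate_cost_usd(input_tokens=2_000, output_tokens=500)
print(f"${cost:.4f} (about {cost * 100:.2f} cents)")
```

Keep in mind that every request resends the whole conversation, so input tokens grow as a chat gets longer.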
Step 2: give it a tool
An LLM can't read a clock or multiply large numbers reliably — it predicts text, it doesn't compute. So we hand it tools: plain Python functions, plus a JSON-Schema description so the model knows when to call them.
TOOLS = [
    {
        "name": "get_current_time",
        "description": "Return the current date and time. Call when the user asks the time or date.",
        "input_schema": {"type": "object", "properties": {}},
    },
]

response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    tools=TOOLS,
    messages=[{"role": "user", "content": "What time is it right now?"}],
)
print(response.stop_reason)  # -> "tool_use"

The model doesn't run anything itself. It stops with stop_reason == "tool_use" and a tool_use block that says "please call get_current_time." Running it is your job. That's the security boundary, and it's a feature: you decide what actually executes.
from datetime import datetime

def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

You then send the result back as a tool_result block carrying the same tool_use_id, and the model turns it into a normal sentence. Do that in a loop and you have an agent.
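The round trip described above can be sketched with plain dicts, which is roughly how the messages look on the wire. The helper name tool_result_for and the id toolu_example_123 are made up for illustration. In a real run the id comes from the tool_use block the model returned.

```python
from datetime import datetime


def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def tool_result_for(tool_use_block: dict, output: str) -> dict:
    """Build the tool_result block that answers one tool_use block."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_block["id"],  # copied verbatim from the request
        "content": output,
    }


# Illustrative tool_use block; the id is a made-up placeholder.
tool_use = {"type": "tool_use", "id": "toolu_example_123",
            "name": "get_current_time", "input": {}}

messages = [
    {"role": "user", "content": "What time is it right now?"},
    # 1. The assistant turn containing the tool_use block goes in first.
    {"role": "assistant", "content": [tool_use]},
    # 2. The result goes back in a user turn, keyed by the same id.
    {"role": "user", "content": [tool_result_for(tool_use, get_current_time())]},
]
print(messages[-1]["content"][0]["tool_use_id"])
```

Sending this list back with the same tools parameter is what lets the model write its final sentence.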
Step 3: the agent loop (the whole thing)
Here's the complete agent. Two tools, the loop that runs them, and a chat REPL. The messages list is the memory — every turn appends to it, so the model remembers the conversation.
# agent.py — a tiny AI agent in pure Python. No frameworks.
# Setup: pip install anthropic && export ANTHROPIC_API_KEY=sk-ant-...
import ast
import operator
from datetime import datetime
from anthropic import Anthropic
client = Anthropic()
MODEL = "claude-haiku-4-5" # cheap; swap for "claude-sonnet-4-6" or "claude-opus-4-8"
# --- 1. The tools: plain Python functions ----------------------------------
def get_current_time() -> str:
    """The model has no clock, so this gives it one."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def _eval(node):
    # A tiny, safe arithmetic evaluator. Never eval() untrusted input.
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp):
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")

def calculate(expression: str) -> str:
    """Deterministic math: don't trust an LLM to multiply big numbers."""
    return str(_eval(ast.parse(expression, mode="eval").body))

# --- 2. Describe the tools to the model (JSON Schema) ----------------------
TOOLS = [
    {
        "name": "get_current_time",
        "description": "Return the current date and time. Call when the user asks the time or date.",
        "input_schema": {"type": "object", "properties": {}},
    },
    {
        "name": "calculate",
        "description": "Evaluate a basic arithmetic expression like '23 * 47 + 10'. Call for any math.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "e.g. '2 ** 10 / 4'"},
            },
            "required": ["expression"],
        },
    },
]

def run_tool(name: str, tool_input: dict) -> str:
    if name == "get_current_time":
        return get_current_time()
    if name == "calculate":
        return calculate(tool_input["expression"])
    return f"Unknown tool: {name}"

# --- 3. The agent loop -----------------------------------------------------
def agent_turn(messages: list) -> str:
    """Run one user turn to completion, executing tools until the model is done."""
    for _ in range(10):  # hard cap so a misbehaving model can't loop forever
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )
        # Keep the assistant turn (incl. any tool_use blocks) in memory.
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
        # The model asked for tools. Run them, feed the results back, loop.
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = run_tool(block.name, block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,  # must match the tool_use block
                    "content": output,
                })
        messages.append({"role": "user", "content": results})
    return "Stopped: too many tool calls."

# --- 4. A chat loop: the conversation IS the memory ------------------------
if __name__ == "__main__":
    messages = []  # the whole conversation lives here
    print("Agent ready. Ctrl-C to quit.\n")
    try:
        while True:
            messages.append({"role": "user", "content": input("you> ")})
            print(f"agent> {agent_turn(messages)}\n")
    except (KeyboardInterrupt, EOFError):
        print("\nBye.")  # exit cleanly instead of dumping a traceback

Run python agent.py and try "what's 4871 times 209, and what time is it?" The model should call both tools, get exact answers back (4871 × 209 = 1,018,039), and fold them into a short reply. You just built an agent.
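One gap in the listing above: if a tool raises (say the model asks calculate for 1 / 0), the exception propagates and ends the script. A hedged alternative is to catch the error and hand it back to the model as a tool_result flagged with is_error, so it can apologize or retry. In this sketch TOOL_FUNCS, safe_tool_result, and the divide tool are hypothetical names chosen for illustration, not part of the script above.

```python
from typing import Callable


def divide(a: float, b: float) -> str:
    """Stand-in tool that can fail (division by zero)."""
    return str(a / b)


# Hypothetical registry mapping tool names to plain Python functions.
TOOL_FUNCS: dict[str, Callable[..., str]] = {"divide": divide}


def safe_tool_result(tool_use_id: str, name: str, tool_input: dict) -> dict:
    """Run a tool and always return a tool_result block, even on failure."""
    result = {"type": "tool_result", "tool_use_id": tool_use_id}
    func = TOOL_FUNCS.get(name)
    if func is None:
        return {**result, "content": f"Unknown tool: {name}", "is_error": True}
    try:
        return {**result, "content": func(**tool_input)}
    except Exception as exc:  # report the failure to the model instead of crashing
        return {**result, "content": f"{type(exc).__name__}: {exc}", "is_error": True}


ok = safe_tool_result("id_1", "divide", {"a": 6, "b": 3})
bad = safe_tool_result("id_2", "divide", {"a": 1, "b": 0})
print(ok["content"], "|", bad["content"])
```

Swapping this in for the run_tool call inside the loop would be a small change: build each result with safe_tool_result(block.id, block.name, block.input) and append it as before.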
Making memory last between runs
Right now memory dies when you close the script. To make it persist, save messages to a file on exit and load it on start. The SDK's content blocks are Pydantic models rather than plain dicts, so json.dumps needs a default hook that calls model_dump() on them:

import json, pathlib

STORE = pathlib.Path("memory.json")
messages = json.loads(STORE.read_text()) if STORE.exists() else []
# ... after the chat loop, or on exit:
STORE.write_text(json.dumps(messages, default=lambda o: o.model_dump()))

That's the memory primitive made durable. Real agents swap this flat file for a database or a vector store, but the idea is identical: state you carry across calls.
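Here is a slightly fuller sketch of the same save-and-load idea that runs without the SDK. The helper names (to_jsonable, save_messages, load_messages) are hypothetical, and FakeBlock is a stand-in that only imitates the model_dump() method of a Pydantic content block. The write goes through a temporary file so a crash mid-write is less likely to corrupt the memory.

```python
import json
import tempfile
from pathlib import Path


def to_jsonable(obj):
    """Fallback for json.dumps: convert Pydantic-style objects to dicts."""
    if hasattr(obj, "model_dump"):
        return obj.model_dump()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def save_messages(path: Path, messages: list) -> None:
    # Write to a temp file first, then rename it over the real file.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(messages, default=to_jsonable, indent=2))
    tmp.replace(path)


def load_messages(path: Path) -> list:
    return json.loads(path.read_text()) if path.exists() else []


class FakeBlock:
    """Hypothetical stand-in for an SDK content block (not the real class)."""

    def __init__(self, text: str):
        self.text = text

    def model_dump(self) -> dict:
        return {"type": "text", "text": self.text}


with tempfile.TemporaryDirectory() as folder:
    store = Path(folder) / "memory.json"
    first_run = load_messages(store)              # no file yet, so empty
    first_run.append({"role": "user", "content": "hi"})
    first_run.append({"role": "assistant", "content": [FakeBlock("hello")]})
    save_messages(store, first_run)
    second_run = load_messages(store)             # a later run picks it up

print(second_run[-1]["content"][0]["text"])
```

In the chat script, one reasonable place to call save_messages is a finally block around the REPL, so the history is written even when you quit with Ctrl-C.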
Streaming, for a nicer feel
For chat UIs you usually want tokens to appear as they're generated. Same call, streamed:
with client.messages.stream(
    model=MODEL,
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain AI agents to a 5-year-old."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
print()

What you'd graduate to
You now understand every moving part. A production SDK just packages these primitives with types, retries, and a built-in tool runner so you don't hand-write the loop:
| You wrote | An SDK gives you |
| ---------------------- | -------------------------------------- |
| messages.create call | the same call, fully typed |
| the tool_use loop | an automatic tool runner |
| the messages list | session / state helpers |
| run_tool dispatch | tools defined from function signatures |
When you're ready, the natural next steps are the Claude Agent SDK, the OpenAI Agents SDK, or Google ADK. Reach for one when you need production plumbing, not before. For learning, the short script above beats any framework.
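The last row of the table, "tools defined from function signatures", is less mysterious than it sounds. The sketch below shows one way such a helper could work using only the standard library. tool_schema and the _JSON_TYPES mapping are hypothetical and far simpler than what a real SDK does, so treat this as an illustration of the idea rather than any SDK's actual implementation.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema types (an assumption;
# real SDKs handle many more cases, such as lists, enums, and optionals).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func) -> dict:
    """Build a tool description from a function's signature and docstring."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the model must supply it
    schema = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "input_schema": schema,
    }


def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression like '23 * 47 + 10'."""
    raise NotImplementedError  # body omitted; only the signature matters here


print(tool_schema(calculate))
```

The output has the same three keys as the hand-written TOOLS entries, which is the point: the SDK is generating what you typed out by hand.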
Common pitfalls (and the fixes)
- Pitfall: sending the tool_result without first appending the assistant's tool_use turn. Fix: always append response.content to messages before the results. The API needs to see the call the result answers, or it returns a 400.
- Pitfall: a tool_result whose tool_use_id doesn't match the tool_use block. Fix: copy block.id verbatim into the result.
- Pitfall: trusting the model to compute or to know the current time. Fix: that's exactly what tools are for; keep deterministic work in code.
- Pitfall: running model-provided input through eval(). Fix: validate and sandbox every tool; never execute raw expressions (note the safe ast evaluator above).
- Pitfall: no cap on the loop. Fix: the range(10) ceiling stops a runaway agent from billing you forever.
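On the "validate every tool" point: the ast evaluator in agent.py blocks names and function calls, but it still accepts any constant (including strings) and any exponent, so an input like 9 ** 9 ** 9 could tie up the process. The variant below is a hedged sketch of tighter validation. MAX_LENGTH and MAX_EXPONENT are arbitrary limits picked for illustration, and this is still not a complete sandbox.

```python
import ast
import operator

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}
MAX_LENGTH = 200      # assumed limit on expression length
MAX_EXPONENT = 1000   # assumed limit on the right-hand side of **


def _eval(node):
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
        raise ValueError("only numbers are allowed")
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left, right = _eval(node.left), _eval(node.right)
        if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
            raise ValueError("exponent too large")
        return _OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")


def calculate(expression: str) -> str:
    if len(expression) > MAX_LENGTH:
        raise ValueError("expression too long")
    return str(_eval(ast.parse(expression, mode="eval").body))


print(calculate("4871 * 209"))
```

Unsupported operators such as % now raise ValueError instead of KeyError, which makes the failure easier to report back to the model.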
What's next
Add a tool that does something useful for you — read a file, hit an API, query a database — and you've got a real assistant. The pattern never changes: describe the tool, run it, feed the result back. When you outgrow the flat-file memory, that's your cue to reach for a framework.
Sources
- Anthropic Python SDK: the messages.create call, streaming, and tool-use blocks used throughout. https://github.com/anthropics/anthropic-sdk-python
- Claude Tool Use (overview): tool_use / tool_result schema, stop_reason handling, and the agentic loop. https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview Retrieved June 9, 2026.
- Claude Models Overview: current model IDs (claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-8) and context windows. https://platform.claude.com/docs/en/about-claude/models/overview Retrieved June 9, 2026.
- Claude Pricing: Haiku 4.5 at ~$1/$5 per 1M input/output tokens (used for the cost estimate). Prices change, so check the live page. https://platform.claude.com/docs/en/pricing Retrieved June 9, 2026.