Definition
An AI agent is a system in which an LLM plans, calls tools, observes results, and iterates toward a defined goal rather than producing a single response. It typically combines a planner LLM, a tool-use loop, and external capabilities (APIs, code execution, databases, web).
Agents differ from chatbots in autonomy. A chatbot answers; an agent acts. Given the goal "book a meeting with Sarah next week", an agent can check Sarah's availability, propose times, send the invite, and confirm — calling several tools in sequence, reasoning about each step's results, and recovering when something fails.
Agent design is harder than it looks. The planner LLM has to reason about which tool to call, with what arguments, given the current state. Errors compound: a wrong tool call early can derail an entire run. The agents that hold up best in production constrain the action space (a small menu of well-described tools), validate every output, and degrade gracefully to human handoff when stuck. Most "agent" problems in practice are actually deterministic workflow problems where a regular state machine outperforms an agent.
Origin
The AI-agent concept (autonomous goal-directed software) predates LLMs by decades, going back to symbolic AI and multi-agent systems. Modern LLM-powered agents emerged with ReAct (2022), AutoGPT (2023), and the rapid maturation of tool-calling APIs across major LLMs through 2023–2025.
How it works
- Define the goal and the available tools (each with a name, description, and schema).
- On each iteration, the agent reasons about the next action given the goal and conversation so far.
- It calls a tool (or returns a final answer).
- The system executes the tool and returns the result to the agent.
- The agent reasons again, taking the result into account.
- Loop until the goal is achieved or a stop condition (max iterations, error budget) is hit.
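The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `scripted_planner` is a hard-coded stand-in for the planner LLM, and the `Tool` class, the two tools, and their return values are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    description: str
    schema: dict              # argument name -> expected Python type
    run: Callable[..., Any]


def scripted_planner(goal: str, history: list) -> dict:
    """Hypothetical stand-in for the planner LLM: picks the next action."""
    if not history:
        return {"tool": "check_calendar", "args": {"person": "Sarah"}}
    if history[-1]["tool"] == "check_calendar":
        slot = history[-1]["result"][0]          # take the first free slot
        return {"tool": "send_invite", "args": {"person": "Sarah", "slot": slot}}
    return {"final": f"Meeting booked: {history[-1]['result']}"}


def run_agent(goal, tools, planner, max_iterations=5):
    history = []                                  # observations so far
    for _ in range(max_iterations):               # stop condition: iteration cap
        action = planner(goal, history)           # reason about the next action
        if "final" in action:                     # planner returns a final answer
            return {"status": "done", "answer": action["final"], "steps": len(history)}
        tool = tools[action["tool"]]
        args = action["args"]
        if set(args) != set(tool.schema):         # reject calls that ignore the schema
            result = f"error: expected arguments {sorted(tool.schema)}"
        else:
            result = tool.run(**args)             # the system executes the tool
        history.append({"tool": tool.name, "args": args, "result": result})
    # Cap reached without a final answer: hand off instead of looping forever.
    return {"status": "handoff", "answer": None, "steps": len(history)}


# Hypothetical tools with canned results, for illustration only.
TOOLS = {
    "check_calendar": Tool(
        "check_calendar", "List free slots for a person.", {"person": str},
        lambda person: ["Tue 10:00", "Wed 14:00"],
    ),
    "send_invite": Tool(
        "send_invite", "Send a meeting invite.", {"person": str, "slot": str},
        lambda person, slot: f"{person} @ {slot}",
    ),
}

outcome = run_agent("book a meeting with Sarah next week", TOOLS, scripted_planner)
```

In a real system the planner call would go to an LLM with the tool descriptions and schemas in its prompt; the loop structure, the iteration cap, and the handoff on exhaustion stay the same.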
When to use it
Use when
- Multi-step tasks where the steps depend on prior results.
- Workflows that require external tool use (database queries, APIs, file operations).
- Repetitive tasks where the per-run cost is low and the volume is high.
Skip when
- Single-step tasks — they're prompts, not agent runs.
- Mission-critical decisions where unbounded autonomy is unacceptable.
- When a deterministic workflow would do the same job more reliably.
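For comparison, here is a rough sketch of what a deterministic workflow looks like when written as a plain state machine. Every transition is fixed in code, so there is no planner to get a step wrong. The states, the `COMPANY_DATA` lookup table, the in-memory `CRM` list, and the 50-employee threshold are all hypothetical and chosen only to keep the example self-contained.

```python
from enum import Enum, auto


class State(Enum):
    ENRICH = auto()
    SCORE = auto()
    UPDATE_CRM = auto()
    DONE = auto()
    HANDOFF = auto()


# Hypothetical stand-ins for an enrichment source and a CRM.
COMPANY_DATA = {"acme.example": 250}
CRM = []


def enrich(lead):
    employees = COMPANY_DATA.get(lead["domain"])
    if employees is None:
        return State.HANDOFF          # unknown company: route to a human
    lead["employees"] = employees
    return State.SCORE


def score(lead):
    lead["qualified"] = lead["employees"] >= 50   # assumed scoring rule
    return State.UPDATE_CRM


def update_crm(lead):
    CRM.append(dict(lead))            # store a copy of the lead record
    return State.DONE


HANDLERS = {State.ENRICH: enrich, State.SCORE: score, State.UPDATE_CRM: update_crm}


def run_workflow(lead):
    state = State.ENRICH
    while state in HANDLERS:          # DONE and HANDOFF have no handler, so the loop ends
        state = HANDLERS[state](lead)
    return state
```

If the real task fits this shape, with a known set of steps and known branch conditions, an agent adds cost and failure modes without adding capability.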
Key metrics
- Task completion rate.
- Number of iterations per successful run.
- Cost per run (tokens × price per token, plus any per-call tool costs).
- Error rate by category (tool failure, planner error, validation failure).
- Human-handoff rate.
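As a rough illustration, the metrics above can be computed from per-run log records. The record fields, the sample runs, and the prices below are invented for the example, and cost is modeled as token spend plus a flat fee per tool call, which is an assumption rather than a standard formula.

```python
from collections import Counter

# Hypothetical run records; a real system would read these from logs.
RUNS = [
    {"completed": True,  "iterations": 3, "tokens": 4200, "tool_calls": 2, "error": None, "handoff": False},
    {"completed": True,  "iterations": 5, "tokens": 7800, "tool_calls": 4, "error": None, "handoff": False},
    {"completed": False, "iterations": 5, "tokens": 9000, "tool_calls": 5, "error": "tool_failure", "handoff": True},
    {"completed": False, "iterations": 2, "tokens": 2000, "tool_calls": 1, "error": "validation_failure", "handoff": True},
]


def summarize(runs, price_per_1k_tokens, price_per_tool_call):
    n = len(runs)
    successes = [r for r in runs if r["completed"]]
    # Assumed cost model: token spend plus a flat fee per tool call.
    costs = [
        r["tokens"] / 1000 * price_per_1k_tokens + r["tool_calls"] * price_per_tool_call
        for r in runs
    ]
    return {
        "completion_rate": len(successes) / n,
        "iterations_per_success": (
            sum(r["iterations"] for r in successes) / len(successes) if successes else None
        ),
        "mean_cost_per_run": sum(costs) / n,
        "errors_by_category": dict(Counter(r["error"] for r in runs if r["error"])),
        "handoff_rate": sum(r["handoff"] for r in runs) / n,
    }


summary = summarize(RUNS, price_per_1k_tokens=0.01, price_per_tool_call=0.001)
```

Tracking iterations only over successful runs keeps failed runs that hit the cap from inflating the number.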
Examples
- The AI agent automates lead enrichment by querying three data sources and writing back to the CRM.
- An AI agent is a workflow with a brain — make sure the workflow is right first.
- The agent handles 80% of routine cases autonomously; the rest escalate to humans.
In practice at Makreate
Makreate builds AI agents that automate well-scoped, repeatable tasks — where the cost saved per run is greater than the model cost per run. A recent client wanted an agent to qualify inbound leads. We scoped the action space tightly (4 tools: enrichment lookup, ICP scoring, calendar check, CRM update), set a 5-iteration cap per lead, and validated every CRM write. The agent handles 73% of inbound leads autonomously at $0.12 per lead — with human review only on the 27% it routes for follow-up.
Common mistakes
- Building an autonomous agent before automating the deterministic version. Most "agent" problems are workflow problems.
- Too many tools. Fewer, well-described tools beat dozens of vague ones — the model picks better when the menu is short.
- No iteration cap. Runaway loops eat money and produce bad outputs.
- Skipping output validation. Agents can confidently write garbage.
- Optimising for autonomy when human handoff would be cheaper and safer.
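Output validation in particular is cheap to add. The sketch below checks an agent-proposed CRM update before anything is written and routes failures to a human. The field names (`email`, `icp_score`, `stage`), the allowed stages, and the 0 to 100 score range are hypothetical; a real schema would come from the CRM in use.

```python
ALLOWED_STAGES = {"new", "qualified", "disqualified"}   # assumed stage values


def validate_crm_update(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload passes."""
    problems = []
    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        problems.append("email missing or malformed")
    score = payload.get("icp_score")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)) or not 0 <= score <= 100:
        problems.append("icp_score must be a number between 0 and 100")
    if payload.get("stage") not in ALLOWED_STAGES:
        problems.append("stage is not an allowed value")
    return problems


def guarded_write(payload: dict, write) -> dict:
    """Call write(payload) only if validation passes; otherwise flag for handoff."""
    problems = validate_crm_update(payload)
    if problems:
        return {"written": False, "handoff": True, "problems": problems}
    write(payload)
    return {"written": True, "handoff": False, "problems": []}


# Usage with an in-memory list standing in for the CRM.
crm_rows = []
ok = guarded_write({"email": "sarah@acme.example", "icp_score": 82, "stage": "qualified"}, crm_rows.append)
bad = guarded_write({"email": "not-an-email", "icp_score": 140, "stage": "hot"}, crm_rows.append)
```

The point of the design is that the write function is never reached with an invalid payload, so a confident but wrong agent output becomes a handoff instead of bad data.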
Frequently asked
Agent or workflow?
Workflow if the steps are deterministic. Agent if the steps depend on intermediate results in ways you can't enumerate. Most production cases are workflows; agents are appropriate when the action space is genuinely open.
How many tools should an agent have?
Fewer than you think. 3–8 well-described tools usually beat 20 vague ones. Each additional tool dilutes the planner's ability to pick correctly.
How do I prevent runaway costs?
Iteration caps, token budgets per run, and tool-call rate limits. Monitor cost per successful task; alert when individual runs exceed thresholds.
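One way to enforce these three limits is a small budget object that the agent loop consults on every step. This is a sketch under assumptions: the class name `RunBudget`, its method names, and the one-minute sliding window for the rate limit are illustrative choices, not a library API.

```python
import time


class BudgetExceeded(Exception):
    """Raised when a run goes over one of its limits."""


class RunBudget:
    def __init__(self, max_iterations, max_tokens, max_tool_calls_per_minute,
                 clock=time.monotonic):
        self.max_iterations = max_iterations
        self.max_tokens = max_tokens
        self.max_tool_calls_per_minute = max_tool_calls_per_minute
        self.clock = clock            # injectable so the limiter can be tested
        self.iterations = 0
        self.tokens = 0
        self._call_times = []         # timestamps of recent tool calls

    def start_iteration(self):
        self.iterations += 1
        if self.iterations > self.max_iterations:
            raise BudgetExceeded("iteration cap reached")

    def add_tokens(self, count):
        self.tokens += count
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("token budget exhausted")

    def record_tool_call(self):
        now = self.clock()
        # Keep only calls made within the last 60 seconds.
        self._call_times = [t for t in self._call_times if now - t < 60]
        if len(self._call_times) >= self.max_tool_calls_per_minute:
            raise BudgetExceeded("tool-call rate limit hit")
        self._call_times.append(now)


# Example limits; the numbers are placeholders, not recommendations.
budget = RunBudget(max_iterations=5, max_tokens=20_000, max_tool_calls_per_minute=10)
```

The agent loop would call `start_iteration()` at the top of each pass, `add_tokens()` after each model response, and `record_tool_call()` before each tool execution, then catch `BudgetExceeded` to stop the run and hand off.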