This is the first episode in an ongoing series called But what exactly is ....
I usually learn better when I write things down. It forces me to find the right words, which makes the concept clearer. Without further ado, let's dive right in.
You'll hear people say agentic AI, AI agent, agentic assistant. The naming is a mess and nobody fully agrees. Think of it as a spectrum: a chatbot answers questions, an assistant helps when asked, and an agent goes off and does things on your behalf. Most of what's out there today sits somewhere in between. For this article, I'll just call it an agent and move on.
First contact
My introduction to large language models was through chat interfaces, namely ChatGPT. Having a software engineering background, I wanted to build my own chat interface, mainly to understand how the magic actually worked. I started playing around with OpenAI's chat completions endpoint, building my own ChatGPT. It was just API call after API call. No magic involved.
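That "API call after API call" can be sketched in a few lines. This is a hypothetical skeleton, not any particular product's code: `call_model` stands in for a real chat completions request and is stubbed here so the structure stays runnable without an API key.

```python
# Minimal sketch of the loop behind a ChatGPT-style interface.
# `call_model` is a stub standing in for a real chat completions
# request (e.g. POSTing the history to an LLM API).

def call_model(messages):
    # In reality: send the FULL message history and get a reply back.
    # The model is stateless; the history is the only memory it has.
    return f"(model reply to: {messages[-1]['content']})"

def chat_turn(messages, user_input):
    # One turn: append the user message, send everything, append
    # the assistant's reply. Every turn resends the whole history.
    messages.append({"role": "user", "content": user_input})
    reply = call_model(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "You are a helpful assistant."}]
chat_turn(history, "Hello!")
```

The key observation: the chat "session" is nothing but a list of messages that grows on your side and gets resent with every call.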
Later I learned that the real magic lies in how the models are created and trained, but that's another story, and shall be told another time.
Fly, you tools!
The chat completions API got more powerful when function calling, erm tool calling, was introduced. On top of the turn-based message exchange, the model could now return a special tool call instruction. You could use that to fetch additional information, interact with files, or call other APIs. Everything was orchestrated by the service that called the chat completions API.
This was great. Instead of just generating text, the LLM could now reply with structured intentions. Rather than being stuck with the User → Model → User → Model loop, we could do multi-step loops — User → Model → Tool → Model → Tool → Model → User. With the introduction of tool calling, all the building blocks for agentic AI were there. It still took a while to make the jump, though.
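The multi-step loop above can be sketched like this. Both the model and the single tool are stand-ins I made up for illustration; a real orchestrator would parse structured tool-call objects out of the API response instead of a plain dict.

```python
# Sketch of the orchestration loop tool calling enables:
# User → Model → Tool → Model → ... → User.

def get_time():
    # A toy tool the orchestrator can execute on the model's behalf.
    return "09:00"

TOOLS = {"get_time": get_time}

def fake_model(messages):
    # Stand-in for the chat completions API. First it "decides" to
    # call a tool; once a tool result is in the history, it answers.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_time", "args": {}}}
    return {"content": f"It is {messages[-1]['content']}."}

def run(messages):
    # Keep calling the model until it replies with plain text
    # instead of a tool call instruction.
    while True:
        reply = fake_model(messages)
        if "tool_call" in reply:
            call = reply["tool_call"]
            result = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": result})
        else:
            messages.append({"role": "assistant", "content": reply["content"]})
            return reply["content"]

answer = run([{"role": "user", "content": "What time is it?"}])
```

Notice that the loop lives entirely in the orchestrating service; the model only ever sees messages and emits either text or a tool call.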
But what exactly are AI agents?
From a conceptual perspective, an AI agent autonomously takes actions based on a high-level goal and remembers what it did across sessions. From a technical perspective, it's a while-loop with an LLM, some tools, and a growing context window.
So why did it take a while to get there? After all, the building blocks were all there. Well, function calling was not really reliable — hello hallucinated tool call, my old friend — context windows were rather small — wait, didn't I already fetch that information? — and who in their right mind would trust an LLM with access to their computer? And hey, have you seen the latest chatbot that I just built? Uh, shiny!
What does it take to turn our good ol' chatbot into an AI agent? Everything is already there. No secret sauce, no breakthrough invention. If you look at projects like OpenClaw, they all combine the same handful of ingredients.
My... precious...
When you start a new conversation with a chatbot, that session has no idea what you talked about in previous ones. Groundhog Day all over again. When we give the LLM memories, the agent is primed with context from past conversations, including what worked, what didn't, and what you care about. There are different types of memory, just as we have short- and long-term memory, but that is worth its own article.
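A deliberately naive sketch of long-term memory, assuming nothing fancier than a JSON file of notes prepended to the system prompt. Real agents use more elaborate schemes (summaries, vector stores), but the shape is the same: persist between sessions, load on startup.

```python
# Toy long-term memory: notes persisted to disk, injected into the
# system prompt of the next session.
import json
import pathlib
import tempfile

def recall(path):
    # Load all notes from previous sessions (empty list on first run).
    if path.exists():
        return json.loads(path.read_text())
    return []

def remember(path, note):
    # Append a note to the persistent memory file.
    notes = recall(path)
    notes.append(note)
    path.write_text(json.dumps(notes))

def build_system_prompt(path):
    # Prime the next session with what earlier sessions learned.
    prompt = "You are a helpful assistant."
    notes = recall(path)
    if notes:
        prompt += "\nThings you remember about the user:\n"
        prompt += "\n".join(f"- {n}" for n in notes)
    return prompt

path = pathlib.Path(tempfile.mkdtemp()) / "memory.json"
remember(path, "User prefers short answers")
prompt = build_system_prompt(path)
```

No more Groundhog Day: the next session starts already knowing the user prefers short answers.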
Look Mom... No hands
Tool calling gives the LLM the ability to "reach out". It can gather information from the web, run commands or control applications on your machine. Or read and reply to your emails. Or check your bank balance and go on a shopping spree. The more tools it has, the more autonomous it feels — and, yeah, the scarier it gets.
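"Reaching out" starts with telling the model which tools exist. A tool declaration is just a name, a description the model reads, and a JSON Schema for the arguments. The shape below follows OpenAI's function-calling format; `web_search` itself is a hypothetical tool, not a built-in.

```python
# Declaring a tool to a chat completions API. The model never executes
# this; it only uses the schema to emit well-formed tool calls, which
# the orchestrator then runs.
web_search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return the top results.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms."},
            },
            "required": ["query"],
        },
    },
}
```

The scary part from the paragraph above is visible right here: whatever you declare, the model may decide to call, so the tool list is effectively the agent's permission boundary.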
Wake up, Neo ...
While you start the interaction with a classical chatbot yourself, AI agents can also be invoked on a schedule. Want to read a morning digest of what's happening in the world on your way to work? A scheduler invokes the agent at a predefined time, and it can look up the latest news and have an overview ready for you. It doesn't have to be just mornings. You define the heartbeat of your agent.
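A bare-bones heartbeat could look like this. `run_agent` is a placeholder for whatever produces the digest; the only real logic is computing how long to sleep until the next tick.

```python
# Minimal daily scheduler: wake up at a fixed time and hand the
# agent a standing instruction.
import datetime
import time

def seconds_until(hour, minute, now=None):
    # Seconds until the next occurrence of hour:minute.
    now = now or datetime.datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += datetime.timedelta(days=1)  # already past today; take tomorrow
    return (target - now).total_seconds()

def heartbeat(run_agent, hour=7, minute=0):
    # Infinite loop: sleep until the next tick, then invoke the agent.
    while True:
        time.sleep(seconds_until(hour, minute))
        run_agent("Summarize today's news into a morning digest.")
```

In practice a cron entry or a systemd timer does the same job more robustly; the point is that the trigger no longer has to be you typing into a textbox.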
Everything Everywhere All at Once
Ultimately it's also about how you interact with it. Instead of a textbox in a browser, you use your favorite chat application. The agent meets you where you already are — always there, a fingertip and a notification away.
We need to cook!
Now that we have our building blocks conceptually ready, we want to create our own AI agent. In the next installment, we will start simple, with a basic chatbot, and work our way up from there. It won't be OpenClaw in the end, but the goal is to show you how the rabbit gets pulled out of the hat, not to put on a full Copperfield magic show.
Until next time.
P.S.: If you have questions in the meantime, feel free to reach out. I'm always happy to help and support where I can!