
A busy person’s Intro to AI Agents

Spring 2024 is here, and it’s not just the flowers that are blooming. It’s the season of AI agent frameworks, each one promising to “disrupt everything” more than the last. You can’t scroll through your feed without stumbling upon a flashy demo on a GitHub repo with thousands of stars, all gained overnight.

These repos promise that anyone, even your grandma who still uses Internet Explorer, can now build an entire app from a single prompt. And for some reason, a large majority of these demos are usually some variation of the classic snake game. On the other end of the spectrum are people with “exclusive” access or information that they “can’t share yet.” They tease that very soon, [insert-any-industry-here] is about to change forever. But they can’t tell you how or why. Yet.

Although this type of excitement isn’t rare in the AI world, not every piece of AI-related news gets equal attention. In fact, only a few announcements truly capture everyone’s interest. It’s always the new model releases from giants like OpenAI or Anthropic. And, of course, after the release of AutoGPT, AI agents have proven that they too can steal the spotlight.

However, unlike OpenAI’s model releases, which usually garner positive reactions from the general public, AI agents have a very polarizing effect. People split into two camps. The first camp is either terrified of AI agents, already envisioning a future where they’re replaced by Terminator-like robots, or wholeheartedly convinced that AI agents will make them rich by boosting their productivity enough to launch a super profitable startup. Yes, you might be surprised, but these two belong to the same camp: people who, at least slightly, overvalue AI agents.

On the other hand, there are people who simply choose to ignore AI agents because they don’t see how they differ from, let’s say, chatting with ChatGPT or using an app built on top of LangChain with simple RAG. For them, this is all just hype fueled by greedy influencers and even greedier companies.

So, who is right? What exactly are AI agents, and what can you really do with them? Let’s explore all of that in this article. Don’t worry if you’re not a machine learning expert. This is a gentle introduction to AI agents for everyone who’s interested.

The Roots of Rationality

At its core, an agent is just a fancy word for anything that can take action, whether it’s a human, an animal, or even a machine. The idea of intelligent agents has been around for centuries, but it’s only recently that we’ve started seeing them everywhere. Mostly thanks to debates around self-driving cars.

The story actually begins way back in ancient Greece with the one and only Aristotle. I won’t pretend that some of his beliefs weren’t… a bit off. Like his belief that men have more teeth than women, and that the more teeth you have, the longer you live.

Aristotle’s thoughts on what it means to “act” or “achieve” goals were far more helpful: “We deliberate not about ends but about means.” In other words, being rational isn’t about choosing your goals, but about figuring out the best way to achieve them.

This simple idea kicked off centuries of debate about rationality and laid the groundwork for the AI agents we know and love today. Several centuries later (in the 9th century, to be more precise), Al-Khwarizmi popularized step-by-step problem solving and became known as the “father of algebra”. In the 12th century, someone in the West was given the task of translating his work “On the Calculation with Hindu Numerals” into Latin. Whoever the translator was, they clearly didn’t know what to do with the author’s name and rendered “Al-Khwarizmi” as “Algoritmi”. And that’s how we got the word algorithm.

Fast Forward

Fast forward to the 13th century, when a Spanish philosopher named Ramon Llull had a wild idea. He created a “machine” with spinning paper wheels covered in symbols, hoping it could represent fundamental “truths” or “laws” of the world. Some even consider him a forefather of computation theory.

Llull’s contraption didn’t quite take off, but it planted the seed for “computation”. A few centuries later, in the 1600s, the legendary mathematician Blaise Pascal took things to the next level with the world’s first calculator. Suddenly, machines could crunch numbers faster than any human — a crucial step towards the age of intelligent machines. To be fair, when I say “crunch numbers”, I mean this machine could only do addition and subtraction, which was still quite impressive at the time.

The next big leap came in the 1800s with Ada Lovelace, a mathematical prodigy who saw the true potential of computing. She wrote the first-ever computer program for Charles Babbage’s Analytical Engine, a steam-powered computer that was way ahead of its time. Although the Engine never fully lived up to its potential, Lovelace’s vision of machines that could handle complex tasks set the stage for the AI revolution.

The Rise of AI

The 1950s marked the birth of artificial intelligence as we know it. In 1950, Alan Turing, the father of computer science, published a groundbreaking paper that asked the big question: “Can machines think?” Many thought the answer was: NO. To prove the skeptics wrong, Turing proposed an imitation test (now known as the Turing Test) in which a machine tries to fool a human into thinking it’s also human through ordinary conversation.

A few years later, a group of scientists got together at Dartmouth College for a summer workshop that would change the world. They set out to build machines that could think like us. This historic meeting, led by computer scientist John McCarthy, kicked off the field of AI. They believed they just needed 2 months and 10 men to build a “smart” machine.

Their approach, known as symbolic AI, was all about representing knowledge through abstract symbols and manipulating them according to strict rules, kind of like a super-advanced version of Aristotle’s logic. McCarthy and his pals believed that by combining enough of these symbols and rules, they could create machines that could reason, plan, and solve problems just like humans. This approach led to some pretty impressive systems in the ’60s and ’70s, like Dendral, which could identify unknown molecules, and MYCIN, which could interpret lab results and recommend treatments.

However, symbolic AI soon ran into some roadblocks. Turns out, the real world is a messy, complicated place that doesn’t always fit neatly into strict logical rules. Imagine trying to write down every single rule for making a sandwich! As symbolic AI tackled more ambitious problems, its limitations became clearer. In the late 1960s and early 1970s, the field hit a rough patch known as the “first AI winter.” Funding dried up, progress slowed, and people started to lose faith in the grand promises of human-like AI. It was clear that symbolic logic alone wasn’t going to cut it — the world needed a new approach.

Embracing Uncertainty

As the limitations of symbolic AI became clearer in the 1970s, researchers started exploring new ways to handle the uncertainty and complexity of the real world. Two key ideas emerged during this time: the use of probability and the rise of machine learning.

Let’s start with probability. In the 1980s, Bayesian networks hit the scene, allowing AI systems to “reason” about uncertainty using the language of probability. Instead of relying on strict logical rules, these networks could learn from data and make educated guesses when faced with incomplete information.

Meanwhile, machine learning was also making a comeback. In the 1980s, a new training technique called backpropagation breathed new life into neural networks, allowing them to learn complex patterns from data.

This shift towards probabilistic and learning-based approaches changed the game for AI agents. Instead of just reasoning with abstract symbols, agents could now learn from experience and adapt to new situations. It was like going from a rigid set of instructions to a flexible, ever-evolving understanding of the world.

This new paradigm powered breakthroughs in two key areas of machine learning: reinforcement learning and deep learning. Reinforcement learning is all about teaching agents to make smart decisions through trial and error, kind of like training a puppy with treats.
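The “trial and error” idea can be sketched with a toy example: an agent repeatedly pulls one of two slot-machine levers, keeps a running estimate of how rewarding each one is, and mostly picks the best-looking lever while occasionally exploring. Everything here (the hidden reward probabilities, the exploration rate) is invented purely for illustration:

```python
import random

# Toy "environment": two levers with hidden reward probabilities.
# The agent doesn't know these numbers; it must discover them.
REWARD_PROBS = [0.3, 0.7]

def pull(lever: int) -> float:
    """Return a reward of 1.0 with the lever's hidden probability, else 0.0."""
    return 1.0 if random.random() < REWARD_PROBS[lever] else 0.0

def run_bandit(steps: int = 5000, epsilon: float = 0.1, seed: int = 0) -> list[float]:
    random.seed(seed)
    estimates = [0.0, 0.0]   # agent's running estimate of each lever's value
    counts = [0, 0]
    for _ in range(steps):
        # Trial and error: mostly exploit the best-looking lever,
        # but explore a random one with probability epsilon.
        if random.random() < epsilon:
            lever = random.randrange(2)
        else:
            lever = max(range(2), key=lambda i: estimates[i])
        reward = pull(lever)
        counts[lever] += 1
        # Incremental average: nudge the estimate towards the observed reward.
        estimates[lever] += (reward - estimates[lever]) / counts[lever]
    return estimates

print(run_bandit())  # the estimate for lever 1 ends up higher than for lever 0
```

After a few thousand pulls, the agent’s estimates converge towards the hidden probabilities — no rules were written down; it simply learned from rewards, which is the core intuition behind reinforcement learning.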

Deep learning, on the other hand, uses neural networks with many layers to learn rich, detailed representations of data, allowing agents to tackle complex tasks like image recognition and natural language processing.

These breakthroughs led to an expanded definition of AI agents. It was no longer only about “reaching a goal successfully”. The new definition included an environment that an agent perceives, acts in, and learns about the world from.



So, what can AI agents actually do?

To clarify, this article focuses on Agents that use Large Language Models (LLMs) as their “brain”. While there are various types of agents, such as multimodal and visual agents, LLMs stand out due to their special capabilities.

Regardless of whether they are closed or open source, all LLMs possess varying levels of “reflection” and “common-sense reasoning” abilities, with some outperforming others. These crucial capabilities enable LLM agents to make plans, engage in self-reflection, and continuously refine themselves, all stemming from the unique properties of LLMs.
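The plan–reflect–refine pattern is easier to see in code. Below is a minimal, hypothetical sketch: `fake_llm` stands in for a real model call (in practice this would be an API request), and its canned replies are invented so the example runs on its own. The prompts and function names are assumptions, not any particular framework’s API:

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; replies are hard-coded for illustration."""
    if prompt.startswith("DRAFT:"):
        return "v1 answer"
    if prompt.startswith("CRITIQUE:"):
        # Pretend the model finds a flaw in its first draft but approves the revision.
        return "needs work" if "v1" in prompt else "OK"
    if prompt.startswith("REVISE:"):
        return "v2 answer"
    return ""

def self_refine(task: str, max_rounds: int = 3) -> str:
    # 1. Draft an answer, 2. critique it, 3. revise until the critique passes.
    answer = fake_llm(f"DRAFT: {task}")
    for _ in range(max_rounds):
        critique = fake_llm(f"CRITIQUE: {task}\n{answer}")
        if critique == "OK":          # the model is satisfied with its own answer
            break
        answer = fake_llm(f"REVISE: {task}\n{answer}\n{critique}")
    return answer

print(self_refine("explain AI agents"))  # → "v2 answer"
```

The loop is the whole trick: the same model drafts, criticizes, and revises its own output, which is what “self-reflection” means in practice for LLM agents.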

Beyond an LLM’s intrinsic abilities, there are five other important characteristics of an agent:

  • Autonomy. Agents can perform tasks independently, making decisions and taking actions without constant human intervention. However, it is ideal to have a human in the loop to maintain control and guide agents towards their objectives.

  • Memory. Adding memory to an agent allows personalization, enabling it to understand and adapt to individual preferences. And as our preferences evolve throughout our lives, an agent with memory can learn and adjust. This is essential for building long-term relationships between agents and users.

  • Reactivity. To interact with their environment, agents must be able to perceive and process the available information. This reactivity enables agents to respond to changes, make informed decisions, and provide relevant outputs based on the input they receive. By analyzing and interpreting the data within their environment, agents can offer context-aware help.

  • Proactivity. Agents are not only capable of planning, writing tasks, and prioritizing, but they can also take proactive steps to accomplish those tasks using tools, such as searching the internet, scraping Reddit, or running a code interpreter. At the moment, this is mostly done through API calls and function calling.

  • Social ability. Agents can collaborate with other agents or humans, delegate work, and stick to their defined roles in a conversation. This social ability enables agents to work collectively towards common goals, distribute workloads, and maintain coherent communication.
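The function-calling mechanism behind proactivity is surprisingly simple at its core: the model emits a structured tool call, and the host program looks up the matching function and executes it. Here is a minimal sketch with two toy tools; the tool names, the JSON shape, and the “model decision” are all invented for illustration, not any specific vendor’s format:

```python
import json

# Hypothetical tools an agent might expose; both are toy stand-ins.
def search_web(query: str) -> str:
    return f"results for '{query}'"        # a real tool would hit a search API

def run_python(code: str) -> str:
    return str(eval(code))                 # toy interpreter; never do this in production

TOOLS = {"search_web": search_web, "run_python": run_python}

def handle_tool_call(message: str) -> str:
    """Dispatch a model-emitted tool call, encoded here as JSON.

    Real function calling works similarly: the model returns a structured
    {"name": ..., "arguments": ...} object and the host program executes it,
    then feeds the result back into the conversation.
    """
    call = json.loads(message)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Pretend the model decided it needs to compute something:
print(handle_tool_call('{"name": "run_python", "arguments": {"code": "2 + 2"}}'))  # → 4
```

The important design point is that the LLM never runs anything itself — it only *requests* actions, and the surrounding program decides whether and how to execute them, which is also where the human-in-the-loop control mentioned above fits in.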
