Agentic AI is the talk of the town and every company wants a piece of it. Here’s my understanding of the topic and how it can be useful to you.
What is Agentic AI?
In short, Agentic AIs are systems designed to operate autonomously, make decisions, and perform tasks without human intervention. These systems have agency, meaning they have the ability to act independently based on their programming, learning, and objectives. This is a significant concept in AI because it involves creating an AI that can:
- Perceive the environment: Agentic AI can gather data from its surroundings using inputs or sensors to understand the environment.
- Reason and plan: It can process information, analyze solutions, and make plans to execute tasks to achieve a set goal.
- Take actions: Based on its reasoning, Agentic AI can execute actions to influence its environment and achieve a desired outcome.
- Learn and adapt: These systems can improve their performance over time by learning from their experiences.
Agentic AIs are used in various applications like autonomous vehicles, robotic systems, and more.
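To make the four capabilities above concrete, here is a minimal sketch of a perceive, plan, act, learn loop. Everything in it is made up for illustration: the `ThermostatAgent` class, the dictionary "environment", and the numbers are assumptions, and the "learn" step only records experience rather than doing any real learning.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Toy agent (hypothetical) that nudges a room temperature toward a goal."""
    target: float
    step: float = 1.0
    history: list = field(default_factory=list)

    def perceive(self, environment: dict) -> float:
        # Perceive: read the current state from the environment.
        return environment["temperature"]

    def plan(self, reading: float) -> str:
        # Reason and plan: pick the action that moves toward the goal.
        if abs(reading - self.target) < self.step:
            return "hold"
        return "heat" if reading < self.target else "cool"

    def act(self, environment: dict, action: str) -> None:
        # Take actions: change the environment.
        if action == "heat":
            environment["temperature"] += self.step
        elif action == "cool":
            environment["temperature"] -= self.step

    def learn(self, reading: float, action: str) -> None:
        # Learn and adapt: a placeholder that only records experience.
        self.history.append((reading, action))


def run(agent: ThermostatAgent, environment: dict, max_steps: int = 20) -> dict:
    """Repeat the loop until the agent decides to hold or runs out of steps."""
    for _ in range(max_steps):
        reading = agent.perceive(environment)
        action = agent.plan(reading)
        agent.learn(reading, action)
        if action == "hold":
            break
        agent.act(environment, action)
    return environment


room = {"temperature": 17.0}
agent = ThermostatAgent(target=21.0)
run(agent, room)
print(room["temperature"], len(agent.history))
```

Nobody tells the agent which action to take at each step; it keeps looping on its own until the goal is met, which is the "agency" part.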
Non-agentic vs Agentic AI
We are familiar with using LLMs like ChatGPT and Claude where we directly open the chat interface and start asking questions. This is called the “zero-shot prompting” approach where we do not give any examples to the LLM on how to respond to a given query.
A non-agentic workflow is usually a zero-shot one: you ask the model, for example, to write an essay on a topic X in one go, from start to finish, without ever hitting backspace to correct itself. Tasks like this are hard for humans; we cannot write an essay without hitting backspace or trying to improve it as we type. Yet non-agentic systems do these tasks well, considering they do them in a single pass.
In contrast, the agentic workflow is very iterative. Figure 1 shows exactly this.
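The difference between the two workflows can be sketched in a few lines. This is only an illustration under loud assumptions: `fake_generate`, `fake_critique`, and `fake_revise` are stand-ins for LLM calls that I invented, and the "essay" is a one-line string with planted typos so that the loop has something to fix.

```python
# Hypothetical "mistakes" the fake model makes, and their corrections.
TYPOS = {"teh": "the", "hear": "here"}


def fake_generate(prompt: str) -> str:
    # Stand-in for a single LLM call that writes a first draft.
    return "teh essay about " + prompt + " starts hear"


def fake_critique(draft: str) -> list:
    # Stand-in for asking the model to review its own draft.
    return [word for word in draft.split() if word in TYPOS]


def fake_revise(draft: str, feedback: list) -> str:
    # Stand-in for a revision call; it fixes one reported issue per round.
    bad = feedback[0]
    return " ".join(TYPOS[w] if w == bad else w for w in draft.split())


def zero_shot(prompt, generate):
    """Non-agentic: one pass, no chance to go back and fix anything."""
    return generate(prompt)


def agentic(prompt, generate, critique, revise, max_rounds=3):
    """Agentic: draft, review, revise, and repeat until the review is clean."""
    draft = generate(prompt)
    for _ in range(max_rounds):
        feedback = critique(draft)
        if not feedback:
            break
        draft = revise(draft, feedback)
    return draft


print(zero_shot("agents", fake_generate))
print(agentic("agents", fake_generate, fake_critique, fake_revise))
```

The zero-shot call returns the draft with its mistakes intact, while the loop gets to "hit backspace" as many times as `max_rounds` allows.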
How do LLMs perform in an agentic workflow?
The best way to compare an LLM in a zero-shot (non-agentic) vs agentic workflow is by benchmarking it. OpenAI released HumanEval, a code-generation benchmark dataset that is used extensively in the industry to measure LLM performance.
In Figure 2, you can see that GPT-3.5’s zero-shot performance on HumanEval is not great, but when wrapped in an agentic workflow, it outperforms zero-shot GPT-4. The improvement in GPT-3.5 when wrapped in an agentic workflow dwarfs the zero-shot improvement from GPT-3.5 to GPT-4!
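For context on what such a score means: HumanEval judges generated code by running it against unit tests, not by comparing text. Below is a rough sketch of that idea. The two problems and the two "model outputs" are toy examples I made up (they are not from the dataset), and the harness only loosely follows the benchmark's `check(candidate)` convention. A real harness would also sandbox the `exec` calls, since running untrusted generated code is unsafe.

```python
def passes(candidate_source: str, test_source: str, entry_point: str) -> bool:
    """Return True if the generated function passes every assertion in the test."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # define the generated function
        exec(test_source, namespace)       # define check(candidate)
        namespace["check"](namespace[entry_point])
        return True
    except Exception:
        # Any failed assertion, crash, or syntax error counts as a failure.
        return False


# Toy problems (hypothetical), each with an entry point and a unit test.
problems = [
    {
        "entry_point": "add",
        "test": "def check(candidate):\n    assert candidate(2, 3) == 5\n",
    },
    {
        "entry_point": "is_even",
        "test": (
            "def check(candidate):\n"
            "    assert candidate(4) is True\n"
            "    assert candidate(3) is False\n"
        ),
    },
]

# Toy "model outputs" (hypothetical): the second one is deliberately buggy.
candidates = {
    "add": "def add(a, b):\n    return a + b\n",
    "is_even": "def is_even(n):\n    return n % 2 == 1\n",
}

results = {
    p["entry_point"]: passes(candidates[p["entry_point"]], p["test"], p["entry_point"])
    for p in problems
}
pass_rate = sum(results.values()) / len(results)
print(results, pass_rate)
```

An agentic workflow raises this kind of score because a failed test is not the end of the run; the failure can be fed back to the model for another attempt.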
Example: LandingAI’s Vision Agent
Suppose you have a drone shot of a few surfers in the sea. This area of the sea is particularly dangerous because surfers often encounter sharks. The sharks are clearly visible in the drone shot. Given a prompt to mark the distance between the sharks and the people in a given video, the vision agent first divides the task into a sequence of steps.
It then retrieves tools (function calls). These tools are nothing but Python functions in this particular case, where the language of choice is Python.
Based on these steps, we get fully autonomously generated code that, when run, produces the distance-marked video. The following is a simplified overview of the entire process:
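To show what "tools are just Python functions" can look like, here is a small sketch. These are not Vision Agent's actual tools: `box_center`, `pixel_distance`, the `TOOLS` registry, and the keyword-based `retrieve_tools` are all hypothetical names I chose, and the bounding boxes are invented pixel coordinates.

```python
import math


def box_center(box):
    """Return the (x, y) center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def pixel_distance(box_a, box_b):
    """Return the straight-line distance in pixels between two box centers."""
    ax, ay = box_center(box_a)
    bx, by = box_center(box_b)
    return math.hypot(bx - ax, by - ay)


# Hypothetical registry: tool name -> (function, keywords used for retrieval).
TOOLS = {
    "box_center": (box_center, {"center", "centre"}),
    "pixel_distance": (pixel_distance, {"distance", "far"}),
}


def retrieve_tools(step: str) -> list:
    """Naive retrieval: return the tools whose keywords appear in the step text."""
    words = set(step.lower().split())
    return [name for name, (_, keywords) in TOOLS.items() if words & keywords]


shark_box = (0, 0, 10, 10)      # made-up detection, center at (5, 5)
surfer_box = (30, 40, 40, 50)   # made-up detection, center at (35, 45)

print(retrieve_tools("measure the distance between shark and surfer"))
print(pixel_distance(shark_box, surfer_box))  # 50.0
```

A real agent would likely retrieve tools by matching the step against tool descriptions in a smarter way than keyword overlap, and would convert pixels into real-world units.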
How Vision Agent Works
The coder agent’s job is to take the prompt and return the code to achieve the goal. First, the planner builds a plan that lists out the steps needed to complete the task (as in Figure 3). Once all the steps are listed out, it retrieves the needed tools for each step. Finally, it generates the code.
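The plan, retrieve, generate sequence described above can be sketched as a small pipeline. Treat this as a toy: in a real system each stage would be an LLM call, while here `plan` returns a hard-coded list, and the tool names in `TOOL_INDEX` (`detect_objects`, `compute_distance`, `overlay_text`) are hypothetical placeholders, not real library functions.

```python
from dataclasses import dataclass


@dataclass
class Step:
    description: str
    tool: str = ""


def plan(prompt: str) -> list:
    # Hypothetical planner: a real one would ask an LLM to break down the prompt.
    return [
        Step("detect sharks and surfers in each frame"),
        Step("compute distance between each shark and surfer"),
        Step("draw the results on the video"),
    ]


# Hypothetical keyword -> tool-name index used for retrieval.
TOOL_INDEX = {
    "detect": "detect_objects",
    "distance": "compute_distance",
    "draw": "overlay_text",
}


def retrieve_tool(step: Step) -> str:
    """Return the first tool whose keyword appears in the step description."""
    for keyword, tool in TOOL_INDEX.items():
        if keyword in step.description:
            return tool
    return ""


def generate_code(steps: list) -> str:
    """Assemble a script that calls one tool per planned step, in order."""
    lines = ["def solve(video):"]
    for step in steps:
        lines.append(f"    # {step.description}")
        lines.append(f"    video = {step.tool}(video)")
    lines.append("    return video")
    return "\n".join(lines)


def coder_agent(prompt: str) -> str:
    steps = plan(prompt)
    for step in steps:
        step.tool = retrieve_tool(step)
    return generate_code(steps)


generated = coder_agent("mark the distance between the sharks and the surfers")
print(generated)
```

The output is a string of source code, which is exactly what gets handed to the next agent for testing.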
The tester agent’s job is to test the code generated by the coder agent. The generated code from the coder agent is passed to the tester agent, which then executes the test code. It can write these tests by itself by looking at the coder-generated code. If a test fails, the tester feeds the output back to the coder to fix the issue. These kinds of agentic workflows can iterate and correct themselves.
Note: Currently, the tester agent in Vision Agent can only write tests for type checking.
Limitations and Failure Examples
Should I use Agentic AI going forward?
In short, if you are okay with the possible limitations, then yes! Even if you don’t code, any task that is tedious and repetitive is a great place to start testing these workflows. So I would suggest trying it out and making the decision for yourself!
Some agentic platforms to start with are CrewAI, AutoGen, and LangGraph. CrewAI is open source and free to get started with!
Conclusion
Agentic AI is heading towards domination in the NLP and LLM domain. It is more flexible, iterative, and performs much better than LLMs on their own. It’s not apparent what the de facto method of working with AI is going to be five years from now, but it is clear that the industry is moving towards agents in general.
Thanks a lot for reading! See you in the next one.
Note: The slides on this page do not belong to me. They are from this keynote by Andrew Ng during Snowflake Dev Day.