On Being an AI Enabler: Anthropic MCP, Google A2A, and IBM ACP
Figuring Out AI Agents and Communication Protocols Together, Because You Don’t Need a PhD to Win with AI
A lot has been said recently about AI, and many people are eager to claim the title of expert or thought leader. To be an expert in AI today, you supposedly need profound experience building the latest and greatest technology of our time: GenAI. Yet many of us don’t get to work in research or build these generative models on a daily basis.
So, if you don’t have a PhD in GenAI and aren’t planning to finish one tomorrow, this series of articles might be for you!
In this series, I am going to break down concepts that will help you become an AI enabler, so that one day you can say: “My expertise lies in bridging AI research with scalable, production-grade systems, enabling organizations to turn AI potential into business impact.”
So, who is an AI enabler? An AI enabler is someone who helps organizations turn AI potential into real business impact, often combining software expertise with business thinking.
To become an AI enabler, you first need to understand the core concepts and the new terminology.
In this post, you will learn about AI agents and agent communication protocols.
🤖 What is an AI Agent?
An AI Agent is an autonomous system designed to perceive its environment, make decisions, and take actions to achieve specific goals—often with minimal human intervention.
Unlike traditional AI models that respond to single prompts (like ChatGPT in a basic Q&A mode), AI Agents can plan, reason, and execute multi-step tasks by interacting with tools, data sources, APIs, or even other agents.
Sounds abstract? Yes, because it is. Essentially, from a software engineer’s point of view, these are programs that gain autonomy and leverage AI models to build an execution plan. Their input is a goal and a set of tools.
Let’s have a look at a practical example.
Let's say our prompt is directly related to business revenue. We can ask the agent a simple question:
“Give me a prediction for Q3 revenue for product XYZ”
When our agent wakes up, it will first look for tools under the /tools directory. This directory holds guiding files that let the agent assess what is within its capabilities. For example:
/tools/list
/tools/list/Planner
/tools/list/GenSQL
/tools/list/ExecSQL
/tools/list/Judge
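To make the discovery step less abstract, here is a minimal sketch in Go. Everything about it is an assumption for illustration: the layout above isn’t a standard, and the idea that each entry under /tools/list is a small JSON descriptor with a name and description is mine, not part of any protocol.

```go
// A minimal discovery sketch: on startup, the agent scans its tools
// directory and loads one JSON descriptor per tool. The layout and the
// descriptor fields are illustrative assumptions, not a standard.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// ToolDescriptor is a hypothetical "guiding file" describing one capability.
type ToolDescriptor struct {
	Name        string `json:"name"`        // e.g. "Planner", "GenSQL"
	Description string `json:"description"` // what the tool does, shown to the model
}

func discoverTools(dir string) ([]ToolDescriptor, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var tools []ToolDescriptor
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		data, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			return nil, err
		}
		var t ToolDescriptor
		if err := json.Unmarshal(data, &t); err != nil {
			return nil, err
		}
		tools = append(tools, t)
	}
	return tools, nil
}

func main() {
	tools, err := discoverTools("/tools/list")
	if err != nil {
		log.Fatal(err)
	}
	for _, t := range tools {
		fmt.Printf("found tool %s: %s\n", t.Name, t.Description)
	}
}
```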
You can also look at it based on this diagram:
The main agent receiving the prompt acts as a controller. The controller handles discovery and flow management, and it communicates directly with its tools and other agents.
Here, the controller’s first call is to the planner; the planner returns an execution plan that is reviewed by the judge. The controller then executes the plan, leveraging GenSQL followed by ExecSQL. The final results are sent to the judge, which decides whether they are good enough or provides feedback so the planner can revise and rerun the plan. As you can imagine, there are multiple events and messages flowing between the controller and the rest of the agents. This is what we will refer to as communication, which raises the questions: what are the existing agent communication standards, and how do they impact AI agent architecture?
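If you prefer code over boxes and arrows, here is a hypothetical Go sketch of that controller loop. The Planner, Judge, SQLGen, and SQLExec interfaces are stand-ins I made up for the tools above; they are not from any framework.

```go
// A hypothetical controller loop for the revenue-prediction example.
// Each interface is a stand-in for one of the tools/agents described above.
package agent

import "fmt"

type Planner interface{ Plan(goal, feedback string) []string }
type Judge interface{ Approve(result string) (ok bool, feedback string) }
type SQLGen interface{ Generate(step string) string }
type SQLExec interface{ Execute(query string) string }

// RunController asks the planner for a plan, executes each step via
// GenSQL/ExecSQL, and lets the judge either accept the final result or
// send feedback that triggers a re-plan.
func RunController(goal string, p Planner, j Judge, gen SQLGen, exec SQLExec) string {
	feedback := ""
	for attempt := 0; attempt < 3; attempt++ { // cap the replan loop
		// (In the flow above, the judge would also review the plan itself
		// before execution; that check is omitted here for brevity.)
		var result string
		for _, step := range p.Plan(goal, feedback) {
			query := gen.Generate(step)  // GenSQL: plan step -> SQL
			result = exec.Execute(query) // ExecSQL: SQL -> data
		}
		ok, fb := j.Approve(result)
		if ok {
			return result
		}
		feedback = fb // feed the judge's critique back into planning
	}
	return fmt.Sprintf("gave up on goal %q after 3 attempts", goal)
}
```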
So, what is this all about? Let’s dive into AI agent communication protocols.
There is a big battle happening in the industry right now over the right way to standardize this communication. How do we enable easy tool access, easy agent-to-agent communication, and easy handling of human interaction?
You might have heard about MCP, A2A, and ACP. But what are those?!
MCP - Model Context Protocol
Invented by Anthropic. It is designed to standardize how AI agents and models manage, share, and utilize context across tasks, tools, and multi-step reasoning. It follows a client-server architecture: clients are the AI applications that request information from servers, and the servers provide access to external resources.
For example, imagine that an AI assistant needs access to real-time flight information. With MCP, the application doesn’t need to know how to access the flight information; it simply requests it from the server. Let’s assume the information is saved in a Kafka topic.
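On the wire, MCP messages are JSON-RPC 2.0, and tool invocations go through the protocol’s tools/call method. The sketch below builds such a request in Go; the tool name search_flights and its arguments are invented for this flight example, and only the envelope follows the protocol.

```go
// Sketch of the JSON-RPC 2.0 envelope an MCP client sends to invoke a tool.
// "tools/call" is MCP's method name; the tool ("search_flights") and its
// arguments are made up for this flight-information example.
package mcpclient

import "encoding/json"

type toolCall struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// flightRequest serializes a hypothetical flight-search tool call.
func flightRequest() ([]byte, error) {
	return json.Marshal(toolCall{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "tools/call",
		Params: toolCallParams{
			Name: "search_flights",
			Arguments: map[string]any{
				"from": "TLV",
				"to":   "JFK",
				"date": "2025-07-01",
			},
		},
	})
}
```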
What we can do is store all the data in Kafka topics, build a dedicated Kafka MCP server, and have Claude act as our MCP client. Let’s take a look at Akan’s example.
In this example, Akan asked Claude to connect to his Kafka broker with a plain natural-language query.
Behind the scenes, his client sends the request to the server, which translates it to the relevant Kafka function to run.
The response from the server could have been an error, but such errors are translated into text that is easier for Akan to understand. In our case, there were no available topics, so the client went a step further and asked whether Akan would like to create a topic with a dedicated number of partitions and a replication factor (like a true Kafka hero!).
From here onwards, Akan can ask to create a ‘countries’ topic and later describe the Kafka topic.
You can check out Akan’s mcp-kafka project here.
On the server side, you need to define what the server can do. In the above project, this lives in the handler.go file, which holds the list of functions the server can handle and execute, CreateTopic among them.
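For readers who don’t want to click through, here is a rough sketch of the shape such a handler takes. This is not Akan’s actual code: the KafkaAdmin interface and the parameter names are stand-ins I made up; only the general pattern (decode JSON arguments, run the Kafka operation, return a human-readable result) reflects what a tool handler does.

```go
// Rough sketch of a CreateTopic tool handler; the KafkaAdmin interface and
// parameter names are illustrative stand-ins, not the mcp-kafka project's code.
package handler

import (
	"context"
	"encoding/json"
	"fmt"
)

// KafkaAdmin stands in for whatever Kafka admin client the server wraps.
type KafkaAdmin interface {
	CreateTopic(ctx context.Context, name string, partitions, replication int) error
}

// CreateTopicParams mirrors the JSON arguments sent by the MCP client.
type CreateTopicParams struct {
	Topic             string `json:"topic"`
	Partitions        int    `json:"partitions"`
	ReplicationFactor int    `json:"replication_factor"`
}

// HandleCreateTopic decodes the tool-call arguments, runs the Kafka operation,
// and returns a human-readable result the model can relay back to the user.
func HandleCreateTopic(ctx context.Context, admin KafkaAdmin, raw json.RawMessage) (string, error) {
	var p CreateTopicParams
	if err := json.Unmarshal(raw, &p); err != nil {
		return "", fmt.Errorf("invalid arguments: %w", err)
	}
	if err := admin.CreateTopic(ctx, p.Topic, p.Partitions, p.ReplicationFactor); err != nil {
		return "", fmt.Errorf("creating topic %q: %w", p.Topic, err)
	}
	return fmt.Sprintf("topic %q created with %d partitions (replication factor %d)",
		p.Topic, p.Partitions, p.ReplicationFactor), nil
}
```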
While this is a concrete example with a widely adopted technology, Apache Kafka, Anthropic generalizes the architecture further and defines hosts. Hosts are the LLM applications that initiate connections, meaning every host can run multiple clients, as described in the diagram.
MCP architecture components - Anthropic website
I generally have some thoughts on how having multiple clients within one application complicates taking a solution to production and scaling it, but more on that in a future deep-dive post.
Going forward, you can imagine how an MCP server for a database would expose all the database’s functionality through a similar handler. Or, if you want to get more sophisticated, you can define prompt templates dedicated to your service. For a healthcare database, for example, you can have dedicated functions for patient health data. This simplifies the experience and provides prompt guardrails, which may restrict the user somewhat but enables more accurate results from the system. There is much more to learn, and you can dive deeper into MCP on the official website.
A2A - Agent2Agent
Invented by Google. It is related to Google’s ADK (Agent Development Kit), yet it is a distinct component and not part of the ADK package. The idea here is that AI agents communicate, collaborate, and coordinate directly with each other to solve complex tasks across frameworks and vendors, treating each other as opaque agentic applications.
The idea, essentially, is that each agent has an identity card: a metadata file served at /.well-known/agent.json describing its capabilities, endpoint URL, and authentication requirements. Here we again have a dedicated client and server. The client consumes an A2A service by sending requests to an A2A server’s URL, and it is responsible for picking the right agent for a given task based on the agent cards.
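To make the identity-card idea concrete, here is a sketch of an agent serving its card. The field names loosely follow the published A2A agent card, but treat the struct below (and the revenue-forecaster agent) as illustrative, not the authoritative schema:

```go
// Sketch of serving an A2A-style agent card at /.well-known/agent.json.
// Field names loosely follow the A2A agent card; this is illustrative,
// not the authoritative schema.
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

type AgentCard struct {
	Name        string   `json:"name"`
	Description string   `json:"description"`
	URL         string   `json:"url"`    // where the A2A server listens
	Skills      []string `json:"skills"` // what this agent can do
}

func main() {
	card := AgentCard{
		Name:        "revenue-forecaster",
		Description: "Predicts quarterly revenue for a given product.",
		URL:         "https://agents.example.com/a2a",
		Skills:      []string{"forecast-revenue"},
	}
	http.HandleFunc("/.well-known/agent.json", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(card)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```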
The A2A server is the application that exposes the HTTP endpoints, receives requests, and manages task execution.
The communication happens through messages, each carrying a role, and every message can be broken into parts. A part is a unit of content, and parts make up both messages and artifacts (a task’s outputs).
There is also a dedicated streaming capability, used for long-running tasks. A server that supports streaming lets the client subscribe to server-sent events (SSE); each event usually contains a task status update or an artifact update. This is great for the A2A client, which receives real-time updates in an asynchronous, non-blocking way.
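Consuming such a stream is refreshingly low-tech: SSE is just a long-lived HTTP response read line by line. A minimal Go sketch, assuming a hypothetical task-events URL:

```go
// Minimal SSE consumer sketch: subscribe to a (hypothetical) A2A streaming
// endpoint and print each task-status/artifact event as it arrives.
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	// The endpoint URL is an assumption for illustration.
	resp, err := http.Get("https://agents.example.com/a2a/tasks/42/events")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// SSE is a line-oriented format; payload lines start with "data: ".
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if payload, ok := strings.CutPrefix(scanner.Text(), "data: "); ok {
			fmt.Println("event:", payload) // e.g. a task status or artifact update
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}
```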
Here are the core concepts as they are represented in the project’s GitHub repo:
Alright, here's a quick visual of how one healthcare provider's agent communicates with another's.
Credit: Workflow of MCP and A2A in HealthCare Domain by Gourav G.
In this diagram, the communication happens using the A2A protocol between health providers in different regions, over secure Apache Kafka channels. Using Kafka, they can ensure data encryption, authorization (OAuth/JWT), and asynchronous transfer of structured health data.
Here is the full article describing the Healthcare use-case.
Here is the A2A github repo if you’d like to learn more.
ACP - Agent Communication Protocol
The last technology of the day is ACP. Invented by IBM. Yes, the Eye-Bee-M company.
If you know, you know, and if you don't, well now you know ;)
IBM rebus logo by Paul Rand, 1981
ACP is part of a larger open source ecosystem named BeeAI 🐝. Its goal is to eliminate agent vendor lock-in, speed up development, and make it easy to discover community-built agents regardless of the implementation details.
ACP again follows a server and client architecture, where each agent has its own “metadata” defining its role; ACP calls it the Agent Detail.
Here is IBM's... oh, I meant BeeAI's definition:
“In ACP, an agent is a software service that communicates through multimodal messages, primarily driven by natural language. The protocol is agnostic to how agents function internally, specifying only the minimum assumptions necessary for smooth interoperability.”
Notice how they call out that a message can be multimodal. Multimodal in AI refers to the type or format of the data: it doesn’t have to be text; a message can carry an image, video, audio, and so on.
Their execution primitive is the /run endpoint. It represents a single execution of an agent with specific inputs. The execution can happen asynchronously, in a streaming manner that provides flexibility, or in a blocking, synchronous manner.
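As a sketch, kicking off a run could look like the snippet below. The localhost URL, the /runs path, and the request fields are assumptions for illustration; check the ACP docs for the exact request schema.

```go
// Sketch of triggering a single agent run over HTTP. The endpoint path and
// request fields are illustrative assumptions, not the authoritative ACP schema.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Hypothetical run request: ask a locally hosted agent our revenue question.
	body, err := json.Marshal(map[string]any{
		"agent_name": "revenue-forecaster",
		"input":      "Give me a prediction for Q3 revenue for product XYZ",
	})
	if err != nil {
		log.Fatal(err)
	}

	resp, err := http.Post("http://localhost:8000/runs", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out)) // run id, status, and eventually the agent's output
}
```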
Here are the ACP core concepts as they are represented in the GitHub repo:
If you think about it carefully, ACP and A2A are similar in nature. The main difference is that ACP enables communication between agents built on the BeeAI open-source framework, while A2A was built to bridge communication between agents from different frameworks.
Let’s take a deeper look into the BeeAI frameworks to understand the dependencies.
As of now, there are 3 core components:
BeeAI platform - discover, run, and compose AI agents.
BeeAI framework - build agents in Python or TypeScript.
Agent Communication Protocol (ACP) - agent-to-agent communication.
From exploring the open source, it seems ACP is still evolving, introducing discovery, delegation, and multi-agent system (MAS) orchestration features. The orchestration is done by the BeeAI server, a local-first environment that exposes a REST endpoint to external applications and UIs.
Semi-brownie points to BeeAI and ACP: they come with built-in telemetry and traceability tools, meaning you can work with Arize Phoenix. Be aware, though, that Phoenix ships under a restrictive open-source license, the Elastic License 2.0 (ELv2), which means that offering products built on it as SaaS or SaaS-like services requires purchasing a commercial license from the company.
FYI, Arize Phoenix is backed by the commercial company Arize AI. You can learn more about them through their docs.
Okay, that's it for now. We can dive deeper into ACP and the BeeAI framework later.
🏁 Sum it up!
AI Agents are essentially smart applications with autonomy, specific goals and tools.
Both Google and IBM released their open-source agent communication protocols only recently, in response to Anthropic's successful MCP project.
At a high level, each of these agent communication protocols tackles a slightly different challenge:
=> MCP from Anthropic connects agents to tools and data.
=> A2A from Google standardizes agent-to-agent collaboration.
=> ACP from IBM focuses on BeeAI agent collaboration.
And that's a wrap! I hope this exploration of AI enablers, MCP, A2A, and ACP has been informative and valuable.
Perhaps you'd like to dive deeper into one of the specific roles we touched on today? Or maybe you're curious about the broader landscape of AI and its implications for the future of work?
I'm eager to continue this learning journey with you. Feel free to suggest any topics you'd like to explore next, and I'll do my best to provide insightful and thought-provoking content!