Understanding MCP (Model Context Protocol)

Large Language Models are stateless: they have no ability to remember what was asked of them in the past. Every query to an LLM is a fresh, independent query. Yet you will notice that you can chat with a model and it seems to know what you asked it earlier. This works through “context management”, the idea we explore in this article.

Imagine you ask an LLM a question about Santa Fe. The model assumes you mean the city and gives you a lengthy explanation. In your chat window you respond with “No, I meant the car”. The model then apologizes and responds with information about the car. You feel you have been coherently chatting with the model, but what happens behind the scenes is quite different. When you ask your second question, the application takes your previous question and the model’s answer, prepends them to your new query, and sends the whole thing to the model as a distinct, self-contained request. The longer you chat, the longer each request becomes, because it must carry the entire history of the conversation.
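The replay mechanism above can be sketched in a few lines. This is a minimal illustration, not any vendor’s actual API: the function name and the list-of-messages format are assumptions, though most chat APIs use a very similar shape.

```python
# Minimal sketch of how a chat application replays history on every turn.
# The message format here is an illustrative assumption.

def build_request(history, new_question):
    """Append the new question to the full prior transcript."""
    return history + [{"role": "user", "content": new_question}]

history = []

# Turn 1: the model knows nothing yet.
request = build_request(history, "Tell me about Santa Fe.")
answer = "Santa Fe is the capital of New Mexico..."  # model's stateless reply
history = request + [{"role": "assistant", "content": answer}]

# Turn 2: the *entire* transcript is resent, which is the only reason
# "the car" has any context at all.
request = build_request(history, "No, I meant the car.")
print(len(request))  # grows every turn: Q1, A1, Q2 = 3 messages
```

Note that the payload grows linearly with the conversation, which is exactly why long chats become expensive and eventually hit the model’s context window limit.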

Ever call your credit card company? You first navigate the automated voice prompts. Eventually you reach a human, explain the problem, and hear: “I am sorry, this is not my department, but I will redirect you to the right one.” Another person comes on the line and you have to explain the entire problem again. LLMs are exactly like that. Every prompt secretly carries all the previous prompts, so you feel you are having a continuous conversation, but in reality each question is completely independent of the others.

Deciding what history to carry forward, and how, is the problem of “context management”.

MCP is one way to do this. It is an open standard proposed by Anthropic, though many others now contribute to it.

Besides context management, MCP also plays a role in letting LLMs access external tools. For example, when you ask an LLM for the current weather forecast, the LLM itself has no way to call a real-time weather API; LLMs cannot access the internet on their own. Behind the scenes, however, the LLM is given context describing certain tools it may use. The LLM figures out that a weather API might be needed and signals this to the MCP client. The MCP client connects to an MCP server, fetches the real-time weather, and hands the result back to the LLM, which then formats the data and uses it in its output.
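The request-route-respond loop described above can be simulated in plain Python. Everything here is an illustrative assumption: `fake_llm`, `get_weather`, and the city name are invented for the sketch, and a real setup would use an MCP client library talking to an MCP server over JSON-RPC rather than a local function table.

```python
# Simplified simulation of the MCP tool-use loop. All names here are
# hypothetical stand-ins, not the real MCP SDK.

def get_weather(city):
    """Stand-in for an MCP server's weather tool (no real API call)."""
    return {"city": city, "forecast": "sunny", "temp_c": 22}

TOOLS = {"get_weather": get_weather}

def fake_llm(prompt, tools):
    """Stand-in model: it cannot reach the internet, so it *requests* a tool."""
    if "weather" in prompt.lower():
        return {"type": "tool_call", "name": "get_weather",
                "args": {"city": "Pune"}}
    return {"type": "text", "text": "I can answer that directly."}

def run_turn(prompt):
    reply = fake_llm(prompt, TOOLS)
    if reply["type"] == "tool_call":
        # The MCP client, not the model, executes the tool...
        result = TOOLS[reply["name"]](**reply["args"])
        # ...and hands the result back for the model to format.
        return f"Forecast for {result['city']}: {result['forecast']}, {result['temp_c']}°C"
    return reply["text"]

print(run_turn("What is the weather today?"))
```

The key design point survives the simplification: the model only ever emits a structured request; the surrounding client is what actually runs the tool and feeds the result back.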

The LLM thus never executes any API itself; the MCP client and server execute it on the LLM’s behalf and let it use the output.

The following image has been taken from MCP’s official website.

The “Host with MCP client” is the application that acts as a wrapper around the LLM.

MCP is a topic every AI developer needs to know. We have created a video here for you to understand it more deeply.