Llama-Index: Building an Agent from Scratch

TeeTracker
Feb 12, 2024

Using a simple loop, we construct the most basic implementation of an Agent (colab). The purpose of doing this is to open the “black box”, so that every time you use an Agent you have a general idea of what’s going on behind the scenes.

If you’re interested in the implementation of LangChain, you can refer to this article:

Agent logical

Agent loop with Llama-Index

To summarize the differences from LangChain, here are some details of what I have implemented so far. These implementations are not guaranteed to be 100% correct, but they can at least serve as a reference:

LangChain sends a query through a template that includes the input and the chat history; the client fills in both, and the input and history are submitted together every time. On the Llama-Index side they are kept separate: the user’s input is stored in the history, and the feedback from the AI and from the function (tool) is also stored in the history after use. Each query that is submitted contains only the history.

Repeat this process until the model no longer requests the use of a function (tool).
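
As a rough sketch of this loop (assuming a model attribute and the should_continue / run_tool helpers shown later in this post; the names here are illustrative, not a fixed API):

def agent_loop(self, user_input: str) -> str:
    # The user's input is stored in the history first.
    self.chat_history.append(ChatMessage(role=MessageRole.USER, content=user_input))
    while True:
        # Each query sends only the history (plus the OpenAI-format tool schemas).
        chat_response: ChatResponse = self.model.chat(self.chat_history, tools=self.openai_tools)
        ai_message: ChatMessage = chat_response.message
        self.chat_history.append(ai_message)
        # Stop as soon as the model no longer requests a tool.
        if not self.should_continue(chat_response):
            return ai_message.content
        # Otherwise run the requested tool and store its output in the history too.
        tool_call = ai_message.additional_kwargs["tool_calls"][0]
        self.chat_history.append(self.run_tool(tool_call))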

Some differences

Compared to LangChain, Llama-Index lacks support for some basic “classes”.

  • There is no AgentActionMessageLog or OpenAIFunctionsAgentOutputParser. To “unpack” the message returned by the model, we check whether “tool_calls” exists in it; this means the agent’s stop signal is not an AgentFinish.
# A response that contains tool calls:
ChatResponse(
│ message=ChatMessage(
│ │ role=<MessageRole.ASSISTANT: 'assistant'>,
│ │ content=None,
│ │ additional_kwargs={
│ │ │ 'tool_calls': [
│ │ │ │ ChatCompletionMessageToolCall(
│ │ │ │ │ id='call_2dhMLNGhW8EFYxKoUyNRzefl',
│ │ │ │ │ function=Function(arguments='{"a":12,"b":34}', name='int_mult'),
│ │ │ │ │ type='function'
│ │ │ │ )
│ │ │ ]
│ │ }
│ ),.....
)
# A response that does not contain tool calls:
ChatResponse(
│ message=ChatMessage(
│ │ role=<MessageRole.ASSISTANT: 'assistant'>,
│ │ content='Hello! How can I assist you today?',
│ │ additional_kwargs={}
│ ),.....
)
# Use a helper to check whether the response contains tool calls.
def should_continue(self, chat_response: ChatResponse) -> bool:
    return chat_response.message.additional_kwargs.get("tool_calls", None) is not None

When “tool_calls” is not present, the Agent stops.

  • There is no format_to_openai_functions in Llama-Index. Instead, messages from different roles are enumerated with MessageRole, such as user, assistant, tool, and others, so the chat history is simply an array of ChatMessage.
    It’s worth noting that whether it’s a message sent by a human to the model, a message returned from the model, or the output of a local function that has completed its mission, all of these must become ChatMessages.
    The only thing ever sent to the model is the chat history (see the sketch after the snippet below).

def run_tool(self, tool_call: ChatCompletionMessageToolCall) -> ChatMessage:
    logger.info("Running tool")
    func_id: str = tool_call.id
    func_name: str = tool_call.function.name
    args_json: str = tool_call.function.arguments

    # Look up the FunctionTool by name and call it with the JSON-decoded arguments.
    func: FunctionTool = self.func_tools_dict.get(func_name)
    res: ToolOutput = func(**json.loads(args_json))
    pprint(res)
    # Wrap the tool output as a ChatMessage so it can be stored in the history.
    return ChatMessage(
        role=MessageRole.TOOL,
        content=str(res),
        name=func_name,
        additional_kwargs={
            "tool_call_id": func_id,
            "name": func_name,
        },
    )
.....
chat_response: ChatResponse = self.model.chat(messages, tools=self.openai_tools)
ai_message: ChatMessage = chat_response.message
func_message = self.run_tool(ai_message.additional_kwargs["tool_calls"][0])
.....

This code snippet mainly parses the ChatCompletionMessageToolCall that the model returns. The model includes this message in its response when “tool_calls” exists; there may be multiple tool calls, but usually the first one is used.
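
For illustration, here is a minimal sketch of how such a history could be assembled (the import path matches the pre-0.10 llama_index used in this post; chat_history is a hypothetical name):

from llama_index.llms import ChatMessage, MessageRole

# Everything becomes a ChatMessage in one flat history: the human input,
# the model's replies, and the tool outputs alike.
chat_history = [
    ChatMessage(role=MessageRole.USER, content="What is 12 * 34?"),
]
# After the model answers, append chat_response.message; after a tool runs,
# append the ChatMessage built by run_tool above.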

  • There’s no @tool decorator, but we can use FunctionTool.from_defaults() instead. In Llama-Index this method turns a plain Python function into a function tool (FunctionTool).
def int_mult(a: int, b: int) -> int:
    """Apply a * b and returns the result as int"""
    res = a * b
    logger.debug(f"Apply {a} * {b} and returns {res}")
    return res

def int_add(a: int, b: int) -> int:
    """Apply a + b and returns the result as int"""
    res = a + b
    logger.debug(f"Apply {a} + {b} and returns {res}")
    return res

int_mult_tool, int_add_tool = FunctionTool.from_defaults(fn=int_mult), FunctionTool.from_defaults(fn=int_add)
int_tools: Sequence[BaseTool] = [int_mult_tool, int_add_tool]

pprint(int_tools)

Output:
[
│ <llama_index.tools.function_tool.FunctionTool object at 0x7ebca56b0220>,
│ <llama_index.tools.function_tool.FunctionTool object at 0x7ebca5667d90>
]
  • We can call a FunctionTool directly, which is quite similar to LangChain.
int_mult_tool(**{"a":12,"b":34})

Output:

ToolOutput(content='408', tool_name='int_mult', raw_input={'args': (), 'kwargs': {'a': 12, 'b': 34}}, raw_output=408)
  • Similar to LangChain’s convert_to_openai_function, we can use tool.metadata.to_openai_tool() to convert a Llama-Index FunctionTool into the OpenAI format when querying the model.
int_mult_tool, int_add_tool = FunctionTool.from_defaults(fn=int_mult), FunctionTool.from_defaults(fn=int_add)
int_tools: Sequence[BaseTool] = [int_mult_tool, int_add_tool]

........

openai_tools: List[Dict[str, Any]] = [
    tool.metadata.to_openai_tool() for tool in int_tools
]
pprint(openai_tools)

Output:
[
│ {
│ │ 'type': 'function',
│ │ 'function': {
│ │ │ 'name': 'int_mult',
│ │ │ 'description': 'int_mult(a: int, b: int) -> int\nApply a * b and returns the result as int',
│ │ │ 'parameters': {
│ │ │ │ 'type': 'object',
│ │ │ │ 'properties': {'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}},
│ │ │ │ 'required': ['a', 'b']
│ │ │ }
│ │ }
│ },
│ {
│ │ 'type': 'function',
│ │ 'function': {
│ │ │ 'name': 'int_add',
│ │ │ 'description': 'int_add(a: int, b: int) -> int\nApply a + b and returns the result as int',
│ │ │ 'parameters': {
│ │ │ │ 'type': 'object',
│ │ │ │ 'properties': {'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}},
│ │ │ │ 'required': ['a', 'b']
│ │ │ }
│ │ }
│ }
]
  • Unlike LangChain’s AgentActionMessageLog, there is no executable field on the ChatMessage or ChatCompletionMessageToolCall that the model returns, so we cannot invoke the FunctionTool from the message itself. Instead, we define a Dict whose key is the name of the function and whose value is the function itself. By matching the tool name we locate the function to call, then pass in the arguments from the message after converting them from JSON to a Dict (a sketch of this lookup table follows the snippet below).
func_id: str = tool_call.id
func_name: str = tool_call.function.name
args_json: str = tool_call.function.arguments

func: FunctionTool = self.func_tools_dict.get(func_name)
res: ToolOutput = func(**json.loads(args_json))
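
The lookup table itself can be built from each tool’s metadata, for example (a sketch; func_tools_dict is the attribute name used in the snippets above, and int_tools comes from the earlier example):

# Map each tool's registered name to the FunctionTool itself.
self.func_tools_dict: Dict[str, FunctionTool] = {
    tool.metadata.name: tool for tool in int_tools
}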

For more details, check the colab.

OpenAIAgent

Llama-Index provides a built-in Agent that calls OpenAI directly and returns the result as an AgentChatResponse, which contains the outputs (ToolOutput) of all the tools that were run.

from llama_index.agent import OpenAIAgent

func_tools: Sequence[BaseTool] = list(map(lambda f: FunctionTool.from_defaults(fn=f), [random_joke, translate]))
openai_agent = OpenAIAgent.from_tools(
    tools=func_tools,
    llm=model,
    verbose=True,
)
res = openai_agent.chat(human_input)
print("\nFinal result")
pprint(res)

Output:

AgentChatResponse(
│ response='为什么稻草人赢得了奖项?因为他在他的领域非常出色!',
│ sources=[
│ │ ToolOutput(
│ │ │ content="Sure, I'll give it a try! Here's a random joke for you:\n\nWhy did the scarecrow win an award?\n\nBecause he was outstanding in his field!",
│ │ │ tool_name='random_joke',
│ │ │ raw_input={'args': (), 'kwargs': {}},
│ │ │ raw_output="Sure, I'll give it a try! Here's a random joke for you:\n\nWhy did the scarecrow win an award?\n\nBecause he was outstanding in his field!"
│ │ ),
│ │ ToolOutput(
│ │ │ content='为什么稻草人赢得了奖项?因为他在他的领域非常出色!',
│ │ │ tool_name='translate',
│ │ │ raw_input={
│ │ │ │ 'args': (),
│ │ │ │ 'kwargs': {
│ │ │ │ │ 'string': 'Why did the scarecrow win an award? Because he was outstanding in his field!'
│ │ │ │ }
│ │ │ },
│ │ │ raw_output='为什么稻草人赢得了奖项?因为他在他的领域非常出色!'
│ │ )
│ ],
│ source_nodes=[]
)

Code

Colab

P.S

I feel like manually implementing an Agent once helps me learn the basics of frameworks such as LangChain and Llama-Index, and strengthens my understanding of the principles behind an Agent’s operation. I recommend this kind of learning and growth approach.
