Llama-Index: RAG with Vector and Summary Indexes Using an Agent
2 min read · Feb 8, 2024
Use the Agent mechanism to let the LLM autonomously decide whether to answer from the document's full content or from its summary.
This setup is inherited from an earlier post, but the combination mode there is replaced with the Agent mechanism: the agent chooses which tool to use on its own, with no human intervention.
- The source document is indexed as usual; in this case, we opt for sentence splitting with a fixed window size (see the setup sketch after this list).
- Two indexes are built over the documents: a summary index and a vector index.
- From each index, we derive a query engine.
- Both query engines are bound to the agent as tools; the agent chooses a tool based on the tool descriptions and the system prompt.
- Queries are then issued against the agent.
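For context, here is a minimal sketch of the setup the snippets below rely on, assuming llama_index 0.9.x; the file name, model, window size, and SIM_TOP_K value are illustrative assumptions, not taken from the original post:

# Assumed setup (llama_index 0.9.x); FILE_NAME, SIM_TOP_K, and the model
# are illustrative placeholders.
from llama_index import ServiceContext, SimpleDirectoryReader, StorageContext
from llama_index.llms import OpenAI
from llama_index.node_parser import SentenceWindowNodeParser

FILE_NAME = "example.pdf"  # the document to index (placeholder)
SIM_TOP_K = 3              # top-k for similarity retrieval (placeholder)

docs = SimpleDirectoryReader(input_files=[FILE_NAME]).load_data()
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)

# Sentence splitting with a fixed window size: each sentence becomes a node,
# with a window of neighboring sentences kept in its metadata.
node_parser = SentenceWindowNodeParser.from_defaults(window_size=3)

service_context = ServiceContext.from_defaults(llm=llm, node_parser=node_parser)
storage_context = StorageContext.from_defaults()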
# Import paths per llama_index 0.9.x; they may differ in other versions.
from llama_index.core.base_query_engine import BaseQueryEngine
from llama_index.indices.document_summary import DocumentSummaryIndexLLMRetriever

# `from_retriever_to_query_engine` is a helper carried over from the earlier
# post (a possible implementation is sketched below); the indexes themselves
# are built in the final section.
summary_query_engine: BaseQueryEngine = from_retriever_to_query_engine(
    service_context=service_context,
    retriever=DocumentSummaryIndexLLMRetriever(
        summary_index,
        similarity_top_k=SIM_TOP_K,
    ),
)
vector_query_engine: BaseQueryEngine = from_retriever_to_query_engine(
    service_context=service_context,
    retriever=vector_index.as_retriever(
        similarity_top_k=SIM_TOP_K,
    ),
)
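The helper itself isn't shown in this excerpt. Here is a minimal sketch of what it might look like, assuming it simply wraps a retriever in the stock RetrieverQueryEngine from llama_index 0.9.x; this is a reconstruction, not the original implementation:

# Assumed implementation of the helper: wrap any retriever in the stock
# RetrieverQueryEngine, letting from_args wire up the default synthesizer.
from llama_index.query_engine import RetrieverQueryEngine

def from_retriever_to_query_engine(service_context, retriever) -> BaseQueryEngine:
    return RetrieverQueryEngine.from_args(
        retriever=retriever,
        service_context=service_context,
    )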
from llama_index.tools import QueryEngineTool, ToolMetadata

# Expose both query engines as named tools; the descriptions are what steer
# the agent's tool choice at query time.
query_engine_tools = [
    QueryEngineTool(
        query_engine=summary_query_engine,
        metadata=ToolMetadata(
            name="summary_tool",
            description=f"Useful for summarization questions on the `{FILE_NAME}` document",
        ),
    ),
    QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(
            name="vector_tool",
            description=f"Useful for questions related to specific facts in the `{FILE_NAME}` document",
        ),
    ),
]
# Import paths per llama_index 0.9.x; BaseAgent's location varies by version.
from llama_index.agent import OpenAIAgent
from llama_index.agent.types import BaseAgent

agent: BaseAgent = OpenAIAgent.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    system_prompt=f"""\
You are a specialized agent designed to answer queries about the `{FILE_NAME}` document.
You must ALWAYS use at least ONE of the tools provided when answering a question; do NOT rely on prior knowledge.\
""",
)
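Querying the agent is then a single call; with verbose=True, the trace shows which tool the agent selected. The questions below are illustrative:

# Illustrative queries; expect summary_tool for the first, vector_tool for the second.
response = agent.chat("Give me a short summary of the document.")
print(response)

response = agent.chat("What does the document say about <some specific fact>?")
print(response)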
The creation of the indexes
from llama_index import DocumentSummaryIndex, VectorStoreIndex
from llama_index.indices.base import BaseIndex

# DocumentSummaryIndex calls the LLM to summarize each document at build time,
# so this is the expensive step; show_progress reports on it.
summary_index: BaseIndex = DocumentSummaryIndex.from_documents(
    docs,
    storage_context=storage_context,
    service_context=service_context,
    show_progress=True,
)
vector_index: BaseIndex = VectorStoreIndex.from_documents(
    docs,
    service_context=service_context,
    storage_context=storage_context,
    show_progress=True,
)
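As a quick sanity check, the per-document summary generated at build time can be inspected directly; the doc-id lookup below is a usage example:

# Look up the LLM-generated summary for the first loaded document.
print(summary_index.get_document_summary(docs[0].doc_id))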