Agentic RAG: Router Engine

You can build agents on top of an existing LlamaIndex RAG pipeline to strengthen its automated decision-making. Many modules, such as Routing and Query Transformation, are already agentic in nature because they use an LLM to make decisions.

Agentic Strategy

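Routing: A router is a module that takes a user query and a set of "choices" (defined by metadata) and returns one or more of those choices.
Query Transformations: A query transformation is a module that converts a query into a different query. It is a single step: the transformation runs once, before the query is executed against an index.
Sub Question Query Engine: First breaks a complex query down into sub-questions for each relevant data source, then collects all intermediate responses and synthesizes a final response.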
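
To make the routing idea concrete, here is a minimal sketch in plain Python. It is not LlamaIndex code: the `Choice` class, the `route` function, and the word-overlap score are hypothetical stand-ins for the LLM-based selection that a real router performs.

```python
from dataclasses import dataclass


@dataclass
class Choice:
    """A routing option described only by its metadata (hypothetical type)."""
    name: str
    description: str


def route(query: str, choices: list[Choice], max_choices: int = 1) -> list[Choice]:
    """Return up to `max_choices` choices whose description best matches the query.

    A real router asks an LLM to pick; word overlap is used here only so the
    sketch runs without any model or API key.
    """
    query_words = set(query.lower().split())

    def overlap(choice: Choice) -> int:
        return len(query_words & set(choice.description.lower().split()))

    ranked = sorted(choices, key=overlap, reverse=True)
    return ranked[:max_choices]


choices = [
    Choice("summary", "summarize the whole document"),
    Choice("qa", "answer a specific question about a detail"),
]
selected = route("please summarize the document", choices)
print([c.name for c in selected])
```

Raising `max_choices` mirrors the "one or more choices" part of the definition above.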

Steps for Building Agentic RAG

A standard RAG pipeline suits simple queries over a few documents. This agentic approach, by contrast, can adapt based on initial results to improve subsequent data retrieval.

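Build a Router, the simplest form of agentic RAG
Given a query, the Router picks one of two query engines, Q&A or summarization, and runs the query over a single document
Add tool calling to the Router agent, using an LLM that can not only choose which function to run but also infer the arguments to pass to it
Build a Research Assistant agent
Instead of making a one-shot tool call, the agent can reason over tools across multiple steps
Build a multi-document agent, learning how to extend the research agent to handle multiple documents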
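
The tool-calling step in the list above (choosing a function and inferring its arguments) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not LlamaIndex code: the tools `summarize` and `lookup_page` are hypothetical, and the rule-based `decide` function stands in for the LLM.

```python
import re
from typing import Callable


def summarize(document: str) -> str:
    """Hypothetical tool: return a short preview of the whole document."""
    return document[:40]


def lookup_page(document: str, page: int) -> str:
    """Hypothetical tool: return one 'page' (here, one line) of the document."""
    return document.splitlines()[page - 1]


TOOLS: dict[str, Callable[..., str]] = {
    "summarize": summarize,
    "lookup_page": lookup_page,
}


def decide(query: str) -> tuple[str, dict]:
    """Stand-in for the LLM: choose a tool name and infer its arguments."""
    match = re.search(r"page (\d+)", query.lower())
    if match:
        return "lookup_page", {"page": int(match.group(1))}
    return "summarize", {}


def call_tool(query: str, document: str) -> str:
    """Run the chosen tool with the inferred arguments."""
    tool_name, arguments = decide(query)
    return TOOLS[tool_name](document, **arguments)


document = "Intro line\nMethod line\nResults line"
print(call_tool("What is on page 2?", document))
print(call_tool("Give me an overview", document))
```

A plain router only picks among fixed options. Here the decision also yields arguments (the page number), which is the extra capability tool calling adds.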

LlamaIndex Agentic RAG: Router Engine

Setup Environment

import os
import nest_asyncio
from dotenv import load_dotenv, find_dotenv
# The format of the .env file is (without the comment marker):
# API_KEYNAME=AStringThatIsTheLongAPIKeyFromSomeService
def load_env():
    _ = load_dotenv(find_dotenv())

def get_openai_api_key():
    load_env()
    openai_api_key = os.getenv("OPENAI_API_KEY")
    return openai_api_key

OPENAI_API_KEY = get_openai_api_key()
nest_asyncio.apply()
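
Because `os.getenv` returns `None` when a variable is not set, `get_openai_api_key()` can silently return `None` if the `.env` file is missing or incomplete. A fail-fast check like the sketch below can surface that early. The helper name `require_env` and the variable `EXAMPLE_API_KEY` are hypothetical and used only for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment.")
    return value


# Example with a hypothetical variable name so the sketch runs anywhere.
os.environ["EXAMPLE_API_KEY"] = "dummy-value"
print(require_env("EXAMPLE_API_KEY"))
```

In the setup above, the same check could be applied to `"OPENAI_API_KEY"` after `load_env()` has run.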

Load Data

!mkdir dataset
!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O 'dataset/metagpt.pdf'

mkdir: cannot create directory ‘dataset’: File exists
--2024-05-11 15:01:46--  https://openreview.net/pdf?id=VtmBAGCN7o
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16911937 (16M) [application/pdf]
Saving to: ‘dataset/metagpt.pdf’

dataset/metagpt.pdf 100%[===================>]  16.13M  8.03MB/s    in 2.0s    

2024-05-11 15:01:49 (8.03 MB/s) - ‘dataset/metagpt.pdf’ saved [16911937/16911937]
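
The log above shows `mkdir` reporting an error when the `dataset` directory already exists. If you prefer to stay in Python instead of shell commands, a standard-library sketch like the following creates the directory idempotently before downloading. The helper name `download` is an assumption made for this example.

```python
import os
import urllib.request


def download(url: str, destination: str) -> str:
    """Download `url` to `destination`, creating the parent directory if needed."""
    parent = os.path.dirname(destination)
    if parent:
        # exist_ok=True avoids the "File exists" error on repeated runs.
        os.makedirs(parent, exist_ok=True)
    urllib.request.urlretrieve(url, destination)
    return destination


# Usage (requires network access):
# download("https://openreview.net/pdf?id=VtmBAGCN7o", "dataset/metagpt.pdf")
```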

from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(
    input_files=["dataset/metagpt.pdf"]).load_data()

Define LLM & Embedding

from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
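
As a rough intuition for what sentence-aware chunking does, the sketch below greedily packs whole sentences into chunks under a size budget. It is a simplified illustration, not the `SentenceSplitter` implementation: it counts whitespace-separated words rather than tokens, uses a naive regex to find sentence boundaries, and applies no overlap between chunks. The function name `chunk_sentences` is hypothetical.

```python
import re


def chunk_sentences(text: str, chunk_size: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `chunk_size` words.

    A single sentence longer than `chunk_size` becomes its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n_words > chunk_size:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "One two three. Four five. Six seven eight nine. Ten."
for chunk in chunk_sentences(text, chunk_size=5):
    print(repr(chunk))
```

Keeping sentences intact means each node carries complete statements, which tends to help both retrieval and summarization.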

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

Define Summary & Vector Index

from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)
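
The two indexes differ mainly in what they retrieve at query time. By default, a summary index passes every node to the response synthesizer, while a vector index returns only the nodes most similar to the query. The toy sketch below contrasts the two behaviors in plain Python. The bag-of-words `embed` function is an assumption standing in for a learned embedding model, and none of these helper names come from LlamaIndex.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (a real index uses a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_all(nodes: list[str], query: str) -> list[str]:
    """Summary-style retrieval: every node is passed on, regardless of the query."""
    return list(nodes)


def retrieve_top_k(nodes: list[str], query: str, k: int = 1) -> list[str]:
    """Vector-style retrieval: only the k nodes most similar to the query."""
    query_vec = embed(query)
    return sorted(nodes, key=lambda n: cosine(embed(n), query_vec), reverse=True)[:k]


nodes = [
    "agents publish messages to a shared pool",
    "the ablation study removes roles one by one",
    "experiments use code generation benchmarks",
]
print(len(retrieve_all(nodes, "ablation study")))
print(retrieve_top_k(nodes, "ablation study", k=1))
```

This is why the summary index suits whole-document questions and the vector index suits targeted lookups.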

Define Query Engines & Set Metadata

summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

from llama_index.core.tools import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

Define Router Query Engine

from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)
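
Conceptually, a single-choice selector shows the LLM the tool descriptions as numbered options and reads back the index it picks. The sketch below illustrates that round trip. The prompt wording and the helper names `build_selection_prompt` and `parse_selection` are hypothetical and do not reproduce the prompt or parser that `LLMSingleSelector` actually uses.

```python
def build_selection_prompt(query: str, descriptions: list[str]) -> str:
    """Format tool descriptions as a numbered list for a selector LLM (hypothetical prompt)."""
    numbered = "\n".join(f"({i}) {d}" for i, d in enumerate(descriptions))
    return (
        "Choose the single most relevant option for the question.\n"
        f"Options:\n{numbered}\n"
        f"Question: {query}\n"
        "Answer with the option number only."
    )


def parse_selection(reply: str, num_choices: int) -> int:
    """Turn the model's reply into a valid index, raising on anything else."""
    index = int(reply.strip())
    if not 0 <= index < num_choices:
        raise ValueError(f"Selector returned out-of-range index {index}")
    return index


descriptions = [
    "Useful for summarization questions related to MetaGPT",
    "Useful for retrieving specific context from the MetaGPT paper.",
]
print(build_selection_prompt("How do agents share information?", descriptions))
print(parse_selection(" 1 ", len(descriptions)))
```

Since the descriptions are the only information the selector sees about each tool, writing them clearly and distinctly is what makes routing reliable.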

response = query_engine.query(
    "Can you summarize the document?")
print(str(response))

Selecting query engine 0: Useful for summarization questions related to MetaGPT.
MetaGPT framework introduces a meta-programming approach for multi-agent collaboration using Large Language Models (LLMs) and Standardized Operating Procedures (SOPs). It assigns specific roles to agents, streamlines workflows, and improves communication efficiency. By incorporating role specialization, structured communication, and an executable feedback mechanism, MetaGPT achieves state-of-the-art performance in code generation tasks. The framework's design focuses on enhancing problem-solving capabilities in multi-agent systems, particularly in code generation tasks, by managing roles, workflows, and communication effectively. It also emphasizes the potential of human-inspired techniques for artificial multi-agent systems.

print(len(response.source_nodes))

response = query_engine.query(
    "How does an agent share information with other agents?"
)
print(str(response))

Selecting query engine 1: This choice is more relevant as it pertains to retrieving specific context from the MetaGPT paper, which would likely contain information on how agents share information..
Agents share information by storing it in a global message pool that other agents can access directly. They also use a subscription mechanism to extract relevant information based on role-specific interests and select the information they need to follow.

Full Code

from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.core.tools import QueryEngineTool
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector



def get_router_query_engine(file_path: str, llm = None, embed_model = None):
    """Get router query engine."""
    llm = llm or OpenAI(model="gpt-3.5-turbo")
    embed_model = embed_model or OpenAIEmbedding(model="text-embedding-ada-002")
    
    # load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)
    
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes, embed_model=embed_model)
    
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
        llm=llm
    )
    vector_query_engine = vector_index.as_query_engine(llm=llm)
    
    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_query_engine,
        description=(
            "Useful for summarization questions related to MetaGPT"
        ),
    )
    
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_query_engine,
        description=(
            "Useful for retrieving specific context from the MetaGPT paper."
        ),
    )
    
    query_engine = RouterQueryEngine(
        # Pass the same LLM to the selector so routing does not fall back to Settings.llm.
        selector=LLMSingleSelector.from_defaults(llm=llm),
        query_engine_tools=[
            summary_tool,
            vector_tool,
        ],
        verbose=True
    )
    return query_engine

query_engine = get_router_query_engine("dataset/metagpt.pdf")

response = query_engine.query("Can you tell me about the results of the ablation study?")
print(str(response))

Selecting query engine 1: The question is asking for information about the ablation study results, which is specific context from the MetaGPT paper..
Different roles were analyzed in the ablation study to understand their impact on the final results. The study showed that adding roles beyond just the Engineer consistently improved both revisions and executability. While the addition of more roles slightly increased expenses, it significantly enhanced overall performance, highlighting the effectiveness of incorporating various roles in the framework.