
Agentic RAG: Router Engine


Last updated 1 year ago

You can build an agent on top of an existing LlamaIndex RAG pipeline to strengthen its automated decision-making. Many modules, such as Routing and Query Transformation, are already agentic in nature because they use an LLM to make decisions.

Agentic Strategy

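  • Routing: A router is a module that takes a user query and a set of "choices" (defined by metadata) and returns one or more of those choices.

  • Query Transformations: A query transformation is a module that converts a query into another query. It is a single step: the transformation runs once, before the query is executed against the index.

  • Sub Question Query Engine: It first breaks a complex query down into sub-questions, one for each relevant data source, then collects all the intermediate responses and synthesizes a final response.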
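
To make the routing contract concrete, here is a minimal sketch in plain Python. It is not how LlamaIndex implements routing: the `Choice` class and `route` function are hypothetical names, and simple word overlap stands in for the LLM call a real router would make. It only illustrates the shape of the problem: a query and a list of described choices go in, and one choice comes out.

```python
from dataclasses import dataclass


@dataclass
class Choice:
    """One routing option, described by metadata only (hypothetical type)."""
    name: str
    description: str


def route(query: str, choices: list[Choice]) -> Choice:
    """Return the choice whose description shares the most words with the query.

    Word overlap is a stand-in for the LLM call a real router would make.
    """
    query_words = set(query.lower().split())

    def overlap(choice: Choice) -> int:
        return len(query_words & set(choice.description.lower().split()))

    return max(choices, key=overlap)


choices = [
    Choice("summary", "summarize the whole document"),
    Choice("qa", "answer a specific question about details in the paper"),
]

selected = route("please summarize the document", choices)
print(selected.name)
```

The key design point carries over to the real router: the decision is made from the descriptions alone, so the quality of the metadata determines the quality of the routing.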

Agentic Build Steps

Unlike a standard RAG pipeline, which suits simple queries over a few documents, this agentic approach can adapt based on initial results to improve subsequent data retrieval.

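  1. Build a Router, the simplest form of agentic RAG.

  2. Given a query, the Router picks one of two query engines, Q&A or summarization, and runs the query against a single document.

  3. Add tool calling to the Router agent, using an LLM that can not only select which function to run but also infer the arguments to pass to it.

  4. Build a Research Assistant agent.

  5. Instead of making a single one-shot tool call, the agent can reason over tools across multiple steps.

  6. Build a multi-document agent by extending the research agent to handle multiple documents.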
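
Step 3 above (choosing a function and inferring its arguments) can be sketched without any LLM. In the sketch below, the tools `add` and `word_count`, the `TOOLS` registry, and the JSON shape are all assumptions made for illustration. The hard-coded `fake_llm_output` string stands in for what a tool-calling model would produce.

```python
import json


def add(x: int, y: int) -> int:
    """Hypothetical tool: add two integers."""
    return x + y


def word_count(text: str) -> int:
    """Hypothetical tool: count whitespace-separated words."""
    return len(text.split())


# Registry mapping tool names to callables.
TOOLS = {"add": add, "word_count": word_count}


def execute_tool_call(tool_call_json: str):
    """Run one tool call given as JSON: {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])


# In a real agent an LLM would produce this JSON from the user's request;
# here it is hard-coded so the sketch runs without any API access.
fake_llm_output = '{"name": "add", "arguments": {"x": 2, "y": 3}}'
result = execute_tool_call(fake_llm_output)
print(result)
```

A multi-step agent (step 5) would wrap this dispatch in a loop, feeding each result back to the model until it decides no further tool call is needed.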


LlamaIndex Agentic RAG: Router Engine

Setup Environment

import os
import nest_asyncio
from dotenv import load_dotenv, find_dotenv
# The .env file format is (without the leading "#"):
# API_KEYNAME=AStringThatIsTheLongAPIKeyFromSomeService
def load_env():
    _ = load_dotenv(find_dotenv())

def get_openai_api_key():
    load_env()
    openai_api_key = os.getenv("OPENAI_API_KEY")
    return openai_api_key

OPENAI_API_KEY = get_openai_api_key()
# Patch asyncio so nested event loops work inside a notebook (needed for use_async=True).
nest_asyncio.apply()
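
`os.getenv` returns `None` when a variable is missing, so a misconfigured `.env` file may only surface later as an authentication error. One optional safeguard is a small fail-fast helper. `require_env` is a hypothetical name, not part of python-dotenv, and the demo variable below exists only so the sketch runs anywhere.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell."
        )
    return value


# Demo with a dummy variable so the sketch does not depend on a real key.
os.environ["DEMO_API_KEY"] = "dummy-value"
print(require_env("DEMO_API_KEY"))
```

In the setup code above, `require_env("OPENAI_API_KEY")` could replace the plain `os.getenv` call if you prefer an immediate error.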

Load Data

!mkdir -p dataset
!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O 'dataset/metagpt.pdf'
--2024-05-11 15:01:46--  https://openreview.net/pdf?id=VtmBAGCN7o
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16911937 (16M) [application/pdf]
Saving to: ‘dataset/metagpt.pdf’

dataset/metagpt.pdf 100%[===================>]  16.13M  8.03MB/s    in 2.0s    

2024-05-11 15:01:49 (8.03 MB/s) - ‘dataset/metagpt.pdf’ saved [16911937/16911937]
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(
    input_files=["dataset/metagpt.pdf"]).load_data()

Define LLM & Embedding

from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
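
To give a feel for what `chunk_size` controls, the following simplified sketch packs whole sentences into size-limited chunks. It is not the `SentenceSplitter` implementation. Whitespace-separated words stand in for tokens, overlap between chunks is left out, and `split_into_chunks` is a hypothetical name.

```python
import re


def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Pack whole sentences into chunks of at most `chunk_size` words.

    Words stand in for tokens here. A sentence longer than `chunk_size`
    is kept whole in its own chunk rather than being cut.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and current_len + n_words > chunk_size:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "Agents share messages. Each agent has a role. Roles follow procedures. Output is code."
chunks = split_into_chunks(text, chunk_size=7)
for chunk in chunks:
    print(chunk)
```

Keeping sentences intact means each node reads as coherent text, which tends to help both embedding quality and summarization.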

Define Summary & Vector Index

from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)
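
The two indexes differ mainly in what they retrieve. By default, a summary index hands every node to the response synthesizer, which is consistent with the 34 source nodes reported for the summary query later on this page. A vector index returns only the nodes most similar to the query. The toy sketch below illustrates that difference using bag-of-words counts in place of a real embedding model; all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (not a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_all(nodes: list[str], query: str) -> list[str]:
    """Summary-style retrieval: every node is returned, regardless of the query."""
    return list(nodes)


def retrieve_top_k(nodes: list[str], query: str, k: int = 2) -> list[str]:
    """Vector-style retrieval: only the k nodes most similar to the query."""
    q = embed(query)
    ranked = sorted(nodes, key=lambda n: cosine(embed(n), q), reverse=True)
    return ranked[:k]


nodes = [
    "agents publish messages to a shared pool",
    "the ablation study removes roles one at a time",
    "agents subscribe to messages relevant to their role",
    "experiments use code generation benchmarks",
]
query = "how do agents share messages"
print(len(retrieve_all(nodes, query)))
print(retrieve_top_k(nodes, query, k=2))
```

This is why the summary engine suits whole-document questions while the vector engine suits targeted lookups: the first reads everything, the second reads only what looks relevant.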

Define Query Engines & Set Metadata

summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()
from llama_index.core.tools import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)
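
The summary query engine above uses `response_mode="tree_summarize"`. As a rough mental model, this mode summarizes groups of chunks and then summarizes the summaries until a single answer remains. The sketch below is a simplification under stated assumptions: `fan_in`, `tree_summarize`, and `fake_summarize` are hypothetical. The library packs chunks by prompt size rather than by a fixed group count, and it uses an LLM rather than the stand-in summarizer shown here.

```python
from typing import Callable


def tree_summarize(
    chunks: list[str],
    summarize: Callable[[list[str]], str],
    fan_in: int = 2,
) -> str:
    """Repeatedly summarize groups of `fan_in` texts until one remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    if not chunks:
        return ""
    level = list(chunks)
    while len(level) > 1:
        # Collapse each group of neighbouring texts into one summary.
        level = [
            summarize(level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


def fake_summarize(texts: list[str]) -> str:
    """Stand-in for an LLM call: keep the first word of each text."""
    return " ".join(t.split()[0] for t in texts)


chunks = ["roles are assigned", "messages are shared", "code is executed"]
print(tree_summarize(chunks, fake_summarize))
```

Because the groups within one level are independent of each other, their summaries can be requested concurrently, which is presumably why the engine is created with `use_async=True`.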

Define Router Query Engine

from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)

response = query_engine.query(
    "Could you summarize the document?")
print(str(response))
Selecting query engine 0: Useful for summarization questions related to MetaGPT.
MetaGPT framework introduces a meta-programming approach for multi-agent collaboration using Large Language Models (LLMs) and Standardized Operating Procedures (SOPs). It assigns specific roles to agents, streamlines workflows, and improves communication efficiency. By incorporating role specialization, structured communication, and an executable feedback mechanism, MetaGPT achieves state-of-the-art performance in code generation tasks. The framework's design focuses on enhancing problem-solving capabilities in multi-agent systems, particularly in code generation tasks, by managing roles, workflows, and communication effectively. It also emphasizes the potential of human-inspired techniques for artificial multi-agent systems.
print(len(response.source_nodes))
34
response = query_engine.query(
    "How do agents share information with other agents?"
)
print(str(response))
Selecting query engine 1: This choice is more relevant as it pertains to retrieving specific context from the MetaGPT paper, which would likely contain information on how agents share information..
To share information, agents store it in a global message pool that other agents can access directly. They also use a subscription mechanism to extract relevant information based on role-specific interests, selecting and following only the information they need.

Full Code

from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.core.tools import QueryEngineTool
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector



def get_router_query_engine(file_path: str, llm = None, embed_model = None):
    """Get router query engine."""
    llm = llm or OpenAI(model="gpt-3.5-turbo")
    embed_model = embed_model or OpenAIEmbedding(model="text-embedding-ada-002")
    
    # load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)
    
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes, embed_model=embed_model)
    
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
        llm=llm
    )
    vector_query_engine = vector_index.as_query_engine(llm=llm)
    
    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_query_engine,
        description=(
            "Useful for summarization questions related to MetaGPT"
        ),
    )
    
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_query_engine,
        description=(
            "Useful for retrieving specific context from the MetaGPT paper."
        ),
    )
    
    query_engine = RouterQueryEngine(
        # Pass the same llm to the selector; otherwise it falls back to Settings.llm.
        selector=LLMSingleSelector.from_defaults(llm=llm),
        query_engine_tools=[
            summary_tool,
            vector_tool,
        ],
        verbose=True
    )
    return query_engine

query_engine = get_router_query_engine("dataset/metagpt.pdf")
response = query_engine.query("Could you tell me about the ablation study results?")
print(str(response))
Selecting query engine 1: The question is asking for information about the ablation study results, which is specific context from the MetaGPT paper..
Different roles were analyzed in the ablation study to understand their impact on the final results. The study showed that adding roles beyond just the Engineer consistently improved both revisions and executability. While the addition of more roles slightly increased expenses, it significantly enhanced overall performance, highlighting the effectiveness of incorporating various roles in the framework.
Image source: DeepLearning.AI