Agentic RAG: Tool Calling
What is Tool Calling?
In standard RAG, the LLM is used mainly to synthesize information. Tool Calling, by contrast, adds a query-understanding layer on top of the RAG pipeline, letting users ask complex queries and get more accurate results.
This lets the LLM figure out how to use the vector database rather than merely consuming its output. With Tool Calling, the LLM can interact with the external environment through a dynamic interface.
Tool Calling helps not only to select the appropriate tools but also to infer the arguments needed to execute them. As a result, it understands requests better and produces better responses than standard RAG.
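Conceptually, tool calling boils down to the model emitting a structured decision (a tool name plus arguments) that the runtime dispatches to a real function. Here is a minimal, framework-free sketch of that loop; the model's decision dict is hard-coded for illustration, whereas in a real system it comes from the LLM:

```python
# Minimal tool-calling loop: a registry of callable tools plus a dispatcher.
def add(x: int, y: int) -> int:
    """Add two integers."""
    return x + y

def multiply(x: int, y: int) -> int:
    """Multiply two integers."""
    return x * y

TOOLS = {"add": add, "multiply": multiply}

def dispatch(decision: dict) -> int:
    """Look up the chosen tool and call it with the inferred arguments."""
    fn = TOOLS[decision["name"]]
    return fn(**decision["args"])

# Pretend the LLM chose "add" with x=2, y=9 for the query "what is 2 plus 9?"
result = dispatch({"name": "add", "args": {"x": 2, "y": 9}})
print(result)  # 11
```

The sections below do the same thing with LlamaIndex, where the selection and argument inference are performed by the LLM itself.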
The Tool Calling Procedure
Instead of a one-shot tool call, agents can reason over tools across multiple steps. In this walkthrough we will:
define a simple tool,
define an auto-retrieval tool, and
add a query engine tool and run queries against it.
LlamaIndex Agentic RAG: Tool Calling
Setup Environments
import os
import nest_asyncio
from dotenv import load_dotenv, find_dotenv
# the format for that file is (without the comment) #API_KEYNAME=AStringThatIsTheLongAPIKeyFromSomeService
def load_env():
    _ = load_dotenv(find_dotenv())

def get_openai_api_key():
    load_env()
    openai_api_key = os.getenv("OPENAI_API_KEY")
    return openai_api_key
OPENAI_API_KEY = get_openai_api_key()
nest_asyncio.apply()
Define a Simple Tool
from llama_index.core.tools import FunctionTool
def add(x: int, y: int) -> int:
    """Add two integers."""
    return x + y

def mystery(x: int, y: int) -> int:
    """Mystery function that operates on two numbers."""
    return (x + y) * (x + y)
add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
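FunctionTool.from_defaults derives the tool's schema (name, parameters, description) from the function's signature and docstring, and that schema is what lets the LLM pick a tool and fill in its arguments. A rough, framework-free sketch of that derivation, using only the standard library (this is an illustration of the idea, not LlamaIndex's actual implementation):

```python
import inspect

def mystery(x: int, y: int) -> int:
    """Mystery function that operates on two numbers."""
    return (x + y) * (x + y)

def tool_schema(fn) -> dict:
    """Build a simple tool description from a function's metadata."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: param.annotation.__name__
            for name, param in sig.parameters.items()
        },
    }

schema = tool_schema(mystery)
print(schema)
# {'name': 'mystery', 'description': 'Mystery function that operates on two numbers.',
#  'parameters': {'x': 'int', 'y': 'int'}}
```

This is why the docstrings above matter: they are the only description of each tool the LLM ever sees.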
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [add_tool, mystery_tool],
    "Can you tell me the output of the mystery function on 2 and 9?",
    verbose=True,
)
print(str(response))
=== Calling Function ===
Calling function: mystery with args: {"x": 2, "y": 9}
=== Function Output ===
121
121
Define an Auto-Retrieval Tool
Load Data
!mkdir dataset
!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O 'dataset/metagpt.pdf'
from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader(input_files=["dataset/metagpt.pdf"]).load_data()
from llama_index.core.node_parser import SentenceSplitter
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
print(nodes[0].get_content(metadata_mode="all"))
page_label: 1
file_name: metagpt.pdf
file_path: dataset/metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2024-05-11
last_modified_date: 2024-05-11
Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University of California, Berkeley,8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to more
complex tasks, however, are complicated through logic inconsistencies due to
cascading hallucinations caused by naively chaining LLMs. Here we introduce
MetaGPT, an innovative meta-programming framework incorporating efficient
human workflows into LLM-based multi-agent collaborations. MetaGPT en-
codes Standardized Operating Procedures (SOPs) into prompt sequences for more
streamlined workflows, thus allowing agents with human-like domain expertise
to verify intermediate results and reduce errors. MetaGPT utilizes an assembly
line paradigm to assign diverse roles to various agents, efficiently breaking down
complex tasks into subtasks involving many agents working together. On col-
laborative software engineering benchmarks, MetaGPT generates more coherent
solutions than previous chat-based multi-agent systems. Our project can be found
at https://github.com/geekan/MetaGPT.
1 I NTRODUCTION
Autonomous agents utilizing Large Language Models (LLMs) offer promising opportunities to en-
hance and replicate human workflows. In real-world applications, however, existing systems (Park
et al., 2023; Zhuge et al., 2023; Cai et al., 2023; Wang et al., 2023c; Li et al., 2023; Du et al., 2023;
Liang et al., 2023; Hao et al., 2023) tend to oversimplify the complexities. They struggle to achieve
effective, coherent, and accurate problem-solving processes, particularly when there is a need for
meaningful collaborative interaction (Chen et al., 2024; Zhang et al., 2023; Dong et al., 2023; Zhou
et al., 2023; Qian et al., 2023).
Through extensive collaborative practice, humans have developed widely accepted Standardized
Operating Procedures (SOPs) across various domains (Belbin, 2012; Manifesto, 2001; DeMarco &
Lister, 2013). These SOPs play a critical role in supporting task decomposition and effective coor-
dination. Furthermore, SOPs outline the responsibilities of each team member, while establishing
standards for intermediate outputs. Well-defined SOPs improve the consistent and accurate exe-
cution of tasks that align with defined roles and quality standards (Belbin, 2012; Manifesto, 2001;
DeMarco & Lister, 2013; Wooldridge & Jennings, 1998). For instance, in a software company,
Product Managers analyze competition and user needs to create Product Requirements Documents
(PRDs) using a standardized structure, to guide the developmental process.
Inspired by such ideas, we design a promising GPT -based Meta -Programming framework called
MetaGPT that significantly benefits from SOPs. Unlike other works (Li et al., 2023; Qian et al.,
2023), MetaGPT requires agents to generate structured outputs, such as high-quality requirements
∗These authors contributed equally to this work.
†Chenglin Wu (alexanderwu@fuzhi.ai) is the corresponding author, affiliated with DeepWisdom.
1
from llama_index.core import VectorStoreIndex
vector_index = VectorStoreIndex(nodes)
query_engine = vector_index.as_query_engine(similarity_top_k=2)
from llama_index.core.vector_stores import MetadataFilters
query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"}
        ]
    ),
)
response = query_engine.query(
    "What are the high-level results of MetaGPT?",
)
print(str(response))
MetaGPT's high-level results are that it achieves a new state of the art on code-generation benchmarks with 85.9% and 87.7% Pass@1, and compared with other popular frameworks it handles higher software complexity and offers broader functionality. In experimental evaluations, MetaGPT achieved a 100% task completion rate, demonstrating the robustness and efficiency of its design.
for n in response.source_nodes:
    print(n.metadata)
{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'dataset/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-11', 'last_modified_date': '2024-05-11'}
Define Auto-Retrieval Tool
from typing import List
from llama_index.core.vector_stores import FilterCondition
def vector_query(
    query: str,
    page_numbers: List[str]
) -> str:
    """Perform a vector search over an index.

    query (str): the string query to be embedded.
    page_numbers (List[str]): Filter by a set of pages. Leave BLANK to perform a vector search
        over all pages. Otherwise, filter by the specified pages.
    """
    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]
    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    return response
vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)
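The FilterCondition.OR above means a node passes the filter when its page_label matches any of the requested pages, and an empty page_numbers list (no filters) falls back to searching all pages. A plain-Python sketch of that filtering semantics over toy node metadata (the node dicts are made-up stand-ins, not LlamaIndex objects):

```python
# Toy nodes carrying only the metadata relevant to the filter.
nodes = [
    {"page_label": "1", "text": "intro"},
    {"page_label": "2", "text": "results"},
    {"page_label": "8", "text": "evaluation"},
]

def filter_by_pages(nodes, page_numbers):
    """OR-combine page filters; an empty list means 'all pages'."""
    if not page_numbers:
        return nodes
    wanted = set(page_numbers)
    return [n for n in nodes if n["page_label"] in wanted]

print([n["text"] for n in filter_by_pages(nodes, ["2", "8"])])  # ['results', 'evaluation']
print(len(filter_by_pages(nodes, [])))                          # 3
```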
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool],
    "What are the high-level results of MetaGPT described on page 2?",
    verbose=True,
)
=== Calling Function ===
Calling function: vector_tool with args: {"query": "high-level results of MetaGPT", "page_numbers": ["2"]}
=== Function Output ===
MetaGPT achieves a new state-of-the-art (SoTA) in code generation benchmarks with 85.9% and 87.7% in Pass@1. It stands out in handling higher levels of software complexity and offering extensive functionality, demonstrating robustness and efficiency in design with a 100% task completion rate in experimental evaluations.
for n in response.source_nodes:
print(n.metadata)
{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'dataset/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-11', 'last_modified_date': '2024-05-11'}
Add a Query Engine Tool
QueryEngineTool
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool
summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=(
        "Useful for when you want a summary of MetaGPT."
    ),
)
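With both tools registered, the LLM effectively acts as a router: page-specific questions should go to vector_tool, whole-document questions to summary_tool. A crude heuristic router below imitates that choice; it is only a hand-rolled stand-in for the LLM's decision, not how predict_and_call works internally:

```python
import re

def route(query: str) -> str:
    """Crude stand-in for the LLM's tool choice: queries that mention
    a page number go to the vector tool, everything else to the summary tool."""
    if re.search(r"page\s*\d+", query, flags=re.IGNORECASE):
        return "vector_tool"
    return "summary_tool"

print(route("What are the high-level results described on page 8?"))  # vector_tool
print(route("What is the summary of the paper?"))                     # summary_tool
```

The two calls below show the real LLM making exactly these two routing decisions.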
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    "What are the high-level results of MetaGPT described on page 8?",
    verbose=True,
)
=== Calling Function ===
Calling function: vector_tool with args: {"query": "high-level results of MetaGPT", "page_numbers": ["8"]}
=== Function Output ===
MetaGPT achieves high-level results in software development tasks, outperforming ChatDev in various metrics such as executability, running times, token usage, code statistics, productivity, and human revision cost. It demonstrates strong capabilities in autonomous software generation and showcases the benefits of using Standard Operating Procedures (SOPs) in collaborative environments.
for n in response.source_nodes:
    print(n.metadata)
{'page_label': '8', 'file_name': 'metagpt.pdf', 'file_path': 'dataset/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-11', 'last_modified_date': '2024-05-11'}
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    "What is the summary of the paper?",
    verbose=True,
)
=== Calling Function ===
Calling function: summary_tool with args: {"input": "The summary of the paper"}
=== Function Output ===
The paper introduces MetaGPT, a meta-programming framework that enhances multi-agent collaboration through the use of Standardized Operating Procedures (SOPs) and role specialization. It models agents as a simulated software company, emphasizing workflow management and efficient sharing mechanisms. MetaGPT demonstrates superior performance in code generation quality and problem-solving capabilities, outperforming previous approaches in various benchmarks. The framework leverages human-like SOPs to regulate LLM-based multi-agent systems effectively, showcasing the potential of human-inspired techniques in artificial intelligence. Additionally, MetaGPT enables software development teams to improve over time through active teamwork and self-improvement mechanisms based on recursive concepts. The paper also discusses the performance of different GPT models, ethical concerns, and the benefits of MetaGPT in terms of programming accessibility and transparency.