LangChain with RAGAS

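RAGAS is a library that provides benchmarks and evaluation metrics for the generation and retrieval components of LLM applications.

For background, see the RAG Evaluation with RAGAS page.

We will evaluate a RAG pipeline built with LangChain LCEL against the RAGAS metrics and then visualize the results. The steps are as follows.

Environment setup
Dataset download
Testset Generation
RAG pipeline
RAGAS Evaluation
Result visualization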

LangChain: Evaluate with ragas

from langchain_community.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
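os.getenv returns None when a variable is missing, so a forgotten key only surfaces later as an authentication error from the API client. A minimal sketch of failing early instead; require_env is a hypothetical helper, not part of python-dotenv or LangChain, and DEMO_API_KEY is a placeholder name used so the sketch runs without a real key.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value


# Placeholder variable so the demo does not depend on a real secret.
os.environ["DEMO_API_KEY"] = "sk-placeholder"
print(require_env("DEMO_API_KEY"))
```

In the notebook, the same check could be applied to "OPENAI_API_KEY" right after load_dotenv().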

Download & Load Dataset

!mkdir data
# Fetch the raw files (a /blob/ URL returns the HTML viewer page) and write each one with -O.
!wget https://github.com/Coding-Crashkurse/Udemy-Advanced-LangChain/raw/main/data/food.txt -O ./data/food.txt
!wget https://github.com/Coding-Crashkurse/Udemy-Advanced-LangChain/raw/main/data/founder.txt -O ./data/founder.txt
!wget https://github.com/Coding-Crashkurse/Udemy-Advanced-LangChain/raw/main/data/restaurant.txt -O ./data/restaurant.txt

mkdir: cannot create directory ‘data’: File exists
Downloaded: 1 files, 149K in 0.01s (14.6 MB/s)
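A github.com URL containing /blob/ points at the HTML file viewer rather than the file itself, so downloading it saves a web page instead of the text. The sketch below converts such a URL to its raw.githubusercontent.com form. github_blob_to_raw is a hypothetical helper, and it assumes the branch name contains no slash.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url: str) -> str:
    """Convert a github.com '/blob/' page URL into its raw-content URL."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    owner, repo, _, branch, *path = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, branch, *path])


base = "https://github.com/Coding-Crashkurse/Udemy-Advanced-LangChain/blob/main/data/"
for name in ("food.txt", "founder.txt", "restaurant.txt"):
    print(github_blob_to_raw(base + name))
```

The printed URLs can be passed to any downloader; checking that the saved files start with plain text rather than an HTML tag is a quick sanity test.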

%pip install unstructured

loader = DirectoryLoader("./data", glob="**/*.txt")
docs = loader.load()

Chunking

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=350,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)
chunks = text_splitter.split_documents(docs)
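To make chunk_size and chunk_overlap concrete, here is a simplified fixed-width sliding window in plain Python. It is not the algorithm RecursiveCharacterTextSplitter uses (that splitter first tries to break on separators such as paragraphs and lines, so its chunks are usually shorter and end at natural boundaries); sliding_window_chunks is a hypothetical helper for illustration only.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(1000))  # 1000 characters of dummy text
pieces = sliding_window_chunks(sample, chunk_size=350, chunk_overlap=20)
print([len(p) for p in pieces])
```

The 20 shared characters give each chunk a little context from its neighbor, which helps when a sentence straddles a boundary.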

To use ragas, set file_name as a key in each chunk's metadata dictionary.

for document in chunks:
    document.metadata["file_name"] = document.metadata["source"]

Embedding & Testset Generation

%pip install ragas

from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI

embeddings = OpenAIEmbeddings()
model = ChatOpenAI()

generator = TestsetGenerator.from_langchain(
    embeddings=embeddings, generator_llm=model, critic_llm=model
)

testset = generator.generate_with_langchain_docs(
    chunks,
    test_size=8,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
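The distributions argument gives the share of each question type, so with test_size=8 the requested mix works out to 4 simple, 2 reasoning, and 2 multi_context questions. The sketch below shows that arithmetic with a largest-remainder rounding rule. allocate_counts is a hypothetical helper, string keys stand in for the evolution objects, and this is not a description of how ragas allocates questions internally.

```python
def allocate_counts(test_size: int, distributions: dict[str, float]) -> dict[str, int]:
    """Turn fractional shares into whole question counts that sum to test_size."""
    if abs(sum(distributions.values()) - 1.0) > 1e-9:
        raise ValueError("distribution shares must sum to 1.0")
    raw = {name: test_size * share for name, share in distributions.items()}
    counts = {name: int(value) for name, value in raw.items()}  # round down first
    leftover = test_size - sum(counts.values())
    # Hand the remaining questions to the types with the largest fractional part.
    by_fraction = sorted(raw, key=lambda name: raw[name] - counts[name], reverse=True)
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


mix = {"simple": 0.5, "reasoning": 0.25, "multi_context": 0.25}
print(allocate_counts(8, mix))
```

The number of rows actually returned can be smaller than test_size; the output shown in this page has six rows.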

Let's display the testset as a pandas DataFrame.

testset.to_pandas()

| question | contexts | ground_truth | evolution_type | metadata | episode_done |
| --- | --- | --- | --- | --- | --- |
| What type of wine is Chianti and where is it f... | [Osso Buco; $20; Braised veal shanks with vege... | Chianti is a dry red wine from Tuscany, Italy. | simple | [{'source': 'data/food.txt', 'file_name': 'dat... | True |
| How does the squid ink harmonize with the crea... | [Elena was led to a table adorned with a simpl... | The squid ink harmonizes with the creamy rice ... | simple | [{'source': 'data/restaurant.txt', 'file_name'... | True |
| What is the philosophy of hospitality and its ... | [Philosophy of Hospitality] | The philosophy of hospitality emphasizes the w... | simple | [{'source': 'data/founder.txt', 'file_name': '... | True |
| How did Amico's exploration of Italy contribut... | [As he grew, so did his desire to explore beyo... | Amico's exploration of Italy allowed him to wo... | simple | [{'source': 'data/founder.txt', 'file_name': '... | True |
| Which dish costs $7 and includes small toasted... | [Carpaccio; $15; Thinly sliced raw beef with a... | Crostini | reasoning | [{'source': 'data/food.txt', 'file_name': 'dat... | True |
| How does Chef Amico's commitment reflect his d... | [Continuing the Legacy\n\nToday, Chef Amico st... | Chef Amico's commitment is reflected in his de... | multi_context | [{'source': 'data/founder.txt', 'file_name': '... | True |

Chroma Vector Store

from langchain_openai.embeddings import OpenAIEmbeddings

from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI

embedding = OpenAIEmbeddings()
model = ChatOpenAI()

vectorstore = Chroma.from_documents(chunks, embedding)
retriever = vectorstore.as_retriever()
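Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below shows that idea with hand-made two-dimensional vectors and cosine similarity. The vectors, the top_k helper, and the choice of cosine similarity are all assumptions for illustration; the distance function the vector store actually uses is a configuration detail not shown here.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 4) -> list[str]:
    """Return the texts of the k indexed entries most similar to the query vector."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy index: (chunk text, made-up embedding).
index = [
    ("wine list", [1.0, 0.0]),
    ("pasta dishes", [0.0, 1.0]),
    ("wine and pasta pairing", [1.0, 1.0]),
]
print(top_k([1.0, 0.1], index, k=2))
```

Real embeddings have hundreds or thousands of dimensions, but the ranking step works the same way.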

RAG Prompt & Chains

from langchain_core.prompts import PromptTemplate

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = PromptTemplate(
    template=template, input_variables=["context", "question"]
)

# Import from langchain_core, matching the PromptTemplate import above.
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
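The chain maps one incoming question to two prompt variables: the retriever output fills {context}, and the question itself passes through unchanged to {question}. The sketch below reproduces that data flow in plain Python with a stubbed retriever. fake_retriever, format_docs, and build_prompt are hypothetical stand-ins; joining chunk texts with blank lines is a common convention and an assumption here, since the chain above hands the retrieved Document list to the prompt as-is.

```python
TEMPLATE = """Answer the question based only on the following context:
{context}

Question: {question}
"""

# A few menu lines in the style of the food.txt chunks.
CHUNKS = [
    "Chianti; $10; Dry red wine from Tuscany; Drink",
    "Prosecco; $8; Italian sparkling white wine; Drink",
    "Ravioli; $13; Stuffed pasta with cheese or meat filling; Main Dish",
]


def fake_retriever(question: str, k: int = 2) -> list[str]:
    """Rank chunks by shared words with the question (a stand-in for vector search)."""
    words = set(question.lower().replace("?", "").split())

    def overlap(chunk: str) -> int:
        return len(words & set(chunk.lower().replace(";", "").split()))

    return sorted(CHUNKS, key=overlap, reverse=True)[:k]


def format_docs(page_contents: list[str]) -> str:
    """Join chunk texts into one context string."""
    return "\n\n".join(page_contents)


def build_prompt(question: str) -> str:
    """Fill both template variables, mirroring the dict step of the chain."""
    return TEMPLATE.format(context=format_docs(fake_retriever(question)), question=question)


print(build_prompt("What type of wine is Chianti and where is it from?"))
```

Printing the assembled prompt like this is a cheap way to confirm what the model actually receives before spending tokens on evaluation.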

Question & Ground_truth

questions = testset.to_pandas()["question"].to_list()
ground_truth = testset.to_pandas()["ground_truth"].to_list()

#import pandas as pd

#df = pd.read_csv("./questions_answers/qa.csv", delimiter=";")
#questions = df["question"].tolist()
#ground_truth = df["ground_truth"].tolist()

ground_truth

['Chianti is a dry red wine from Tuscany, Italy.',
 "The squid ink harmonizes with the creamy rice by adding rich flavors and telling the tale of Sicily's love affair with the sea.",
 'The philosophy of hospitality emphasizes the welcoming and respectful treatment of guests or strangers. It involves creating a space where individuals feel valued, comfortable, and accepted. This philosophy is significant in human interactions as it promotes empathy, understanding, and connection between people, fostering a sense of community and mutual respect.',
 "Amico's exploration of Italy allowed him to work alongside renowned chefs in different regions, where he learned diverse regional flavors, techniques, and traditions of Italian cuisine. From the rolling hills of Tuscany to the romantic canals of Venice, each experience added to his knowledge and influenced his culinary skills.",
 'Crostini',
 "Chef Amico's commitment is reflected in his dedication to Sicilian flavors and traditions through his mentorship of young chefs, sharing knowledge at culinary workshops, and supporting local farmers and producers. His spirit of generosity and passion for food extends beyond the restaurant's walls, showcasing his deep connection to Sicilian culinary heritage."]

from datasets import Dataset

data = {"question": [], "answer": [], "contexts": [], "ground_truth": ground_truth}

for query in questions:
    data["question"].append(query)
    data["answer"].append(rag_chain.invoke(query))
    data["contexts"].append(
        [doc.page_content for doc in retriever.get_relevant_documents(query)]
    )

dataset = Dataset.from_dict(data)
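Every column in the dictionary must have one entry per question, and to my knowledge Dataset.from_dict rejects columns of different lengths. Because the answers and contexts are filled in a loop while ground_truth is assigned up front, a quick consistency check before building the dataset can save a confusing error. check_columns is a hypothetical helper, and the one-row sample data is illustrative.

```python
def check_columns(data: dict[str, list]) -> int:
    """Verify all columns have the same length and return that length."""
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"column lengths differ: {lengths}")
    # Each contexts entry should be a list of chunk strings, not a single string.
    for entry in data.get("contexts", []):
        if not isinstance(entry, list) or not all(isinstance(c, str) for c in entry):
            raise TypeError("each 'contexts' entry must be a list of strings")
    return next(iter(lengths.values()))


sample = {
    "question": ["What type of wine is Chianti and where is it from?"],
    "answer": ["Chianti is a dry red wine from Tuscany."],
    "contexts": [["Chianti; $10; Dry red wine from Tuscany; Drink"]],
    "ground_truth": ["Chianti is a dry red wine from Tuscany, Italy."],
}
print(check_columns(sample))
```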

first_entry = {
    "question": data["question"][0],
    "answer": data["answer"][0],
    "contexts": data["contexts"][0],
    "ground_truth": data["ground_truth"][0],
}
first_entry

{'question': 'What type of wine is Chianti and where is it from?',
 'answer': 'Chianti is a dry red wine from Tuscany.',
 'contexts': ['Elena was led to a table adorned with a simple, elegant setting. The first course was Caponata, a melody of eggplant, capers, and sweet tomatoes, which danced on her palate. Next came the Risotto al Nero di Seppia, a dish that told the tale of Sicily’s love affair with the sea. Each spoonful was a revelation, the rich flavors of squid ink',
  'As he grew, so did his desire to explore beyond the shores of Sicily. Venturing through Italy, Amico worked alongside renowned chefs, each teaching him a new facet of Italian cuisine. From the rolling hills of Tuscany to the romantic canals of Venice, he absorbed the diverse regional flavors, techniques, and traditions that would later influence',
  'Osso Buco; $20; Braised veal shanks with vegetables and broth; Main Dish\n\nRavioli; $13; Stuffed pasta with cheese or meat filling; Main Dish\n\nMinestrone Soup; $9; Vegetable soup with pasta or rice; Soup\n\nProsecco; $8; Italian sparkling white wine; Drink\n\nChianti; $10; Dry red wine from Tuscany; Drink',
  'Branzino; $21; Mediterranean sea bass, usually grilled or baked; Main Dish\n\nPorchetta; $18; Savory, fatty, and moist boneless pork roast; Main Dish\n\nMontepulciano Wine; $12; Full-bodied red wine; Drink\n\nBresaola; $14; Air-dried, salted beef served as an appetizer; Appetizer\n\nPesto Pasta; $12; Pasta with traditional basil pesto sauce; Main Dish'],
 'ground_truth': 'Chianti is a dry red wine from Tuscany, Italy.'}

Evaluate with ragas

context_relevancy
context_precision
context_recall
faithfulness
answer_relevancy

from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_relevancy,
    context_recall,
    context_precision,
)

result = evaluate(
    dataset=dataset,
    metrics=[
        context_relevancy,
        context_precision,
        context_recall,
        faithfulness,
        answer_relevancy,
    ],
)

Evaluating: 100%|██████████| 30/30 [00:09<00:00,  3.33it/s]
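To give a feel for what one of these scores means: as I understand the ragas documentation, context_precision averages precision@k over the ranks k at which a retrieved chunk was judged relevant, so relevant chunks ranked early score higher. The sketch below implements that formula under this assumption. In ragas the per-chunk relevance verdicts come from an LLM judge; here they are hard-coded, hypothetical inputs.

```python
def context_precision(verdicts: list[int]) -> float:
    """verdicts[i] is 1 if the chunk at rank i+1 was judged relevant, else 0."""
    relevant_so_far = 0
    weighted_sum = 0.0
    for rank, verdict in enumerate(verdicts, start=1):
        if verdict:
            relevant_so_far += 1
            weighted_sum += relevant_so_far / rank  # precision@rank, counted only at relevant ranks
    return weighted_sum / relevant_so_far if relevant_so_far else 0.0


print(context_precision([1, 1, 1, 1]))            # 1.0
print(context_precision([0, 1, 0, 0]))            # 0.5
print(round(context_precision([1, 1, 0, 1]), 6))  # 0.916667
```

With four retrieved chunks, a single relevant chunk in second place yields 0.5, while one irrelevant chunk in third place among otherwise relevant ones yields about 0.917.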

result.to_pandas().head()

question

answer

contexts

ground_truth

context_relevancy

context_precision

context_recall

faithfulness

answer_relevancy

What type of wine is Chianti and where is it f...

Chianti is a dry red wine from Tuscany.

[Elena was led to a table adorned with a simpl...

Chianti is a dry red wine from Tuscany, Italy.

0.058824

1.000000

0.5

1.0

0.961393

How does the squid ink harmonize with the crea...

The squid ink harmonizes with the creamy rice ...

[of squid ink harmonizing with the creamy rice...

The squid ink harmonizes with the creamy rice ...

0.000000

1.000000

1.0

0.959438

What is the philosophy of hospitality and its ...

The philosophy of hospitality, as exemplified ...

[Philosophy of Hospitality, For Amico, hospita...

The philosophy of hospitality emphasizes the w...

0.545455

0.916667

1.0

0.892042

How did Amico's exploration of Italy contribut...

Amico's exploration of Italy allowed him to wo...

[As he grew, so did his desire to explore beyo...

Amico's exploration of Italy allowed him to wo...

0.000000

1.000000

1.0

0.924004

Which dish costs $7 and includes small toasted...

Crostini

[Focaccia; $6; Oven-baked Italian bread; Side ...

Crostini

0.052632

0.500000

1.0

0.804980

Result Visualization

import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

df = result.to_pandas()

heatmap_data = df[
    [
        "context_relevancy",
        "context_precision",
        "context_recall",
        "faithfulness",
        "answer_relevancy",
    ]
]

cmap = LinearSegmentedColormap.from_list("green_red", ["red", "green"])

plt.figure(figsize=(10, 8))
sns.heatmap(heatmap_data, annot=True, fmt=".2f", linewidths=0.5, cmap=cmap)

# Heatmap cells span [i, i + 1], so place each label at the cell center.
plt.yticks(ticks=[i + 0.5 for i in range(len(df))], labels=df["question"], rotation=0)

plt.show()
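The heatmap shows scores per question; a per-metric average is a useful companion when comparing pipeline variants. The sketch below computes it with the standard library and skips missing values, assuming a metric that could not be computed is recorded as NaN. metric_means is a hypothetical helper and the two sample rows are made-up numbers, not results from this evaluation.

```python
import math
from statistics import mean


def metric_means(rows: list[dict[str, float]], metrics: list[str]) -> dict[str, float]:
    """Average each metric over the rows, ignoring missing or NaN scores."""
    summary = {}
    for metric in metrics:
        values = [
            row[metric]
            for row in rows
            if metric in row and not math.isnan(row[metric])
        ]
        summary[metric] = mean(values) if values else math.nan
    return summary


# Hypothetical scores for two questions; one faithfulness score is missing.
rows = [
    {"faithfulness": 1.0, "answer_relevancy": 0.9},
    {"faithfulness": float("nan"), "answer_relevancy": 0.7},
]
print(metric_means(rows, ["faithfulness", "answer_relevancy"]))
```

With a pandas DataFrame, the equivalent would be the column means, which skip NaN by default.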

Last updated 1 year ago
