
Zero-shot Classification

Zero-shot classification is the task of predicting classes that the model never saw during training. Because it leverages a pre-trained language model, it can be seen as an instance of transfer learning, which generally means using a model trained for one task in an application other than the one it was originally trained for.

This approach is especially useful when the amount of labeled data is small. In zero-shot classification, we give the model a prompt and a text sequence that describe, in natural language, the task we want performed, without including any completed examples of that task. This is what distinguishes it from single-shot and few-shot classification, which include one or a few examples of the chosen task.

Zero-, single-, and few-shot classification appear to be emergent capabilities of large language models, showing up once models exceed roughly 100 million parameters. A model's effectiveness at zero-, single-, or few-shot tasks also seems to scale with model size, meaning that larger models (those with more trainable parameters or layers) generally perform better at them.

Import Transformer

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

Tokenizer & Model

tokenizer = AutoTokenizer.from_pretrained("MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
model = AutoModelForSequenceClassification.from_pretrained("MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
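
The checkpoint name indicates a DeBERTa-v3 model fine-tuned on the MNLI, FEVER-NLI, and ANLI datasets. That matters because the zero-shot pipeline reformulates classification as natural language inference (NLI): the input text becomes the premise and each candidate label is slotted into a hypothesis sentence. A minimal sketch of that mechanism, reusing the tokenizer and model loaded above (the premise and hypothesis strings here are illustrative assumptions):

import torch

premise = "The telescope captured a rare planetary alignment."
hypothesis = "This example is about astronomy."  # a candidate label inside a template

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# look up this checkpoint's NLI label names rather than hardcoding indices
probs = logits.softmax(dim=-1)[0]
print({model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(probs)})
# the "entailment" probability serves as the zero-shot score for the label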

Pipeline

classifier = pipeline(
    "zero-shot-classification", 
    model=model, 
    tokenizer=tokenizer
)
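
Internally, the pipeline builds one hypothesis per candidate label from a template; the default in transformers is "This example is {}.". Passing hypothesis_template lets you phrase the hypothesis for your domain, as in this illustrative variation:

# a more specific template can sharpen scores for topical labels
print(classifier(
    "The telescope captured a rare planetary alignment.",
    ["astronomy", "cooking"],
    hypothesis_template="This text is about {}.",
))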

Inference

Multi-label Classification

text = """
A group of astronomers gathered at the observatory to witness the rare celestial event. As they peered through the telescopes, they observed the alignment of several planets, creating a breathtaking view against the night sky. The event, which hadn't occurred for decades, attracted enthusiasts and experts alike, all eager to record and study this astronomical phenomenon.
"""

candidate_labels = ["astronomy","cooking","science","space","music"]
print(classifier(text, candidate_labels, multi_label=True))

{'sequence': "\nA group of astronomers gathered at the observatory to witness the rare celestial event. As they peered through the telescopes, they observed the alignment of several planets, creating a breathtaking view against the night sky. The event, which hadn't occurred for decades, attracted enthusiasts and experts alike, all eager to record and study this astronomical phenomenon.\n", 'labels': ['science', 'astronomy', 'space', 'music', 'cooking'], 'scores': [0.9984110593795776, 0.9979022145271301, 0.9921674728393555, 0.0003671070153359324, 0.00015695678303018212]}
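
With multi_label=True the pipeline scores each candidate label independently, softmaxing the entailment logit against the contradiction logit for that label alone. The scores therefore do not sum to 1, and several labels can score high at once, as science, astronomy, and space do here.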

Single-label Classification

result = classifier(text, candidate_labels, multi_label=False)
print(result["labels"])
print(result["scores"])
['astronomy', 'science', 'space', 'music', 'cooking']
[0.4273662865161896, 0.4104445278644562, 0.1616421341896057, 0.00030228393734432757, 0.00024482482695020735]
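
With multi_label=False the entailment logits of all candidate labels are instead normalized together, so the scores form a single probability distribution. A quick sanity check, reusing the classifier from above:

result = classifier(text, candidate_labels, multi_label=False)
# scores sum to ~1.0 because exactly one label is assumed to apply
print(round(sum(result["scores"]), 4))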