
Axolotl Fine-tuning with QLoRA


Last updated 1 year ago

About Axolotl

Axolotl lets you fine-tune LLMs from a simple CLI. It is a tool designed to streamline fine-tuning of a wide range of AI models, and it supports many configurations and architectures.

Features:

  • Fine-tune a variety of Hugging Face models such as Llama, Gemma, and Mixtral

  • Supports full fine-tuning, LoRA, QLoRA, ReLoRA, and GPTQ

  • Customize the configuration with a simple yaml file or CLI overrides

  • Load different dataset formats, use custom formats, or bring your own pre-tokenized datasets

  • Integrated support for xformers, flash attention, rope scaling, and multipacking

  • Works on a single GPU or multiple GPUs via FSDP or DeepSpeed

  • Runs easily with Docker, locally or in the cloud

  • Logs results, and optionally checkpoints, to wandb (requires a linked Weights & Biases account)

You can run it directly from the command line, or from a Jupyter Notebook or Colab.

Axolotl Fine-tuning QLoRA Tutorial

import torch

# Training needs a CUDA-capable GPU, so fail early if none is visible.
assert torch.cuda.is_available(), "No CUDA GPU detected"

Install Axolotl & dependencies

Install Axolotl from its GitHub repository, together with the libraries used for acceleration and optimization.

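# Pin the PyTorch version.
%pip install torch=="2.1.2"
# Editable install of Axolotl straight from GitHub.
%pip install -e git+https://github.com/OpenAccess-AI-Collective/axolotl#egg=axolotl

# pip usually places editable VCS checkouts under src/ (for example ./src/axolotl),
# so adjust this path if the directory is not found.
%cd axolotl
# Acceleration, distributed training, and experiment tracking libraries.
%pip install flash-attn=="2.5.0"
%pip install deepspeed=="0.13.1"
%pip install mlflow=="2.13.0"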
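
Because the install pins specific versions, it can help to confirm what actually landed in the environment before training. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and the package list simply mirrors the pins above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Pins copied from the install cell above.
expected = {
    "torch": "2.1.2",
    "flash-attn": "2.5.0",
    "deepspeed": "0.13.1",
    "mlflow": "2.13.0",
}

for name, pinned in expected.items():
    found = installed_version(name)
    # startswith() tolerates local build suffixes such as "2.1.2+cu121".
    status = "ok" if found is not None and found.startswith(pinned) else "check"
    print(f"{name}: pinned {pinned}, found {found} [{status}]")
```

Any line marked `check` points to a missing package or a version that differs from the pin.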

Create yaml config file

Create the configuration yaml file. The yaml file can hold every setting: model, dataset, Accelerate, Flash Attention, PEFT, LoRA, and so on. This tutorial uses the following settings.

  • base_model: google/gemma-2b-it

  • Dataset: nlpai-lab/databricks-dolly-15k-ko

  • Adapter: QLoRA
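
The `alpaca` dataset type works with records that carry an instruction, an optional input, and an output. The original Dolly 15k release uses the column names `instruction`, `context`, and `response` instead. Whether the Korean version keeps that layout is an assumption here, so check the dataset card first. If a remap turns out to be needed, it could look like this sketch (the function name `dolly_to_alpaca` and the sample record are hypothetical):

```python
def dolly_to_alpaca(record: dict) -> dict:
    """Map a Dolly-style record onto Alpaca-style keys."""
    return {
        "instruction": record["instruction"],
        # Dolly calls the optional supporting passage "context".
        "input": record.get("context") or "",
        "output": record["response"],
    }


# Made-up record for illustration only, not taken from the dataset.
sample = {
    "instruction": "Summarize the passage in one sentence.",
    "context": "Axolotl is a tool for fine-tuning language models.",
    "response": "Axolotl simplifies fine-tuning of language models.",
    "category": "summarization",
}

print(dolly_to_alpaca(sample))
```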

import yaml

# Your YAML string
yaml_string = """
# use google/gemma-7b if you have access
base_model: google/gemma-2b-it
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: true
strict: false

# huggingface repo
datasets:
  - path: nlpai-lab/databricks-dolly-15k-ko
    type: alpaca
val_set_size: 0.1
output_dir: ./outputs/out

adapter: qlora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

sequence_len: 4096
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:


gradient_accumulation_steps: 3
micro_batch_size: 2
num_epochs: 4
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16: 
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_ratio: 0.1
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:


"""

# Convert the YAML string to a Python dictionary
yaml_dict = yaml.safe_load(yaml_string)

# Specify your file path
file_path = 'gemma-2b_axolotl.yaml'

# Write the YAML file
with open(file_path, 'w') as file:
    yaml.dump(yaml_dict, file)
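
Two numbers implied by this config are worth working out: the effective batch size and the LoRA scaling factor. The sketch assumes a single-GPU run (`num_gpus = 1` is not part of the config) and the usual LoRA convention that adapter updates are scaled by `lora_alpha / lora_r`.

```python
# Values copied from the YAML above.
config = {
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 3,
    "lora_r": 32,
    "lora_alpha": 16,
    "sequence_len": 4096,
}

num_gpus = 1  # assumption: single-GPU run

# Sequences that contribute to one optimizer step.
effective_batch = (
    config["micro_batch_size"] * config["gradient_accumulation_steps"] * num_gpus
)

# Scaling applied to the low-rank update under the usual LoRA convention.
lora_scale = config["lora_alpha"] / config["lora_r"]

# Upper bound on tokens per optimizer step when every sequence is full length.
max_tokens_per_step = effective_batch * config["sequence_len"]

print(effective_batch)      # 6
print(lora_scale)           # 0.5
print(max_tokens_per_step)  # 24576
```

If memory is tight, lowering `micro_batch_size` while raising `gradient_accumulation_steps` keeps the effective batch size unchanged.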

Launch the training

Now start fine-tuning with a single CLI command, pointing it at the gemma-2b_axolotl.yaml file created above.

!accelerate launch -m axolotl.cli.train gemma-2b_axolotl.yaml
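
Outside a notebook, the same launch can be driven from a Python script. The sketch below builds the identical command and runs it with `subprocess`. The helper names `build_train_command` and `launch_training` are hypothetical wrappers, not part of Axolotl.

```python
import subprocess
from pathlib import Path
from typing import List


def build_train_command(config_path: str) -> List[str]:
    """Return the accelerate command used in the notebook cell above."""
    return ["accelerate", "launch", "-m", "axolotl.cli.train", config_path]


def launch_training(config_path: str) -> int:
    """Run training and return the process exit code."""
    if not Path(config_path).is_file():
        raise FileNotFoundError(f"Config file not found: {config_path}")
    # check=False so the caller can inspect a non-zero exit code.
    result = subprocess.run(build_train_command(config_path), check=False)
    return result.returncode


print(" ".join(build_train_command("gemma-2b_axolotl.yaml")))
```

Calling `launch_training("gemma-2b_axolotl.yaml")` then blocks until training ends and returns the exit code.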

When training finishes, the fine-tuned result is saved to the path set by output_dir: ./outputs/out in the yaml. From there it can be pushed to the Hugging Face Hub.
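
One way to upload the result is the `huggingface_hub` client. This sketch assumes the package is installed, that you are already authenticated (for example via `huggingface-cli login`), and that training wrote a PEFT adapter, including `adapter_config.json`, into the output directory. The repo id is a placeholder, and `has_adapter` and `push_adapter` are hypothetical helper names.

```python
from pathlib import Path


def has_adapter(output_dir: str) -> bool:
    """Return True if the directory looks like a saved PEFT adapter."""
    return (Path(output_dir) / "adapter_config.json").is_file()


def push_adapter(output_dir: str, repo_id: str) -> None:
    """Upload the contents of `output_dir` to a model repo on the Hub."""
    # Imported lazily so has_adapter() works without huggingface_hub installed.
    from huggingface_hub import HfApi

    if not has_adapter(output_dir):
        raise FileNotFoundError(f"No adapter_config.json in {output_dir}")

    api = HfApi()
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    # Uploads every file in the folder, including any checkpoint subfolders.
    api.upload_folder(folder_path=output_dir, repo_id=repo_id, repo_type="model")


# Example call (placeholder repo id; needs network access and a token):
# push_adapter("./outputs/out", "your-username/your-model-name")
print(has_adapter("./outputs/out"))
```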

Inference

Inference is launched the same way, by pointing the CLI at the inference module. Adding --gradio serves the model through a Gradio interface automatically.

!accelerate launch -m axolotl.cli.inference gemma-2b_axolotl.yaml \
    --lora_model_dir="./outputs/out" --gradio
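
For programmatic inference, the adapter can also be loaded with `transformers` and `peft` directly. The sketch below assumes both libraries plus `bitsandbytes` and `accelerate` are installed, that the adapter sits in `./outputs/out`, and that prompts follow the standard Alpaca template. Axolotl's own template may differ in detail. `build_alpaca_prompt` and `generate` are hypothetical helper names.

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request with the standard Alpaca prompt template."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


def generate(
    instruction: str,
    adapter_dir: str = "./outputs/out",  # assumption: matches output_dir
    base_model: str = "google/gemma-2b-it",
    max_new_tokens: int = 128,
) -> str:
    """Load the 4-bit base model plus the adapter and generate a reply."""
    # Heavy imports stay inside the function so the prompt helper runs anywhere.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(
        base_model,
        quantization_config=BitsAndBytesConfig(
            load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16
        ),
        device_map="auto",
    )
    model = PeftModel.from_pretrained(model, adapter_dir)
    model.eval()

    inputs = tokenizer(build_alpaca_prompt(instruction), return_tensors="pt")
    inputs = inputs.to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


print(build_alpaca_prompt("대한민국의 수도는 어디인가요?"))
```

Calling `generate("...")` requires a GPU and access to the gated Gemma weights, so only the prompt helper is executed above.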
GitHub: OpenAccess-AI-Collective/axolotl (https://github.com/OpenAccess-AI-Collective/axolotl)