
🎯 Fine-tune large language models and use them for text-related tasks. This repository provides a straightforward approach to fine-tuning models like Gemma, Llama 🦙, and Mistral 🌪️ for various NLP tasks. 🔧 It includes training 📚, fine-tuning 🛠️, and inference pipelines ⚙️. 🚀


🎯 Fine-Tuning Language Models

A comprehensive, beginner-friendly repository for fine-tuning large language models using modern techniques like LoRA, DPO, and Unsloth optimizations.


License: MIT · Python 3.8+ · Runs on Google Colab

🚀 Quick Start (5 Minutes)

# Install dependencies
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

# Load and fine-tune in 3 lines
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained("unsloth/llama-3-8b-bnb-4bit")
model = FastLanguageModel.get_peft_model(model, r=16)

# Add your training code here
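The training step itself might look like the following minimal sketch using TRL's SFTTrainer with a tiny in-memory dataset; the dataset contents and every hyperparameter here are illustrative assumptions, not part of the repo:

from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Toy dataset: in practice, load your own instruction data
train_data = Dataset.from_list([
    {"text": "### Instruction:\nExplain photosynthesis\n\n### Response:\nPhotosynthesis is ..."},
])

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_data,
    dataset_text_field="text",   # column holding the formatted prompt
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=100,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()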

🧭 Navigation Guide

πŸ“ Folder 🎯 Purpose πŸ‘₯ Best For ⏱️ Time Needed
FineTuning LanguageModels Basic concepts & simple examples Beginners, students 1-2 hours
FineTuning LargeLanguageModels Advanced techniques & model-specific guides Practitioners, researchers 3-5 hours
FineTuning VisionModel Vision-language model fine-tuning Computer vision researchers 2-3 hours
Advanced FineTuning Methods Multi-LoRA & LoRA composition techniques Advanced researchers, ML engineers 3-4 hours
Task Specific FineTuning Specialized task fine-tuning (Code generation) Domain specialists, developers 2-3 hours
Reinforcement Learning Based FineTning Human preference alignment (DPO, RLHF) Advanced users, alignment researchers 2-4 hours
docs/ Comprehensive guides & troubleshooting All skill levels Reference

🤔 Which Technique Should I Use?

🔰 I'm New to Fine-Tuning

→ Start with Basic LoRA Fine-tuning

  • ✅ Easy to understand
  • ✅ Low memory requirements
  • ✅ Good results for most tasks

💻 I Have Limited GPU Memory

→ Use 4-bit Quantized LoRA

  • ✅ Works on a Google Colab T4
  • ✅ 70% less memory usage
  • ✅ Minimal performance loss
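As a concrete sketch (same model as the Quick Start; the sequence length and rank are illustrative choices):

from unsloth import FastLanguageModel

# load_in_4bit stores weights in 4-bit NF4, cutting memory roughly 4x vs FP16
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(model, r=16)  # LoRA adapters on the 4-bit base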

🎯 I Want Better Instruction Following

→ Try DPO Training

  • ✅ Improves response quality
  • ✅ Better human alignment
  • ✅ No reward model needed
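A hedged sketch with TRL's DPOTrainer, matching the trl<0.9.0 API this repo pins; pref_dataset is an assumed preference dataset, not provided here:

from transformers import TrainingArguments
from trl import DPOTrainer

trainer = DPOTrainer(
    model=model,            # your LoRA-wrapped model
    ref_model=None,         # with a PEFT model, TRL derives the frozen reference internally
    beta=0.1,               # KL penalty: how far the policy may drift from the reference
    train_dataset=pref_dataset,  # assumed: columns "prompt", "chosen", "rejected"
    tokenizer=tokenizer,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        learning_rate=5e-5,
        max_steps=100,
        output_dir="dpo_outputs",
    ),
)
trainer.train()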

πŸ–ΌοΈ I Need Vision-Language Models

β†’ Use Vision Model Fine-tuning

  • βœ… Image + text processing
  • βœ… Mathematical formula recognition
  • βœ… Multimodal AI applications

🧬 I Need Multiple Domain Expertise

→ Try Multi-LoRA Methods

  • ✅ Multiple skills without forgetting
  • ✅ Dynamic adapter switching
  • ✅ Combine different areas of expertise
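With PEFT, switching adapters at runtime takes a few calls; a sketch with hypothetical adapter paths and names:

from peft import PeftModel

# Attach one adapter, then load a second alongside it (paths/names are hypothetical)
model = PeftModel.from_pretrained(base_model, "path/to/math-lora", adapter_name="math")
model.load_adapter("path/to/code-lora", adapter_name="code")

model.set_adapter("code")  # route generation through the code adapter
model.set_adapter("math")  # switch back without reloading base weights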

💻 I Need Specialized Code Generation

→ Use Task-Specific Fine-tuning

  • ✅ Advanced code generation
  • ✅ Multi-language programming support
  • ✅ Optimized with Qwen2.5-Coder
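Inference with a coder model through Unsloth could look like this sketch; the checkpoint name is an assumption (Unsloth publishes 4-bit variants under similar names), and the prompt is illustrative:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit",  # assumed checkpoint name
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster generation path

inputs = tokenizer("Write a Python function that reverses a string.", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))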

📊 Memory & Performance Guide

| Model | Size | Min GPU Memory | Training Time (100 steps) | Best Use Case |
|---|---|---|---|---|
| Phi-3 Mini | 3.8B | 6GB | 10 mins | Coding tasks, efficiency |
| Llama-3-8B | 8B | 12GB | 20 mins | General conversation |
| Qwen2.5-7B | 7.6B | 10GB | 18 mins | Multilingual, math |
| Qwen2-VL-7B | 7B | 8GB | 25 mins | Vision-language tasks |

💡 Tip: All memory requirements assume 4-bit quantization with LoRA.

πŸ› οΈ Installation

Option 1: Google Colab (Recommended)

!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "trl<0.9.0" transformers datasets

Option 2: Local Installation

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install "unsloth[cu121-ampere-torch230] @ git+https://github.com/unslothai/unsloth.git"

🔧 Troubleshooting

❌ Getting CUDA out of memory errors? → See the Memory Optimization Guide

❌ Model not following instructions? → Try DPO Training

❌ Training too slow? → Check the Performance Optimization Tips

📚 Learning Path

  1. Week 1: Basic Fine-tuning Concepts
  2. Week 2: LoRA and PEFT Techniques
  3. Week 3: Vision-Language Models
  4. Week 4: Human Preference Alignment
  5. Week 5: Advanced Multi-LoRA Methods
  6. Week 6: Task-Specific Applications

πŸ› οΈ Technologies

Core Technologies

Python PyTorch 🤗 Transformers Unsloth Jupyter CUDA LoRA PEFT

Advanced Optimization

TRL BitsAndBytes Datasets Accelerate WandB DPO RLHF Quantization

🤔 What is Fine-Tuning?

Fine-tuning is like teaching a smart student a new skill:

  • 🧠 Pre-trained model = Student who already knows language
  • 📚 Your dataset = Textbook for the new skill
  • ⚙️ Fine-tuning = Practice sessions to master the skill
  • 🎯 Fine-tuned model = Expert in your specific task

💡 Why Fine-Tuning?

| Approach | Cost | Time | Data Needed | Results |
|---|---|---|---|---|
| Training from scratch | $1M+ | Months | Billions of tokens | 100% |
| Fine-tuning | $10-100 | Hours | Thousands of examples | 95%+ |

✅ 99% cost reduction while maintaining excellent performance!

🧠 Understanding the Process

Step 1: Choose Your Base Model

# Start with a pre-trained model that already knows language
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained("unsloth/llama-3-8b-bnb-4bit")

Step 2: Add Task-Specific Training

# Your specialized dataset
dataset = [
    {"instruction": "Explain photosynthesis", "output": "Photosynthesis is..."},
    {"instruction": "Write code", "output": "def function():..."}
]
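Before training, each instruction/output pair is typically flattened into a single prompt string; a minimal sketch of one common (assumed) template:

def format_example(example):
    # Collapse an instruction/output pair into one training string
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )

texts = [format_example(ex) for ex in dataset]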

Step 3: Efficient Training with LoRA

# Train only 1% of parameters
model = FastLanguageModel.get_peft_model(model, r=16)  # LoRA magic

🎯 Repository Overview

This repository contains everything you need to master fine-tuning, from beginner-friendly tutorials to advanced optimization techniques.

🔰 Beginner Level

Learn the fundamentals with hands-on examples and step-by-step guides.

🚀 Intermediate Level

Explore advanced techniques like DPO training and model optimization.

🔬 Expert Level

Master vision-language models and human preference alignment.

🆘 Need Help?

See the docs/ folder for comprehensive guides and troubleshooting.

📄 License

MIT License - see LICENSE file for details.


⭐ Star this repo if it helped you fine-tune your models!

🔄 Full Fine-Tuning

Updates all model parameters during training. This requires high computational resources but achieves the best task-specific performance.

⚑ Parameter-Efficient Fine-Tuning (PEFT)

Updates only a small subset of parameters, cutting computational cost by roughly 90% while closely matching full fine-tuning performance.
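A minimal PEFT sketch; the target modules listed are the usual attention projections in Llama-style models, an assumption for other architectures:

from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, config)  # base_model: any HF causal LM, assumed loaded
model.print_trainable_parameters()          # prints trainable vs total parameter counts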

📊 Quantization: Memory Optimization

Quantization reduces model weight precision from 32-bit to lower precision (8-bit, 4-bit), dramatically reducing memory usage while maintaining performance.

🎯 Precision Formats

  • FP32: 4 bytes per parameter (baseline)
  • INT8: 1 byte per parameter (75% memory reduction)
  • INT4: 0.5 bytes per parameter (87.5% memory reduction)

📈 Example: 7B Parameter Model

FP32: 7B × 4 bytes = 28 GB
INT8: 7B × 1 byte = 7 GB (75% reduction)
INT4: 7B × 0.5 bytes = 3.5 GB (87.5% reduction)
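In the Hugging Face stack this is configured with bitsandbytes at load time; a sketch (the model id is gated and shown only for illustration, and NF4/FP16 are common choices rather than requirements):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NormalFloat storage
    bnb_4bit_compute_dtype=torch.float16,   # compute in FP16, store in 4-bit
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    quantization_config=bnb_config,
)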

🎯 LoRA: Low-Rank Adaptation

LoRA decomposes weight updates into low-rank matrices, reducing trainable parameters by 99% while maintaining performance.

πŸ“ Mathematical Foundation

W' = W + Ξ”W
Ξ”W = A Γ— B
  • W: Original pre-trained weights (frozen)
  • A: Low-rank matrix (d Γ— r)
  • B: Low-rank matrix (r Γ— k)
  • r: Rank (much smaller than d or k)
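A toy PyTorch rendering of these formulas, a sketch rather than a production implementation (the alpha/r scaling and the zero-init of B follow common LoRA convention):

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """y = W(x) + (x A B) * scaling, with W frozen and only A, B trainable."""
    def __init__(self, d, k, r=16, alpha=32):
        super().__init__()
        self.W = nn.Linear(d, k, bias=False)
        self.W.weight.requires_grad = False              # freeze pre-trained weights
        self.A = nn.Parameter(torch.randn(d, r) * 0.01)  # d x r, small random init
        self.B = nn.Parameter(torch.zeros(r, k))         # r x k, zero init so ΔW starts at 0
        self.scaling = alpha / r

    def forward(self, x):
        return self.W(x) + (x @ self.A @ self.B) * self.scaling

layer = LoRALinear(4096, 4096, r=16)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 131072 -- matching the reduction example below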

🔢 Rank Selection Guidelines

  • r = 8-16: Standard choice, good balance
  • r = 32-64: Complex adaptations
  • r = 128+: Approaching full fine-tuning

📊 Parameter Reduction Example

A 4096 × 4096 layer with r = 16:

Original: 4096 × 4096 = 16,777,216 parameters
LoRA: (4096 × 16) + (16 × 4096) = 131,072 parameters
Reduction: 99.2% fewer parameters

🔧 Adapters: Modular Fine-Tuning

Small neural network modules inserted between transformer layers for task-specific adaptation.

πŸ—οΈ Architecture

Input β†’ Layer Norm β†’ Adapter β†’ Residual Connection β†’ Output
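A sketch of the classic bottleneck adapter in PyTorch (hidden and bottleneck sizes are illustrative choices):

import torch.nn as nn

class Adapter(nn.Module):
    """Down-project, non-linearity, up-project, plus the residual connection."""
    def __init__(self, hidden_size=4096, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))  # residual keeps the original signal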

βš–οΈ LoRA vs Adapters

Aspect LoRA Adapters
Parameter Count 0.1-1% 2-4%
Training Speed Faster Moderate
Modularity Limited High

πŸ“ Repository Structure

This repository contains six specialized folders:

Foundational fine-tuning techniques and inference examples for beginners.

Advanced model-specific fine-tuning using UnSloth, LoRA, and quantization techniques.

Vision-language model fine-tuning for multimodal AI applications.

Cutting-edge Multi-LoRA and LoRA composition techniques for multiple domain expertise.

Specialized fine-tuning for domain-specific tasks like code generation.

Human preference alignment using Direct Preference Optimization (DPO) and RLHF.

🚀 Getting Started

# Clone repository
git clone https://github.com/Abeshith/FineTuning_LanguageModels.git
cd FineTuning_LanguageModels

# Install dependencies
pip install transformers datasets torch torchvision
pip install unsloth peft bitsandbytes trl accelerate
