diff --git a/ai/ml-notes/fine-tuning-notes.md b/ai/ml-notes/fine-tuning-notes/fine-tuning-notes.md similarity index 100% rename from ai/ml-notes/fine-tuning-notes.md rename to ai/ml-notes/fine-tuning-notes/fine-tuning-notes.md diff --git a/ai/ml-notes/fine-tuning-notes/fine-tuning-tools.md b/ai/ml-notes/fine-tuning-notes/fine-tuning-tools.md new file mode 100644 index 00000000..88f490c3 --- /dev/null +++ b/ai/ml-notes/fine-tuning-notes/fine-tuning-tools.md @@ -0,0 +1,134 @@ +# Fine-Tuning Tools and Frameworks - Notes + +## Table of Contents + - [Introduction](#introduction) + - [Key Concepts](#key-concepts) + - [Applications](#applications) + - [Fine-Tuning Workflow](#fine-tuning-workflow) + - [Popular Tools and Frameworks](#popular-tools-and-frameworks) + - [Self-Practice / Hands-On Examples](#self-practice--hands-on-examples) + - [Pitfalls & Challenges](#pitfalls--challenges) + - [Feedback & Evaluation](#feedback--evaluation) + - [Hello World! (Practical Example)](#hello-world-practical-example) + - [Advanced Exploration](#advanced-exploration) + - [Zero to Hero Lab Projects](#zero-to-hero-lab-projects) + - [Continuous Learning Strategy](#continuous-learning-strategy) + - [References](#references) + +## Introduction +- `Fine-tuning` is the process of adapting a pre-trained model to a new, often more specific, task. This process leverages the general knowledge learned by the model to achieve high performance on a specialized dataset with limited training. + +### Key Concepts +- **Transfer Learning**: Utilizing a model trained on a broad task and adjusting it for a related, narrower task. +- **Freezing Layers**: Locking layers in the model to prevent them from updating during training, usually done for layers that capture fundamental features. +- **Learning Rate Scheduling**: Adjusting the learning rate during fine-tuning to stabilize training and optimize performance. +- **Common Misconception**: Fine-tuning a model doesn’t always improve performance if the pre-trained model is not aligned with the new task. + +### Applications +- **Natural Language Processing (NLP)**: Fine-tuning language models on specific domains like legal, medical, or scientific text. +- **Computer Vision**: Adapting pre-trained models for object detection, image classification, or segmentation on custom datasets. +- **Speech Recognition**: Tuning models on specific accents, languages, or vocabulary. +- **Recommender Systems**: Using fine-tuning to improve model recommendations for niche markets or specialized interests. +- **Healthcare Diagnostics**: Tailoring models trained on general medical images to specific diagnostic tasks. + +## Fine-Tuning Workflow +1. **Select a Pre-trained Model**: Choose a model pre-trained on a similar task or a general large dataset. +2. **Prepare Data**: Collect and preprocess domain-specific data to adapt the model to the new task. +3. **Freeze Layers**: Decide which layers to freeze or keep unfrozen depending on similarity to the original task. +4. **Adjust Parameters**: Set learning rate, batch size, and other hyperparameters. +5. **Train on Target Task**: Begin fine-tuning, with or without frozen layers, to update the model. +6. **Evaluate and Optimize**: Assess the model on validation data and make adjustments if needed. + +## Popular Tools and Frameworks + +### General Frameworks +1. **TensorFlow and Keras**: + - *Overview*: Offers high-level APIs for fine-tuning and transfer learning. 
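+  - *Example*: a minimal transfer-learning sketch with the Keras API, freezing a pretrained backbone and training a new classification head (the MobileNetV2 backbone, 10-class head, and `train_ds`/`val_ds` datasets are illustrative assumptions, not from these notes):
+    ```python
+    import tensorflow as tf
+
+    # Load an ImageNet-pretrained backbone without its classification head
+    base = tf.keras.applications.MobileNetV2(
+        weights="imagenet", include_top=False, input_shape=(224, 224, 3)
+    )
+    base.trainable = False  # freeze the pretrained feature extractor
+
+    # Attach a new head sized for the (assumed) 10-class target task
+    model = tf.keras.Sequential([
+        base,
+        tf.keras.layers.GlobalAveragePooling2D(),
+        tf.keras.layers.Dense(10, activation="softmax"),
+    ])
+    model.compile(optimizer="adam",
+                  loss="sparse_categorical_crossentropy",
+                  metrics=["accuracy"])
+    # model.fit(train_ds, validation_data=val_ds, epochs=5)  # dataset-specific
+    ```
+    A common next step is to unfreeze the top few backbone layers and continue training at a much lower learning rate.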
+ - *Pros*: Easy-to-use, supports layer freezing, and compatible with TensorFlow Lite for deployment. + - *Cons*: Higher computational requirements compared to lighter frameworks. +2. **PyTorch**: + - *Overview*: Highly flexible and widely used for fine-tuning tasks. + - *Pros*: Dynamic computation graph, strong community, supports Hugging Face integration. + - *Cons*: Requires more customization, especially for deployment. + +### Specialized Fine-Tuning Tools +1. **Hugging Face Transformers**: + - *Overview*: Comprehensive library for NLP and vision models, with strong fine-tuning support. + - *Pros*: Extensive model repository, easy-to-use APIs, integrates with PyTorch. + - *Cons*: Mostly focused on NLP, though support for vision models is growing. +2. **Transfer Learning Toolkit by NVIDIA (TLT)**: + - *Overview*: Designed for accelerated transfer learning in vision, particularly for NVIDIA hardware. + - *Pros*: Optimized for GPU use, supports domain adaptation, includes pruning. + - *Cons*: Limited to specific NVIDIA ecosystems. +3. **Ultralytics YOLO**: + - *Overview*: Tool for fine-tuning the YOLO object detection model on custom datasets. + - *Pros*: High performance, fast fine-tuning for object detection. + - *Cons*: Primarily for object detection and less adaptable to other tasks. + +### Domain-Specific Frameworks +1. **AutoGluon**: + - *Overview*: Simplifies model tuning for tabular, image, and text data. + - *Pros*: AutoML-based approach, optimized for fast deployment. + - *Cons*: Limited customization, mostly geared towards entry-level users. +2. **FastAI**: + - *Overview*: Built on PyTorch, focuses on high-level APIs for vision and text fine-tuning. + - *Pros*: Beginner-friendly, effective for vision and tabular data. + - *Cons*: Less modular for advanced customizations, narrower model choices. + +## Self-Practice / Hands-On Examples +1. **Fine-tune BERT with Hugging Face**: Use Hugging Face Transformers to fine-tune BERT on a custom text dataset. +2. **Image Classification with Transfer Learning**: Fine-tune a ResNet model in PyTorch on a small image dataset. +3. **Object Detection with Ultralytics YOLO**: Use YOLO for fine-tuning on a custom object detection dataset. +4. **AutoGluon for Tabular Data**: Apply AutoGluon to a tabular dataset and fine-tune the model with minimal setup. +5. **FastAI with Domain-Specific Images**: Fine-tune a model in FastAI on a unique set of images (e.g., medical or satellite). + +## Pitfalls & Challenges +- **Overfitting**: Fine-tuning with limited data can lead to overfitting on specific features of the new dataset. +- **Misalignment with Pre-trained Model**: Using a model that wasn’t trained on related data may hinder rather than help. +- **Catastrophic Forgetting**: The model may lose generalization capabilities if fine-tuning changes weights too drastically. +- **Hyperparameter Tuning**: Optimizing learning rates, regularization, and batch sizes is essential to avoid unstable training. + +## Feedback & Evaluation +- **Self-Explanation**: Describe the steps and choices in fine-tuning, focusing on why specific parameters were set. +- **Validation Metrics**: Compare model performance pre- and post-fine-tuning using accuracy, F1, or precision. +- **Domain-Specific Testing**: Evaluate the fine-tuned model on the new task’s unique requirements. + +## Hello World! 
(Practical Example) +- **Fine-tuning a ResNet Model in PyTorch**: + ```python + import torch + from torchvision import models, transforms + from torch import nn, optim + + # Load pre-trained ResNet model + model = models.resnet18(pretrained=True) + + # Freeze all layers except the last + for param in model.parameters(): + param.requires_grad = False + model.fc = nn.Linear(model.fc.in_features, 10) # Update for 10 classes + + # Prepare data and fine-tune + optimizer = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9) + criterion = nn.CrossEntropyLoss() + # Training loop to be added here as per task specifics. + ``` + +## Advanced Exploration +- **Research Papers on Transfer Learning**: Explore recent studies on advancements in fine-tuning techniques. +- **Curriculum Learning**: A method for structuring data exposure to improve fine-tuning results. +- **Contrastive Fine-Tuning**: Study contrastive learning approaches to enhance model adaptability to new tasks. + +## Zero to Hero Lab Projects +1. **Project 1**: Fine-tune a speech recognition model for a unique accent or dialect. +2. **Project 2**: Create a custom object detector for detecting specific items in an industrial setting using Ultralytics YOLO. +3. **Project 3**: Adapt a natural language model for summarizing legal text using Hugging Face Transformers. + +## Continuous Learning Strategy +- **Next Steps**: Try distilling a fine-tuned model to further optimize it for deployment. +- **Related Topics**: Explore Distillation, Model Compression, and Architecture Search to deepen understanding. + +## References +- *Transfer Learning with Convolutional Neural Networks for Medical Imaging* by Rajpurkar et al. +- *A Comprehensive Survey on Transfer Learning* by Pan and Yang. +- Hugging Face documentation on fine-tuning: [https://huggingface.co/docs](https://huggingface.co/docs) \ No newline at end of file diff --git a/ai/ml-notes/fine-tuning-notes/model-compression-notes.md b/ai/ml-notes/fine-tuning-notes/model-compression-notes.md new file mode 100644 index 00000000..9fb6b887 --- /dev/null +++ b/ai/ml-notes/fine-tuning-notes/model-compression-notes.md @@ -0,0 +1,131 @@ +# Model Compression - Notes + +## Table of Contents +- [Introduction](#introduction) +- [Key Concepts](#key-concepts) +- [Applications](#applications) +- [Model Compression Techniques](#model-compression-techniques) +- [Compression Process & Pipelines](#model-compression-process--pipelines) +- [Key Models and Frameworks](#key-models-and-frameworks) +- [Self-Practice / Hands-On Examples](#self-practice--hands-on-examples) +- [Pitfalls & Challenges](#pitfalls--challenges) +- [Feedback & Evaluation](#feedback--evaluation) +- [Tools, Libraries & Frameworks](#tools-libraries--frameworks) +- [Hello World! (Practical Example)](#hello-world-practical-example) +- [Advanced Exploration](#advanced-exploration) +- [Zero to Hero Lab Projects](#zero-to-hero-lab-projects) +- [Continuous Learning Strategy](#continuous-learning-strategy) +- [References](#references) + +## Introduction +- `Model compression` is a set of techniques used to reduce the size and computational demands of machine learning models without significantly compromising performance, enabling deployment on resource-constrained devices like smartphones, IoT, or embedded systems. + +### Key Concepts +- **Quantization**: Reducing the precision of numbers in the model (e.g., 32-bit floating points to 8-bit integers). 
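+  - *Worked example (illustrative)*: with a symmetric int8 scheme, a tensor whose largest absolute weight is 1.0 gets scale s = 1.0 / 127 ≈ 0.0079, so a weight of 0.42 is stored as round(0.42 / s) = 53 and decoded as 53 × s ≈ 0.417, using 8 bits per value instead of 32.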
+- **Pruning**: Removing less significant weights or neurons that contribute minimally to the model's output. +- **Knowledge Distillation**: Training a smaller model (student) to mimic the outputs of a larger model (teacher). +- **Neural Architecture Search (NAS)**: Optimizing the architecture to balance size and performance. +- **Common Misconception**: Smaller models necessarily lose accuracy—effective compression can retain most of a model’s original accuracy. + +### Applications +- **Mobile Applications**: Compressing models for on-device AI, like image recognition or speech processing. +- **IoT & Edge Computing**: Enabling real-time inference on constrained devices for tasks such as anomaly detection, security, and monitoring. +- **Autonomous Vehicles**: Reducing computational loads in embedded systems for object detection or navigation. +- **Healthcare**: Deploying models for diagnostics on portable medical devices where memory and power are limited. +- **Environmental Monitoring**: Low-power ML applications on drones or sensors for real-time analysis in remote areas. + +## Model Compression Techniques +- **Quantization**: Converts high-precision weights and activations to lower precision, reducing model size and computation. +- **Pruning**: Eliminates redundant parameters, often by removing nodes or filters with minimal effect. +- **Knowledge Distillation**: Trains a small model to mimic the "knowledge" or outputs of a larger model. +- **Weight Sharing**: Limits the diversity of weights in the model by sharing parameters among similar layers. +- **Neural Architecture Search (NAS)**: Optimizes model structure specifically for reduced size or processing requirements. + +### Description +- **Quantization**: + - Example: Transforming weights from 32-bit floats to 8-bit integers can reduce model size by 75%. +- **Pruning**: + - Step-by-step: (1) Identify less important parameters, (2) remove or zero them out, (3) retrain model to recover lost accuracy. +- **Distillation**: + - Teacher-student training: A smaller “student” network learns from the predictions of a larger, pre-trained “teacher” network. + +## Model Compression Process & Pipelines +1. **Select a Model**: Choose an initial model architecture. +2. **Quantize or Prune**: Apply quantization or pruning to reduce model complexity. +3. **Distill Knowledge**: If applicable, train a compact model to emulate a larger one’s behavior. +4. **Optimize**: Fine-tune to regain any lost accuracy. +5. **Validate & Deploy**: Test the model on real-world data to ensure it meets requirements. + +### Example Pipeline + +```mermaid +graph LR; + Start(Original Model) --> Quantize[Quantization] --> Prune[Pruning] --> Distill[Knowledge Distillation] + Distill --> FineTune(Fine-Tuning) --> Validate[Validate on Dataset] --> Deploy[Deploy Model] +``` + +## Key Models and Frameworks +- **MobileNet**: Designed specifically for mobile and edge applications with minimal parameters. +- **EfficientNet**: Uses compound scaling to balance width, depth, and resolution. +- **YOLO-Nano**: Lightweight version of YOLO for real-time object detection on limited hardware. +- **TensorFlow Lite** and **ONNX Runtime**: Frameworks optimized for deploying compressed models on edge devices. +- **TinyML Models**: Specialized models for ultra-low-power applications, such as in microcontrollers. + +## Self-Practice / Hands-On Examples +1. **Quantization Exercise**: Apply quantization to a neural network model in TensorFlow Lite and observe the impact on size and accuracy. +2. 
**Pruning Experiment**: Use PyTorch to prune a model layer-by-layer and monitor changes in performance.
+3. **Knowledge Distillation**: Implement a teacher-student model and compare the student's performance against the original large model.
+4. **NAS with MobileNet**: Use NAS tools to create a smaller variant of MobileNet tailored for a specific edge device.
+5. **Benchmarking**: Compare the inference times of compressed models versus the original on an edge device like Raspberry Pi.
+
+## Pitfalls & Challenges
+- **Loss of Accuracy**: Over-compression can lead to significant drops in model performance.
+- **Inference Latency**: Compressed models may still suffer from high latency depending on the device.
+- **Compatibility Issues**: Different devices and frameworks may not support all compression techniques.
+- **Training Complexity**: Techniques like knowledge distillation require additional training stages.
+
+## Feedback & Evaluation
+- **Feynman Test**: Explain model compression techniques as if teaching someone with no AI experience.
+- **Real-World Simulation**: Deploy on a constrained device and measure performance metrics.
+- **Benchmark Testing**: Use model accuracy, latency, and memory usage to evaluate effectiveness.
+
+## Tools, Libraries & Frameworks
+1. **TensorFlow Lite**: Great for quantization and deployment on mobile/edge devices.
+2. **ONNX Runtime**: Supports model optimization for edge environments.
+3. **PyTorch quantization and pruning utilities**: Built into PyTorch as `torch.quantization` and `torch.nn.utils.prune`.
+4. **Distiller (Intel AI Lab)**: Open-source tool for pruning and quantization in PyTorch.
+5. **Apache TVM**: Compiles models for optimized edge inference across diverse hardware.
+
+## Hello World! (Practical Example)
+- **Quantization Example** in TensorFlow Lite:
+  ```python
+  import tensorflow as tf
+
+  # Convert model to TFLite format with quantization
+  # (Optimize.DEFAULT applies post-training dynamic-range quantization of the weights)
+  converter = tf.lite.TFLiteConverter.from_saved_model("model_path")
+  converter.optimizations = [tf.lite.Optimize.DEFAULT]
+  tflite_model = converter.convert()
+
+  # Save and deploy tflite_model
+  with open("compressed_model.tflite", "wb") as f:
+      f.write(tflite_model)
+  ```
+
+## Advanced Exploration
+- **Model Compression Survey**: Dive into recent research papers on model compression trends.
+- **Distillation Techniques for Vision Models**: Advanced articles on distilling complex vision models.
+- **Quantization-Aware Training**: Learn techniques that incorporate quantization during training to retain accuracy.
+
+## Zero to Hero Lab Projects
+- **Project 1**: Build and compress a model for real-time object detection on Raspberry Pi.
+- **Project 2**: Develop a distillation pipeline for a large language model using PyTorch.
+- **Project 3**: Use pruning and quantization to optimize a healthcare diagnostic model for a portable device.
+
+## Continuous Learning Strategy
+- **Next Steps**: Explore hardware-specific optimizations for models (like NVIDIA TensorRT).
+- **Related Topics**: Dive into Edge AI, TinyML, and Embedded Systems for further specialization.
+
+## References
+- *EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks* by Mingxing Tan and Quoc V. Le
+- *Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding* by Song Han et al.
+- *Distilling the Knowledge in a Neural Network* by Geoffrey Hinton, Oriol Vinyals, and Jeff Dean