diff --git a/ai/ml-notes/fine-tuning-notes.md b/ai/ml-notes/fine-tuning-notes/fine-tuning-notes.md similarity index 100% rename from ai/ml-notes/fine-tuning-notes.md rename to ai/ml-notes/fine-tuning-notes/fine-tuning-notes.md diff --git a/ai/ml-notes/fine-tuning-notes/fine-tuning-tools.md b/ai/ml-notes/fine-tuning-notes/fine-tuning-tools.md new file mode 100644 index 00000000..88f490c3 --- /dev/null +++ b/ai/ml-notes/fine-tuning-notes/fine-tuning-tools.md @@ -0,0 +1,134 @@ +# Fine-Tuning Tools and Frameworks - Notes + +## Table of Contents + - [Introduction](#introduction) + - [Key Concepts](#key-concepts) + - [Applications](#applications) + - [Fine-Tuning Workflow](#fine-tuning-workflow) + - [Popular Tools and Frameworks](#popular-tools-and-frameworks) + - [Self-Practice / Hands-On Examples](#self-practice--hands-on-examples) + - [Pitfalls & Challenges](#pitfalls--challenges) + - [Feedback & Evaluation](#feedback--evaluation) + - [Hello World! (Practical Example)](#hello-world-practical-example) + - [Advanced Exploration](#advanced-exploration) + - [Zero to Hero Lab Projects](#zero-to-hero-lab-projects) + - [Continuous Learning Strategy](#continuous-learning-strategy) + - [References](#references) + +## Introduction +- `Fine-tuning` is the process of adapting a pre-trained model to a new, often more specific, task. This process leverages the general knowledge learned by the model to achieve high performance on a specialized dataset with limited training. + +### Key Concepts +- **Transfer Learning**: Utilizing a model trained on a broad task and adjusting it for a related, narrower task. +- **Freezing Layers**: Locking layers in the model to prevent them from updating during training, usually done for layers that capture fundamental features. +- **Learning Rate Scheduling**: Adjusting the learning rate during fine-tuning to stabilize training and optimize performance. +- **Common Misconception**: Fine-tuning a model doesn’t always improve performance if the pre-trained model is not aligned with the new task. + +### Applications +- **Natural Language Processing (NLP)**: Fine-tuning language models on specific domains like legal, medical, or scientific text. +- **Computer Vision**: Adapting pre-trained models for object detection, image classification, or segmentation on custom datasets. +- **Speech Recognition**: Tuning models on specific accents, languages, or vocabulary. +- **Recommender Systems**: Using fine-tuning to improve model recommendations for niche markets or specialized interests. +- **Healthcare Diagnostics**: Tailoring models trained on general medical images to specific diagnostic tasks. + +## Fine-Tuning Workflow +1. **Select a Pre-trained Model**: Choose a model pre-trained on a similar task or a general large dataset. +2. **Prepare Data**: Collect and preprocess domain-specific data to adapt the model to the new task. +3. **Freeze Layers**: Decide which layers to freeze or keep unfrozen depending on similarity to the original task. +4. **Adjust Parameters**: Set learning rate, batch size, and other hyperparameters. +5. **Train on Target Task**: Begin fine-tuning, with or without frozen layers, to update the model. +6. **Evaluate and Optimize**: Assess the model on validation data and make adjustments if needed. + +## Popular Tools and Frameworks + +### General Frameworks +1. **TensorFlow and Keras**: + - *Overview*: Offers high-level APIs for fine-tuning and transfer learning. 
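+  - *Example*: a minimal transfer-learning sketch with the Keras API, freezing a pretrained backbone and training a new classification head (the MobileNetV2 backbone, 10-class head, and `train_ds`/`val_ds` datasets are illustrative assumptions, not from these notes):
+    ```python
+    import tensorflow as tf
+
+    # Load an ImageNet-pretrained backbone without its classification head
+    base = tf.keras.applications.MobileNetV2(
+        weights="imagenet", include_top=False, input_shape=(224, 224, 3)
+    )
+    base.trainable = False  # freeze the pretrained feature extractor
+
+    # Attach a new head sized for the (assumed) 10-class target task
+    model = tf.keras.Sequential([
+        base,
+        tf.keras.layers.GlobalAveragePooling2D(),
+        tf.keras.layers.Dense(10, activation="softmax"),
+    ])
+    model.compile(optimizer="adam",
+                  loss="sparse_categorical_crossentropy",
+                  metrics=["accuracy"])
+    # model.fit(train_ds, validation_data=val_ds, epochs=5)  # dataset-specific
+    ```
+    A common next step is to unfreeze the top few backbone layers and continue training at a much lower learning rate.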
+ - *Pros*: Easy-to-use, supports layer freezing, and compatible with TensorFlow Lite for deployment. + - *Cons*: Higher computational requirements compared to lighter frameworks. +2. **PyTorch**: + - *Overview*: Highly flexible and widely used for fine-tuning tasks. + - *Pros*: Dynamic computation graph, strong community, supports Hugging Face integration. + - *Cons*: Requires more customization, especially for deployment. + +### Specialized Fine-Tuning Tools +1. **Hugging Face Transformers**: + - *Overview*: Comprehensive library for NLP and vision models, with strong fine-tuning support. + - *Pros*: Extensive model repository, easy-to-use APIs, integrates with PyTorch. + - *Cons*: Mostly focused on NLP, though support for vision models is growing. +2. **Transfer Learning Toolkit by NVIDIA (TLT)**: + - *Overview*: Designed for accelerated transfer learning in vision, particularly for NVIDIA hardware. + - *Pros*: Optimized for GPU use, supports domain adaptation, includes pruning. + - *Cons*: Limited to specific NVIDIA ecosystems. +3. **Ultralytics YOLO**: + - *Overview*: Tool for fine-tuning the YOLO object detection model on custom datasets. + - *Pros*: High performance, fast fine-tuning for object detection. + - *Cons*: Primarily for object detection and less adaptable to other tasks. + +### Domain-Specific Frameworks +1. **AutoGluon**: + - *Overview*: Simplifies model tuning for tabular, image, and text data. + - *Pros*: AutoML-based approach, optimized for fast deployment. + - *Cons*: Limited customization, mostly geared towards entry-level users. +2. **FastAI**: + - *Overview*: Built on PyTorch, focuses on high-level APIs for vision and text fine-tuning. + - *Pros*: Beginner-friendly, effective for vision and tabular data. + - *Cons*: Less modular for advanced customizations, narrower model choices. + +## Self-Practice / Hands-On Examples +1. **Fine-tune BERT with Hugging Face**: Use Hugging Face Transformers to fine-tune BERT on a custom text dataset. +2. **Image Classification with Transfer Learning**: Fine-tune a ResNet model in PyTorch on a small image dataset. +3. **Object Detection with Ultralytics YOLO**: Use YOLO for fine-tuning on a custom object detection dataset. +4. **AutoGluon for Tabular Data**: Apply AutoGluon to a tabular dataset and fine-tune the model with minimal setup. +5. **FastAI with Domain-Specific Images**: Fine-tune a model in FastAI on a unique set of images (e.g., medical or satellite). + +## Pitfalls & Challenges +- **Overfitting**: Fine-tuning with limited data can lead to overfitting on specific features of the new dataset. +- **Misalignment with Pre-trained Model**: Using a model that wasn’t trained on related data may hinder rather than help. +- **Catastrophic Forgetting**: The model may lose generalization capabilities if fine-tuning changes weights too drastically. +- **Hyperparameter Tuning**: Optimizing learning rates, regularization, and batch sizes is essential to avoid unstable training. + +## Feedback & Evaluation +- **Self-Explanation**: Describe the steps and choices in fine-tuning, focusing on why specific parameters were set. +- **Validation Metrics**: Compare model performance pre- and post-fine-tuning using accuracy, F1, or precision. +- **Domain-Specific Testing**: Evaluate the fine-tuned model on the new task’s unique requirements. + +## Hello World! 
(Practical Example) +- **Fine-tuning a ResNet Model in PyTorch**: + ```python + import torch + from torchvision import models, transforms + from torch import nn, optim + + # Load pre-trained ResNet model + model = models.resnet18(pretrained=True) + + # Freeze all layers except the last + for param in model.parameters(): + param.requires_grad = False + model.fc = nn.Linear(model.fc.in_features, 10) # Update for 10 classes + + # Prepare data and fine-tune + optimizer = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9) + criterion = nn.CrossEntropyLoss() + # Training loop to be added here as per task specifics. + ``` + +## Advanced Exploration +- **Research Papers on Transfer Learning**: Explore recent studies on advancements in fine-tuning techniques. +- **Curriculum Learning**: A method for structuring data exposure to improve fine-tuning results. +- **Contrastive Fine-Tuning**: Study contrastive learning approaches to enhance model adaptability to new tasks. + +## Zero to Hero Lab Projects +1. **Project 1**: Fine-tune a speech recognition model for a unique accent or dialect. +2. **Project 2**: Create a custom object detector for detecting specific items in an industrial setting using Ultralytics YOLO. +3. **Project 3**: Adapt a natural language model for summarizing legal text using Hugging Face Transformers. + +## Continuous Learning Strategy +- **Next Steps**: Try distilling a fine-tuned model to further optimize it for deployment. +- **Related Topics**: Explore Distillation, Model Compression, and Architecture Search to deepen understanding. + +## References +- *Transfer Learning with Convolutional Neural Networks for Medical Imaging* by Rajpurkar et al. +- *A Comprehensive Survey on Transfer Learning* by Pan and Yang. +- Hugging Face documentation on fine-tuning: [https://huggingface.co/docs](https://huggingface.co/docs) \ No newline at end of file diff --git a/ai/ml-notes/fine-tuning-notes/model-compression-notes.md b/ai/ml-notes/fine-tuning-notes/model-compression-notes.md new file mode 100644 index 00000000..9fb6b887 --- /dev/null +++ b/ai/ml-notes/fine-tuning-notes/model-compression-notes.md @@ -0,0 +1,131 @@ +# Model Compression - Notes + +## Table of Contents +- [Introduction](#introduction) +- [Key Concepts](#key-concepts) +- [Applications](#applications) +- [Model Compression Techniques](#model-compression-techniques) +- [Compression Process & Pipelines](#model-compression-process--pipelines) +- [Key Models and Frameworks](#key-models-and-frameworks) +- [Self-Practice / Hands-On Examples](#self-practice--hands-on-examples) +- [Pitfalls & Challenges](#pitfalls--challenges) +- [Feedback & Evaluation](#feedback--evaluation) +- [Tools, Libraries & Frameworks](#tools-libraries--frameworks) +- [Hello World! (Practical Example)](#hello-world-practical-example) +- [Advanced Exploration](#advanced-exploration) +- [Zero to Hero Lab Projects](#zero-to-hero-lab-projects) +- [Continuous Learning Strategy](#continuous-learning-strategy) +- [References](#references) + +## Introduction +- `Model compression` is a set of techniques used to reduce the size and computational demands of machine learning models without significantly compromising performance, enabling deployment on resource-constrained devices like smartphones, IoT, or embedded systems. + +### Key Concepts +- **Quantization**: Reducing the precision of numbers in the model (e.g., 32-bit floating points to 8-bit integers). 
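+  - *Worked example (illustrative)*: with a symmetric int8 scheme, a tensor whose largest absolute weight is 1.0 gets scale s = 1.0 / 127 ≈ 0.0079, so a weight of 0.42 is stored as round(0.42 / s) = 53 and decoded as 53 × s ≈ 0.417, using 8 bits per value instead of 32.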
+- **Pruning**: Removing less significant weights or neurons that contribute minimally to the model's output. +- **Knowledge Distillation**: Training a smaller model (student) to mimic the outputs of a larger model (teacher). +- **Neural Architecture Search (NAS)**: Optimizing the architecture to balance size and performance. +- **Common Misconception**: Smaller models necessarily lose accuracy—effective compression can retain most of a model’s original accuracy. + +### Applications +- **Mobile Applications**: Compressing models for on-device AI, like image recognition or speech processing. +- **IoT & Edge Computing**: Enabling real-time inference on constrained devices for tasks such as anomaly detection, security, and monitoring. +- **Autonomous Vehicles**: Reducing computational loads in embedded systems for object detection or navigation. +- **Healthcare**: Deploying models for diagnostics on portable medical devices where memory and power are limited. +- **Environmental Monitoring**: Low-power ML applications on drones or sensors for real-time analysis in remote areas. + +## Model Compression Techniques +- **Quantization**: Converts high-precision weights and activations to lower precision, reducing model size and computation. +- **Pruning**: Eliminates redundant parameters, often by removing nodes or filters with minimal effect. +- **Knowledge Distillation**: Trains a small model to mimic the "knowledge" or outputs of a larger model. +- **Weight Sharing**: Limits the diversity of weights in the model by sharing parameters among similar layers. +- **Neural Architecture Search (NAS)**: Optimizes model structure specifically for reduced size or processing requirements. + +### Description +- **Quantization**: + - Example: Transforming weights from 32-bit floats to 8-bit integers can reduce model size by 75%. +- **Pruning**: + - Step-by-step: (1) Identify less important parameters, (2) remove or zero them out, (3) retrain model to recover lost accuracy. +- **Distillation**: + - Teacher-student training: A smaller “student” network learns from the predictions of a larger, pre-trained “teacher” network. + +## Model Compression Process & Pipelines +1. **Select a Model**: Choose an initial model architecture. +2. **Quantize or Prune**: Apply quantization or pruning to reduce model complexity. +3. **Distill Knowledge**: If applicable, train a compact model to emulate a larger one’s behavior. +4. **Optimize**: Fine-tune to regain any lost accuracy. +5. **Validate & Deploy**: Test the model on real-world data to ensure it meets requirements. + +### Example Pipeline + +```mermaid +graph LR; + Start(Original Model) --> Quantize[Quantization] --> Prune[Pruning] --> Distill[Knowledge Distillation] + Distill --> FineTune(Fine-Tuning) --> Validate[Validate on Dataset] --> Deploy[Deploy Model] +``` + +## Key Models and Frameworks +- **MobileNet**: Designed specifically for mobile and edge applications with minimal parameters. +- **EfficientNet**: Uses compound scaling to balance width, depth, and resolution. +- **YOLO-Nano**: Lightweight version of YOLO for real-time object detection on limited hardware. +- **TensorFlow Lite** and **ONNX Runtime**: Frameworks optimized for deploying compressed models on edge devices. +- **TinyML Models**: Specialized models for ultra-low-power applications, such as in microcontrollers. + +## Self-Practice / Hands-On Examples +1. **Quantization Exercise**: Apply quantization to a neural network model in TensorFlow Lite and observe the impact on size and accuracy. +2. 
**Pruning Experiment**: Use PyTorch to prune a model layer-by-layer and monitor changes in performance.
+3. **Knowledge Distillation**: Implement a teacher-student model and compare the student's performance against the original large model.
+4. **NAS with MobileNet**: Use NAS tools to create a smaller variant of MobileNet tailored for a specific edge device.
+5. **Benchmarking**: Compare the inference times of compressed models versus the original on an edge device like Raspberry Pi.
+
+## Pitfalls & Challenges
+- **Loss of Accuracy**: Over-compression can lead to significant drops in model performance.
+- **Inference Latency**: Compressed models may still suffer from high latency depending on the device.
+- **Compatibility Issues**: Different devices and frameworks may not support all compression techniques.
+- **Training Complexity**: Techniques like knowledge distillation require additional training stages.
+
+## Feedback & Evaluation
+- **Feynman Test**: Explain model compression techniques as if teaching someone with no AI experience.
+- **Real-World Simulation**: Deploy on a constrained device and measure performance metrics.
+- **Benchmark Testing**: Use model accuracy, latency, and memory usage to evaluate effectiveness.
+
+## Tools, Libraries & Frameworks
+1. **TensorFlow Lite**: Great for quantization and deployment on mobile/edge devices.
+2. **ONNX Runtime**: Supports model optimization for edge environments.
+3. **PyTorch quantization and pruning utilities**: Built into PyTorch as `torch.quantization` and `torch.nn.utils.prune`.
+4. **Distiller (Intel AI Lab)**: Open-source tool for pruning and quantization in PyTorch.
+5. **Apache TVM**: Compiles models for optimized edge inference across diverse hardware.
+
+## Hello World! (Practical Example)
+- **Quantization Example** in TensorFlow Lite:
+  ```python
+  import tensorflow as tf
+
+  # Convert model to TFLite format with quantization
+  # (Optimize.DEFAULT applies post-training dynamic-range quantization of the weights)
+  converter = tf.lite.TFLiteConverter.from_saved_model("model_path")
+  converter.optimizations = [tf.lite.Optimize.DEFAULT]
+  tflite_model = converter.convert()
+
+  # Save and deploy tflite_model
+  with open("compressed_model.tflite", "wb") as f:
+      f.write(tflite_model)
+  ```
+
+## Advanced Exploration
+- **Model Compression Survey**: Dive into recent research papers on model compression trends.
+- **Distillation Techniques for Vision Models**: Advanced articles on distilling complex vision models.
+- **Quantization-Aware Training**: Learn techniques that incorporate quantization during training to retain accuracy.
+
+## Zero to Hero Lab Projects
+- **Project 1**: Build and compress a model for real-time object detection on Raspberry Pi.
+- **Project 2**: Develop a distillation pipeline for a large language model using PyTorch.
+- **Project 3**: Use pruning and quantization to optimize a healthcare diagnostic model for a portable device.
+
+## Continuous Learning Strategy
+- **Next Steps**: Explore hardware-specific optimizations for models (like NVIDIA TensorRT).
+- **Related Topics**: Dive into Edge AI, TinyML, and Embedded Systems for further specialization.
+
+## References
+- *EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks* by Mingxing Tan and Quoc V. Le
+- *Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding* by Song Han et al.
+- *Distilling the Knowledge in a Neural Network* by Geoffrey Hinton, Oriol Vinyals, and Jeff Dean