# 🧠 Auto-ML: Multi-Modal Distributed ML Framework 🚀

Welcome to the Multi-Modal Distributed Machine Learning Framework, a next-gen solution for scaling tabular and image models using tools like PyTorch Lightning, Ray Tune, and distributed computing. This repository is designed for developers and researchers alike, offering powerful scalability and streamlined automation for your machine learning pipelines. 💡

## 🌐 Overview

This project focuses on building, training, and deploying a multi-modal ML model that integrates tabular and image inputs. It is built for scalability using distributed training techniques, enabling rapid iteration even on large datasets. It’s perfect for anyone looking to explore hyperparameter tuning, model compression, and scalable deployment.

### Highlights

- **Multi-Modal Data Processing:** combines image and tabular data for more insightful predictions (see the model sketch after this list).
- **Distributed Scalability:** powered by PyTorch Lightning’s DDP (Distributed Data Parallel) to scale training across multiple devices.
- **Automated Hyperparameter Tuning:** efficient tuning via Ray Tune for better model performance.
- **Optimized for Deployment:** supports model compression for low-resource environments.
- **Future-Ready Architecture:** easily extendable to real-world, large-scale applications.
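
To make the multi-modal idea concrete, here is a minimal sketch of a late-fusion model that pairs a small CNN encoder for images with an MLP for tabular features. The class name `FusionModel` and all layer sizes are illustrative assumptions, not the repository’s actual architecture.

```python
# Minimal late-fusion sketch (illustrative; not the repo's actual model).
import torch
import torch.nn as nn

class FusionModel(nn.Module):
    def __init__(self, num_tabular_features: int, num_classes: int):
        super().__init__()
        # Image branch: tiny CNN encoder that produces a 64-dim embedding.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 64),
        )
        # Tabular branch: small MLP that produces a 64-dim embedding.
        self.tabular_encoder = nn.Sequential(
            nn.Linear(num_tabular_features, 64), nn.ReLU(),
        )
        # Fusion head: concatenate both embeddings, then classify.
        self.head = nn.Linear(64 + 64, num_classes)

    def forward(self, image: torch.Tensor, tabular: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.image_encoder(image), self.tabular_encoder(tabular)], dim=1)
        return self.head(fused)
```

Late fusion keeps each branch independent, so either encoder can be swapped out (e.g., for a pretrained ResNet) without touching the other.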

## 🔧 Installation and Setup

### Step 1: Clone the Repository

```bash
git clone https://github.com/username/multi-modal-ml.git
cd multi-modal-ml
```

### Step 2: Install Dependencies

Ensure you have the necessary packages installed:

```bash
pip install -r requirements.txt
```

Dependencies include `torch`, `torchvision`, `pytorch-lightning`, `ray`, `scikit-learn`, `pandas`, and `pillow`.

### Step 3: Prepare the Data

Organize your datasets:

- **Tabular Data:** in CSV format and ready for preprocessing.
- **Images:** placed in a well-structured directory (see the project structure below).

## 🏗️ How to Use

### Dataset Preparation

- Ensure your tabular data is preprocessed and structured correctly.
- Modify the `MultiModalDataset` class in the scripts if necessary to fit your data format; a sketch of its likely shape follows this list.
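
For orientation, a `MultiModalDataset` along these lines would return one `(image, tabular, label)` triple per CSV row. The column names (`image_path`, `label`) and the transforms are assumptions for illustration; adapt them to your schema.

```python
# Illustrative sketch of a dataset pairing CSV rows with image files.
# Column names ("image_path", "label") are assumed; adapt to your schema.
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class MultiModalDataset(Dataset):
    def __init__(self, csv_path: str, image_dir: str):
        self.df = pd.read_csv(csv_path)
        self.image_dir = image_dir
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
        ])
        # Treat every column except the image path and label as a tabular feature.
        self.feature_cols = [c for c in self.df.columns if c not in ("image_path", "label")]

    def __len__(self) -> int:
        return len(self.df)

    def __getitem__(self, idx: int):
        row = self.df.iloc[idx]
        image = Image.open(f"{self.image_dir}/{row['image_path']}").convert("RGB")
        tabular = torch.tensor(row[self.feature_cols].to_numpy(dtype="float32"))
        label = torch.tensor(int(row["label"]))
        return self.transform(image), tabular, label
```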

### Training

Run single-device training:

```bash
python scripts/train_single.py
```
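
Under the hood, single-device training with PyTorch Lightning typically reduces to a few lines like the following. The `LitFusion` wrapper is a hypothetical stand-in for whatever `train_single.py` actually defines, reusing the sketches above.

```python
# Hypothetical single-device Lightning training loop (not the script's exact code).
import pytorch_lightning as pl
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

class LitFusion(pl.LightningModule):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def training_step(self, batch, batch_idx):
        image, tabular, label = batch
        loss = F.cross_entropy(self.model(image, tabular), label)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Usage, with FusionModel and MultiModalDataset as sketched earlier:
# dataset = MultiModalDataset("data/tabular_data.csv", "data/images")
# trainer = pl.Trainer(max_epochs=10, accelerator="auto", devices=1)
# trainer.fit(LitFusion(FusionModel(num_tabular_features=8, num_classes=2)),
#             DataLoader(dataset, batch_size=32, shuffle=True))
```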

Or for distributed training:

```bash
python -m torch.distributed.launch --nproc_per_node=<number_of_devices> scripts/train_distributed.py
```
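
Note that `torch.distributed.launch` is deprecated in recent PyTorch releases in favor of `torchrun`. If `train_distributed.py` uses Lightning’s built-in DDP strategy, the same effect can usually be achieved from a plain `python` invocation by configuring the `Trainer`, as in this sketch:

```python
# Sketch: Lightning's "ddp" strategy spawns one process per device by itself,
# so the script can be launched with a plain `python` call.
import pytorch_lightning as pl

trainer = pl.Trainer(
    accelerator="gpu",
    devices=4,          # number of GPUs on this node
    strategy="ddp",     # Distributed Data Parallel
    max_epochs=10,
)
# trainer.fit(lit_module, train_dataloader)  # same fit() call as single-device
```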

### Hyperparameter Tuning

To perform distributed hyperparameter tuning with Ray Tune:

```bash
python scripts/hyperparam_tune.py
```
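
For orientation, the core of a Ray Tune run usually looks like the sketch below, here using Tune’s classic functional API. The search space and the body of `train_fn` are illustrative assumptions, not the exact contents of `hyperparam_tune.py`.

```python
# Illustrative Ray Tune search over learning rate and batch size.
from ray import tune

def train_fn(config):
    # The real script would build the model, train it with config["lr"] and
    # config["batch_size"], and report a validation metric back to Tune.
    val_loss = config["lr"] * 100  # placeholder stand-in for a real metric
    tune.report(val_loss=val_loss)

analysis = tune.run(
    train_fn,
    config={
        "lr": tune.loguniform(1e-4, 1e-1),
        "batch_size": tune.choice([16, 32, 64]),
    },
    num_samples=10,
    metric="val_loss",
    mode="min",
)
print(analysis.best_config)
```

Newer Ray releases favor the `tune.Tuner` API, but the shape of the search is the same: a trainable function, a search space, and a metric to optimize.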

### Model Compression & Deployment

Run the following script to compress and save your trained model for efficient deployment:

```bash
python scripts/compress_and_save.py
```
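
One common compression route, and a plausible guess at what `compress_and_save.py` does, is post-training dynamic quantization of the linear layers. The sketch below shows that technique; it is not necessarily the script’s exact method.

```python
# Sketch: post-training dynamic quantization of a trained model's Linear layers.
import torch

model = FusionModel(num_tabular_features=8, num_classes=2)  # load trained weights in practice
model.eval()

# Replace float32 Linear weights with int8 equivalents, quantizing activations
# dynamically at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized.state_dict(), "model_quantized.pt")
```

Dynamic quantization shrinks the `Linear`-heavy parts of the model roughly 4x with no retraining; convolutions would need static quantization or pruning for comparable savings.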

## 📂 Project Structure

```text
multi-modal-ml/
│
├── data/                         # Data storage
│   ├── images/                   # Directory for images
│   └── tabular_data.csv          # CSV for tabular data
│
├── models/                       # Model architecture and custom layers
│
├── scripts/                      # Scripts for training, tuning, and deployment
│   ├── train_single.py           # Single-device training
│   ├── train_distributed.py      # Multi-device distributed training
│   ├── hyperparam_tune.py        # Hyperparameter tuning with Ray Tune
│   └── compress_and_save.py      # Model compression for deployment
│
└── README.md                     # Documentation
```

## 🔮 Future Improvements

  • Text & Audio Modalities: Extend support to other data types, including text and audio.
  • Advanced Compression: Explore methods like pruning to further minimize model size.
  • Cloud Deployment: Prepare the system for cloud-based environments (AWS, Azure, GCP).

## 🙌 Acknowledgments

Big thanks to the open-source community and the contributors behind tools like PyTorch, Ray, and Lightning, which make large-scale machine learning accessible. 🌍


## 💼 Contributing

Contributions are welcome! Submit a pull request, raise an issue, or suggest new features.

