MNIST-operation-pipeline

Environment

Python version 3.11.5 (pyenv, MNIST-operation-pipeline)

Quick Start

Warning

The default amount of memory available for Docker on macOS is often not enough to get Airflow up and running. If enough memory is not allocated, it might lead to the webserver continuously restarting. You should allocate at least 4GB memory for the Docker Engine (ideally 8GB).

For documentation on setting up and running Docker Compose, see Airflow Docs.

Run Docker

# Create the .env file from the template, then fill in the variables
cp .env.example .env

# Initialize the Docker settings as described in the Airflow documentation above, then build and start
docker compose build
docker compose up
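
Before building, it can help to confirm that every variable listed in .env.example has a value in .env. The following is a minimal sketch and not part of the repository: the helper names `parse_env` and `missing_keys` are hypothetical, and it assumes both files use plain KEY=VALUE lines.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY="x" and KEY=x are treated alike.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(example_text: str, env_text: str) -> list[str]:
    """Return keys from the example file that are absent or empty in .env."""
    example = parse_env(example_text)
    actual = parse_env(env_text)
    return [key for key in example if not actual.get(key)]


if __name__ == "__main__":
    example_path, env_path = Path(".env.example"), Path(".env")
    if example_path.exists() and env_path.exists():
        for key in missing_keys(example_path.read_text(), env_path.read_text()):
            print(f"{key} is not set in .env")
```

Run it from the repository root after copying .env.example to .env; it prints nothing when every variable has a value.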

First steps

1. Open the Airflow web UI (id: airflow, pw: airflow). Docker can take a while to start, so the UI may not be reachable right away.
2. Run the mnist-gpu task once in Airflow.
3. Once the task has completed, go to the FE site.
4. Download an MNIST image from the internet and run a prediction on it.

That's all!

Continuous Training (CT)

When you run a prediction and the confidence score of the predicted value is lower than 50%, the input image is uploaded to your s3_bucket under the /images directory.
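
The threshold rule can be sketched as follows. This is an illustration under stated assumptions, not the backend's actual code: the function names `needs_review` and `image_key` are hypothetical, the confidence score is assumed to be the highest class probability, and the object key layout (`images/<filename>`) is inferred from the directory named above.

```python
def needs_review(probabilities: list[float], threshold: float = 0.5) -> bool:
    """Return True when the top class probability is strictly below the threshold."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    return max(probabilities) < threshold


def image_key(filename: str) -> str:
    """Build the object key for an image stored under the images/ prefix (assumed layout)."""
    return f"images/{filename}"


# Example: a prediction spread evenly over ten digits is low-confidence.
uniform = [0.1] * 10
if needs_review(uniform):
    # An upload would happen here, for instance with boto3's
    # s3_client.upload_file(local_path, bucket_name, key).
    key = image_key("20231204014102_sample_image.webp")
```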

Then upload mnlist_label.json to the root directory of the s3_bucket, using the format below.

[
    {"filename": "20231204014102_sample_image.webp", "label": 2},
    {"filename": "20231204014103_sample_image.webp", "label": 7},
    {"filename": "20231204014104_sample_image.webp", "label": 2}
]
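
One way to produce a file in this format is sketched below. The helper `build_labels` is hypothetical and the 0-9 range check is an assumption based on MNIST digit labels; the repository may validate labels differently or not at all.

```python
import json


def build_labels(pairs: list[tuple[str, int]]) -> str:
    """Serialize (filename, label) pairs into the JSON list format shown above."""
    entries = []
    for filename, label in pairs:
        # MNIST labels are single digits, so reject anything else early.
        if not isinstance(label, int) or not 0 <= label <= 9:
            raise ValueError(f"label for {filename} must be a digit 0-9, got {label!r}")
        entries.append({"filename": filename, "label": label})
    return json.dumps(entries, indent=4)


labels_json = build_labels([
    ("20231204014102_sample_image.webp", 2),
    ("20231204014103_sample_image.webp", 7),
])
```

The resulting string can be written to a local file and uploaded to the bucket root with any S3 client.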

Training runs on the images in the s3_bucket and the labels you provided. It repeats every 30 minutes, or you can trigger it manually from the Airflow UI.

Architecture

Links

MLflow UI

localhost:5001

Airflow web UI

localhost:8080

FE

localhost:4321

BE

localhost:8000

Prometheus

localhost:9090

Grafana

localhost:5002
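
To see which of these services are already accepting connections, a small TCP check such as the one below may be useful. It is a sketch, not part of the repository: the `SERVICES` mapping simply restates the ports listed above, and `is_port_open` and `report` are hypothetical helper names. An open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Ports taken from the list above.
SERVICES = {
    "MLflow UI": 5001,
    "Airflow web UI": 8080,
    "FE": 4321,
    "BE": 8000,
    "Prometheus": 9090,
    "Grafana": 5002,
}


def is_port_open(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(services: dict[str, int]) -> dict[str, bool]:
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(port) for name, port in services.items()}


if __name__ == "__main__":
    for name, is_up in report(SERVICES).items():
        print(f"{name}: {'up' if is_up else 'not reachable'}")
```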

The points you should know

When you first run the project, there is no registered model, so the backend retries for up to 1200 seconds until one is created. Once Airflow is running, connect to the webserver (localhost:8080) and start the pipeline if it has not already started.
Once the pipeline has trained the ML model, the application should run normally. You can check the automated MNIST application through the FE (localhost:4321).
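
The retry behaviour described above follows a common poll-until-ready pattern, sketched here. This is not the backend's actual implementation: `wait_for_model` and the `load_model` callable are hypothetical, the 5-second interval is an assumed value, and only the 1200-second limit comes from this README.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_model(
    load_model: Callable[[], Optional[T]],
    timeout: float = 1200.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call load_model until it returns a non-None value or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        model = load_model()
        if model is not None:
            return model
        if clock() >= deadline:
            raise TimeoutError(f"no model available after {timeout} seconds")
        sleep(interval)


# Example with a stand-in loader that succeeds on the third attempt.
attempts = []


def fake_loader() -> Optional[str]:
    attempts.append(1)
    return "model" if len(attempts) >= 3 else None


loaded = wait_for_model(fake_loader, timeout=10, interval=0, sleep=lambda _: None)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in normal use the defaults apply.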

MNIST datasets

https://www.kaggle.com/datasets/scolianni/mnistasjpg/
