DVC Pipeline for Pokémon type classifier

This DVC pipeline trains a CNN on images of Pokémon to predict whether a Pokémon is of a predetermined type (default: water).

Note: because the dataset is small, the model is evaluated on the same data used for training and testing, so the reported metrics are likely optimistic. Take the results of the model with a grain of salt.
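As a rough illustration of the binary target described above, the sketch below derives a 0/1 label per Pokémon from a stats CSV. The column names `name`, `type1` and `type2` and the helper `label_by_type` are assumptions made for this example; they are not taken from the pipeline's code.

```python
import csv
import io


def label_by_type(csv_text, target_type="water"):
    """Map each Pokémon name to 1 if either of its types matches target_type, else 0.

    Assumes columns named `name`, `type1` and `type2` (hypothetical for this sketch).
    """
    target = target_type.strip().lower()
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        types = {
            (row.get("type1") or "").strip().lower(),
            (row.get("type2") or "").strip().lower(),
        }
        types.discard("")  # a missing secondary type must never match
        labels[row["name"]] = int(target in types)
    return labels


# Tiny inline sample standing in for the real stats file.
sample = "name,type1,type2\nSquirtle,water,\nCharmander,fire,\nGyarados,water,flying\n"
print(label_by_type(sample))
```

Checking both type columns means a dual-type Pokémon counts as a positive example when either type matches, which is one possible reading of "is of a predetermined type".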

From Notebook to pipeline

This project details the transformation from Notebook to DVC pipeline. In the different branches, you can find three stages in this process:

Requirements

How to run

  1. Create a new virtual environment with virtualenv -p python3 .venv

  2. Activate the virtual environment with source .venv/bin/activate

  3. Install the dependencies with pip install -r requirements.txt

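  4. Download the datasets from Kaggle into the data/external/ directory. Kaggle downloads generally require a signed-in account; if wget returns an HTML page instead of the data, download the datasets from these pages in a browser and place them at the same paths.

    $ wget https://www.kaggle.com/datasets/robdewit/pokemon-images -O data/external/pokemon-gen-1-8
    $ wget https://www.kaggle.com/datasets/rounakbanik/pokemon -O data/external/stats/pokemon-gen-1-8.csv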
  5. Run the pipeline with dvc repro or run an experiment with dvc exp run
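Before running the pipeline, it can help to confirm that the downloaded data landed where the steps above put it. The following is a minimal sketch of such a check; the helper `missing_inputs` is hypothetical and not part of this repository, and it only tests that the two paths exist, not that their contents are valid.

```python
from pathlib import Path

# Paths taken from the download step above; whether each one is a file or a
# directory is not checked, only that it exists.
EXPECTED_INPUTS = (
    "data/external/pokemon-gen-1-8",
    "data/external/stats/pokemon-gen-1-8.csv",
)


def missing_inputs(project_root="."):
    """Return the expected input paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED_INPUTS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing inputs:", ", ".join(missing))
    else:
        print("All inputs found; the pipeline can be run with `dvc repro`.")
```

Run it from the repository root so that the relative paths resolve against the project directory.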

Notes on hardware

The requirements specify tensorflow-macos and tensorflow-metal, which are the appropriate packages for a Mac with Apple silicon (M1 or later). On any other system, replace both with tensorflow in requirements.txt.
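To make the rule above concrete, here is a small sketch that picks the package names for the current machine. The function `tensorflow_packages` is a hypothetical helper, and it assumes that a native Python build on Apple silicon reports `Darwin` and `arm64` through the standard `platform` module (a Python running under Rosetta would report `x86_64` instead).

```python
import platform


def tensorflow_packages(system=None, machine=None):
    """Pick TensorFlow package names for a platform.

    Returns the Apple-silicon packages named in the README on macOS/arm64,
    and plain `tensorflow` everywhere else.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Darwin" and machine == "arm64":
        return ["tensorflow-macos", "tensorflow-metal"]
    return ["tensorflow"]


# With no arguments, the function inspects the machine it runs on.
print(tensorflow_packages())
```

The optional arguments exist only so the choice can be inspected for a platform other than the one the script runs on.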