Local reproducibility

We have a DVC Pipeline defined in the dvc.yaml file.

The pipeline is composed of stages that run Python scripts located in src:

flowchart TD
        node2[eval]
        node3[get-data]
        node4[split-data]
        node5[train]
        node3-->node4
        node4-->node2
        node4-->node5
        node5-->node2
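The run order implied by the graph can be derived with a topological sort. The sketch below hard-codes the edges from the flowchart and uses Python's standard graphlib module (Python 3.9+). It only illustrates the ordering and is not how DVC is implemented internally.

```python
from graphlib import TopologicalSorter

# Map each stage to the stages it depends on, copied from the flowchart edges.
STAGE_DEPS = {
    "get-data": set(),
    "split-data": {"get-data"},
    "train": {"split-data"},
    "eval": {"split-data", "train"},
}


def stage_order(deps):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(" -> ".join(stage_order(STAGE_DEPS)))
```

For this graph there is only one valid order: get-data, split-data, train, eval.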

We use DVC Params, defined in params.yaml, to configure the pipeline.

The pipeline enables local reproducibility and can be run with dvc repro:

git clone git@github.com:iterative/workshop-uncool-mlops.git
cd workshop-uncool-mlops
pip install -r requirements.txt
dvc repro
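dvc repro skips stages whose dependencies have not changed since the last run, using the hashes recorded in dvc.lock. The sketch below shows the idea in a simplified form. The function names and the recorded mapping are hypothetical, and DVC's real logic covers more cases (parameters, outputs, directories).

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def stage_is_up_to_date(recorded):
    """Check dependencies against previously recorded hashes.

    `recorded` is a hypothetical {path: md5} mapping, standing in for the
    dependency hashes that dvc.lock stores for a stage.
    """
    for path, expected in recorded.items():
        if not Path(path).exists() or file_md5(path) != expected:
            return False  # a dependency changed or is missing: rerun the stage
    return True
```

This is why a second dvc repro right after the first one finishes almost immediately: nothing changed, so no stage needs to run.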

The pipeline generates DVC Metrics and DVC Plots to evaluate model performance, which can be found in outs.
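DVC metrics files are plain JSON, YAML, or TOML files containing scalar values. As a rough sketch, an eval stage could write its scores as shown below. The file name outs/metrics.json, the helper write_metrics, and the metric names are assumptions for illustration and are not taken from the repository.

```python
import json
from pathlib import Path


def write_metrics(metrics, path="outs/metrics.json"):
    """Write a flat dict of scalar metrics as JSON and return the file path."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(metrics, indent=2) + "\n")
    return out


if __name__ == "__main__":
    # Hypothetical scores; a real eval stage would compute these from the model.
    write_metrics({"accuracy": 0.9, "f1": 0.85})
```

Once such a file is declared as a metric in dvc.yaml, dvc metrics show prints its values and dvc metrics diff compares them across commits.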

You can connect the repo to https://studio.iterative.ai/ to get a better visualization of the metrics, parameters and plots associated with each commit:

https://studio.iterative.ai/user/daavoo/views/workshop-uncool-mlops-5fgmd70rkt

Because the metrics and plots files are small enough to be tracked by git, after we run the pipeline we can share the results with others:

git add dvc.lock outs
git commit -m "Run pipeline"
git push

However, the rest of the outputs are gitignored because they are too big to be tracked by git.

Bigger Boat