Welcome to the workshop!
Welcome to the first HDS-LEE student-organised workshop! The goal of the
workshop is to give an introduction to and overview of different workflow
tools for Data Science and Machine Learning pipelines. The workshop
focuses primarily on the MLflow package.
The materials will be updated after the course to include some additional information.
To work with the materials interactively, you can open this notebook in Google Colab. All you need is a Google account. Apart from the server application, all course materials are prepared for direct use in Google Colab, so no local installations are required. In the README and during the course, we will provide an additional how-to for local or remote installations.
To allow interaction during the workshop and to provide a realistic server setup for labs or industrial use cases, we will use cloud-hosted storage and a cloud-hosted MLflow server. Both are protected. Every participant will receive their own credentials for the MLflow server by email beforehand. The credentials keep runs from different participants from colliding, so please use only your own. You should have received:
- MLFLOW_TRACKING_USERNAME
- MLFLOW_TRACKING_PASSWORD
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
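
One way to make these four values available in a Colab session is to export them as environment variables before MLflow is used. The following is a minimal sketch, assuming that the MLflow client reads `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` from the environment for HTTP basic authentication and that the S3 client reads the two `AWS_*` variables. The helper name `export_credentials` and the placeholder values are hypothetical and not part of the course materials.

```python
import os

# The four variable names are taken from the list above.
REQUIRED_KEYS = (
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def export_credentials(credentials, environ=os.environ):
    """Copy the workshop credentials into `environ` (hypothetical helper).

    Raises ValueError if any of the four values is missing or empty,
    so a typo is caught before the first request to the server.
    """
    missing = [key for key in REQUIRED_KEYS if not credentials.get(key)]
    if missing:
        raise ValueError("Missing credentials: " + ", ".join(missing))
    for key in REQUIRED_KEYS:
        environ[key] = credentials[key]
    return list(REQUIRED_KEYS)


# Demonstration with placeholder values, written into a plain dict
# instead of the real environment. Replace the placeholders with the
# values from your email and drop the `environ` argument in practice.
demo_env = {}
exported = export_credentials(
    {
        "MLFLOW_TRACKING_USERNAME": "your-username",
        "MLFLOW_TRACKING_PASSWORD": "your-password",
        "AWS_ACCESS_KEY_ID": "your-access-key-id",
        "AWS_SECRET_ACCESS_KEY": "your-secret-access-key",
    },
    environ=demo_env,
)
print("Exported:", ", ".join(exported))
```

To avoid storing passwords in a saved notebook, the values could be read with `getpass.getpass()` from the standard library instead of being typed into a cell.
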
- Table of contents
- Preparation
- Opening the notebooks in Colab
- Purpose of this repository
- Team
- Presentation and Hands-on Training
- Organisation and Links
- Technical Support and Questions
June 16th, 2021 from 10:00 to 15:00 CEST.
| Time | Session |
|---|---|
| 10:00 - 10:15 | Welcome and Introduction |
| 10:15 - 11:00 | Presentation "Workflow Tools: MLflow, DVC and Apache Airflow" |
| 11:00 - 12:00 | Hands-on Session: Introduction and Code Tour |
| 12:00 - 13:00 | Lunch |
| 13:00 - 14:30 | Hands-on Session: Group work |
| 14:30 - 15:00 | Wrap-up and Feedback |
[PDF Schedule](<./Schedule HDS-LEE Workshop 2021 - Workflow Tools.pdf>)
Online (Zoom). The link will follow by email; if you do not receive it, contact Ramona Kloß.
To ensure the quality and experience of the planned group work segments, please fill out the form located at https://docs.google.com/spreadsheets/d/13dDIkX1eO34eneFCOz5f2r3kiD3d-Jz9gG-AAFd8PxE/edit?usp=sharing.
Please ensure that you have a Google account ready to work with Colab, so that you can follow along in the hands-on session.
Link to MLflow server frontend.
An account for the MLflow server will be created for each participant and the details sent by email. Please check that the login works before the course. (Should something not work, please contact [email protected]).
Go to https://colab.research.google.com, and click on 'File' and then 'Open Notebook'.
In the address field, enter the link to the (first) notebook hosted on GitHub: (https://github.com/ChristianGerloff/hida-workshop-mlflow/blob/tracking/notebooks/HIDA_Workshop_MLOps_Tracking_Session_1.ipynb).
After entering the whole URL to the notebook (including the `.ipynb` file extension), click on the magnifying glass symbol or press 'Enter' to open the notebook.
To enable saving your progress and changes, save a copy to your Google Drive (or your own GitHub account) using the 'File' dialogue menu.
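
As an alternative to the 'Open Notebook' dialogue, Colab can also open a notebook hosted on GitHub directly when the `https://github.com/` prefix of its URL is replaced with `https://colab.research.google.com/github/`. The sketch below illustrates this rewrite for the first notebook; the function name `colab_url` is hypothetical and only serves as an example.

```python
GITHUB_PREFIX = "https://github.com/"
COLAB_PREFIX = "https://colab.research.google.com/github/"


def colab_url(github_url: str) -> str:
    """Turn a GitHub notebook URL into a direct Colab link (hypothetical helper)."""
    if not github_url.startswith(GITHUB_PREFIX):
        raise ValueError("Expected a URL starting with " + GITHUB_PREFIX)
    if not github_url.endswith(".ipynb"):
        raise ValueError("Expected a link to a .ipynb notebook file")
    # Keep everything after the GitHub prefix: user, repository, branch, path.
    return COLAB_PREFIX + github_url[len(GITHUB_PREFIX):]


notebook = (
    "https://github.com/ChristianGerloff/hida-workshop-mlflow/blob/tracking/"
    "notebooks/HIDA_Workshop_MLOps_Tracking_Session_1.ipynb"
)
print(colab_url(notebook))
```

The printed link can be pasted into the browser's address bar. Saving a copy to your own Drive or GitHub account is still required to keep your changes.
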
This repository houses the Jupyter notebooks with practical examples of implementing an MLflow workflow, and serves as a reference after the workshop.
- Christian Gerloff
- Johannes Kruse
- Dr. Ramona Kloß
- Emile de Bruyn