Jupyter Notebook support through the GUI #63

mohammedri · 2020-03-13T15:34:17Z

As a user, I should be able to go to the GUI, click "create a notebook", and specify the amount of GPU and RAM the notebook should use.

If those resources are available, Atlas will start a Jupyter notebook server (as a job?).
The server keeps running until it is stopped.

Once the notebook is stopped, there should be a way to resume it through the GUI. Ideally, all the Atlas SDK functions should be usable from within the Jupyter notebook.

This requires significant UX research; the scope of this task is to understand the user flow and create mockups.

mohammedri · 2020-03-13T15:34:31Z

From @ekhl
Spike:

Build the following docker image:

FROM jupyter/tensorflow-notebook

USER root

RUN conda install --yes \
    promise \
    redis-py \
    slackclient && \
    conda clean --all -f -y && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

COPY run.sh /usr/local/run.sh

ENTRYPOINT ["tini", "--", "/usr/local/run.sh"]

and have run.sh as follows:

#!/bin/bash

sed -i '10i\ \ \ \ return' /job/foundations/__init__.py

python -c "import foundations; foundations.set_tag(\"Jupyter server\")"

jupyter notebook --allow-root "$@"

Build with docker build -t jupyter-spike . (make sure run.sh is in the current directory, which is the build context).

Have atlas-server running.
Create a new project folder with the following job.config.yaml:

log_level: INFO
worker:
  image: jupyter-spike
  ports:
    8888: 8888

Modify the local (submitter-side) foundations_local_docker_scheduler_plugin.job_deployment, at approximately line 313, so that ports is accepted as an override configuration, i.e. change:

        for override_key in ['command', 'image', 'working_dir', 'entrypoint']:

to

        for override_key in ['command', 'image', 'working_dir', 'entrypoint', 'ports']:

(Optional) Create a notebook in the project directory.
Start the Jupyter notebook server job via
foundations submit scheduler .
Follow the logs and open the notebook in your browser (the logs should contain a link with a token).
Import foundations as usual, then try logging metrics, etc.

Note that the job has a "Jupyter server" tag by default.
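To illustrate why adding 'ports' to the override list matters, here is a minimal sketch of how such a loop could carry keys from the worker section of job.config.yaml into the container configuration. This is not the plugin's actual code: the function name apply_worker_overrides, the OVERRIDE_KEYS constant, and the default values are assumptions made for illustration only.

```python
# Hypothetical illustration of an override loop; NOT the plugin's actual code.
OVERRIDE_KEYS = ['command', 'image', 'working_dir', 'entrypoint', 'ports']


def apply_worker_overrides(container_config, worker_config):
    """Return a copy of container_config with the allowed worker keys applied."""
    merged = dict(container_config)
    for override_key in OVERRIDE_KEYS:
        # Only keys on the allow-list are copied; anything else is ignored.
        if override_key in worker_config:
            merged[override_key] = worker_config[override_key]
    return merged


# Made-up defaults, plus the worker section from the job.config.yaml above
# (with one extra key that is not on the allow-list).
defaults = {'image': 'default-worker', 'command': ['python', 'main.py']}
worker = {'image': 'jupyter-spike', 'ports': {8888: 8888}, 'log_level': 'INFO'}

merged = apply_worker_overrides(defaults, worker)
print(merged)
```

Without 'ports' in the list, the port mapping in the config would be silently dropped in a loop like this, and the notebook server would not be reachable from the browser.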

mohammedri · 2020-03-13T15:34:56Z

From @ekhl

Spike on introducing two SDK functions for users to control job start and end

Create the following as a module (say f9s.py):

import os

def start_new(job_id=None, stop_if_running=True):
    import foundations.local_run as lr
    from foundations_contrib.global_state import current_foundations_context
    
    pipeline_context = current_foundations_context().pipeline_context()
    
    try:
        cur_job_id = pipeline_context.file_name
    except ValueError:
        cur_job_id = None
    
    if cur_job_id is not None:
        # job has already started
        if stop_if_running:
            stop()
        else:
            raise Exception("Job is already running")
    
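    # Temporarily override the environment so that a fresh job is set up,
    # remembering the previous values so they can be restored afterwards.
    env_cmd = os.environ.get('FOUNDATIONS_COMMAND_LINE', None)
    os.environ['FOUNDATIONS_COMMAND_LINE'] = "False"

    env_job_id = os.environ.get('FOUNDATIONS_JOB_ID', None)
    # pop() rather than del: the variable may be unset, and del would raise KeyError
    os.environ.pop('FOUNDATIONS_JOB_ID', None)
    if job_id is not None:
        os.environ['FOUNDATIONS_JOB_ID'] = job_id

    lr.set_up_default_environment_if_present()

    # Restore the previous environment
    if env_cmd is not None:
        os.environ['FOUNDATIONS_COMMAND_LINE'] = env_cmd

    if env_job_id is not None:
        os.environ['FOUNDATIONS_JOB_ID'] = env_job_id
    else:
        os.environ.pop('FOUNDATIONS_JOB_ID', None)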

def stop():
    import foundations.local_run as lr
    from foundations_contrib.global_state import current_foundations_context
    
    pipeline_context = current_foundations_context().pipeline_context()

    try:
        lr._at_exit_callback()
    except ValueError:
        pass
    
    try:
        pipeline_context.file_name = None
    except ValueError:
        pass

def current_job_id():
    from foundations_contrib.global_state import current_foundations_context
    
    pipeline_context = current_foundations_context().pipeline_context()
    
    try:
        return pipeline_context.file_name
    except ValueError:
        return None

Then import in notebook:

from f9s import start_new, stop
import foundations

for i in range(10):
    start_new()
    foundations.log_metric('Job', i)
    foundations.save_artifact('f9s.py')
stop()
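The save, override, and restore handling of FOUNDATIONS_COMMAND_LINE and FOUNDATIONS_JOB_ID in start_new could be factored into a small helper. The following is only a sketch using the standard library; the name temporary_environ is hypothetical and not part of the Atlas SDK. Unlike the spike, it also removes a variable again on exit if it was unset before.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_environ(**overrides):
    """Temporarily set environment variables; a value of None unsets the variable."""
    # Remember the previous values (None means "was not set").
    saved = {name: os.environ.get(name) for name in overrides}
    try:
        for name, value in overrides.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
        yield
    finally:
        # Restore the previous state, even if the body raised.
        for name, old_value in saved.items():
            if old_value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old_value


# Demo: pretend we are inside a running job with no command-line flag set.
os.environ['FOUNDATIONS_JOB_ID'] = 'outer-job'
os.environ.pop('FOUNDATIONS_COMMAND_LINE', None)

with temporary_environ(FOUNDATIONS_COMMAND_LINE='False', FOUNDATIONS_JOB_ID=None):
    inside = (os.environ.get('FOUNDATIONS_COMMAND_LINE'),
              os.environ.get('FOUNDATIONS_JOB_ID'))

after = (os.environ.get('FOUNDATIONS_COMMAND_LINE'),
         os.environ.get('FOUNDATIONS_JOB_ID'))
print(inside, after)
```

With a helper like this, the call to lr.set_up_default_environment_if_present() would sit inside the with block, and the restore logic would run even if that call raised.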

mohammedri · 2020-04-28T20:58:01Z

@shazraz since you have the most context on the asks for this - do you mind adding your thoughts on what the user flow should be like?

shazraz · 2020-04-29T19:27:19Z

I think providing SDK functions to start/stop a job (as @ekhl investigated above) or using a context manager would be good in a notebook. This effort is about providing experiment tracking features via a notebook interface as opposed to the title of the ticket (notebook support through GUI).
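As a rough sketch of the context-manager idea, the start and stop functions can be wrapped so that a job is always closed, even if the cell raises. The name tracked_job is hypothetical and not part of the Atlas SDK. The start and stop callables are passed in so that the example runs without Atlas installed; in a notebook they would be start_new and stop from the f9s module above.

```python
from contextlib import contextmanager


@contextmanager
def tracked_job(start, stop, job_id=None):
    """Run the body of the with block as one tracked job."""
    start(job_id=job_id)
    try:
        yield
    finally:
        # Always end the job, even if the body raised.
        stop()


# Stand-ins for f9s.start_new and f9s.stop so the sketch runs on its own.
events = []


def fake_start(job_id=None):
    events.append(('start', job_id))


def fake_stop():
    events.append(('stop', None))


for i in range(2):
    with tracked_job(fake_start, fake_stop):
        events.append(('work', i))

print(events)
```

In a notebook this would read as: with tracked_job(start_new, stop): foundations.log_metric('Job', i), which makes the scope of each job explicit in the cell.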

Here's my digression:
I think what's needed is a larger evaluation of how we want to evolve execution mode. For example, if we provide these start/stop SDK functions, should we still create jobs when running scripts from the command line?

ekhl · 2020-04-29T21:41:07Z

@shazraz @mohammedri bringing this conversation about tracking functionality to #131

mohammedri added the feature-request, ux-research, and spike labels Mar 13, 2020

ekhl mentioned this issue Apr 29, 2020

Providing user control over scope of experiment tracking functionality #131

Open

mohammedri added the needs design label Apr 30, 2020
