move some workflow_tests to datasci

dafeliton committed Jul 23, 2024
1 parent b431dbe commit 130bc60
Showing 12 changed files with 22 additions and 8 deletions.
2 changes: 1 addition & 1 deletion Documentation/actions.md
@@ -1,6 +1,6 @@
# DataHub Docker Stack: GitHub Actions

The images used to be built and pushed to [our organization at DockerHub](https://hub.docker.com/orgs/ucsdets/members) through GitHub Actions, but are now published as packages within this repo instead. We also use GitHub Actions for testing and pushing our stable images to production. [You may also check out scripts.md](/Documentation/scripts.md) for a more indepth look at the Python code underlying these actions.
The images used to be built and pushed to [our organization at DockerHub](https://hub.docker.com/orgs/ucsdets/members) through GitHub Actions, but are now published as packages within this repo instead. We also use GitHub Actions for testing and pushing our stable images to production. [You may also check out scripts.md](/Documentation/scripts.md) for a more in-depth look at the Python code underlying these actions.

We have four actions that we use to develop, test, and deploy our Docker Stack.

10 changes: 8 additions & 2 deletions Documentation/architecture.md
@@ -56,6 +56,11 @@ to run the pipeline. For testing, we use pytest.
│   │   ├── Dockerfile # image definition for docker
│   │   ├── scripts # .sh & .py scripts used for container setup
│   │   │   └── ...
│   │   ├── workflow_tests
│   │   │   ├── test_matplotlib.py
│   │   │   ├── test_nltk.py
│   │   │   ├── test_pandas.py
│   │   │   └── test_statsmodels.py
│   │   └── test # image acceptance tests
│   │      ├── data
│   │      │   └── test-notebook.ipynb
@@ -77,16 +82,17 @@ to run the pipeline. For testing, we use pytest.
│   │   ├── activate.sh
│   │   ├── cudatoolkit_env_vars.sh
│   │   ├── cudnn_env_vars.sh
│   │   ├── run_jupyter.sh
│   │   ├── manual_tests
│   │   │   ├── pytorch_mtest.ipynb
│   │   │   └── tensorflow_mtest.ipynb
│   │   ├── run_jupyter.sh
│   │   ├── test
│   │   ├── old_tests
│   │   │   ├── __init__.py
│   │   │   ├── data
│   │   │   │   └── test_tf.ipynb
│   │   │   └── test_tf.py
│   │   └── workflow_tests
│   │      ├── test_keras.py
│   │      ├── test_pytorch.py
│   │      └── test_tf.py
│   ├── spec.yml # image definition metadata (for all images)
4 changes: 4 additions & 0 deletions images/datascience-notebook/Dockerfile
@@ -54,6 +54,10 @@ RUN mkdir /opt/manual_tests
COPY /test/test_r_dump_packages.R /opt/manual_tests
COPY /test/test_r_func.R /opt/manual_tests

# Add additional tests
RUN mkdir -p /opt/workflow_tests
COPY workflow_tests/* /opt/workflow_tests

USER jovyan

# Python/Mamba Deps
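Both Dockerfile changes in this commit bake the workflow tests into the image at /opt/workflow_tests. As a hypothetical usage sketch (not part of the commit), the baked-in suite could then be run inside a container like so:

```python
# Hypothetical sketch, not part of this commit: run the baked-in
# workflow tests inside a running container. /opt/workflow_tests is
# the COPY destination used by the Dockerfile above.
import sys

import pytest

# pytest.main takes CLI-style args and returns an exit code.
sys.exit(pytest.main(["-q", "/opt/workflow_tests"]))
```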
images/datascience-notebook/workflow_tests/test_nltk.py
@@ -5,6 +5,7 @@ def setup_module(module):
nltk.download('punkt', download_dir='/tmp/nltk_data')
nltk.download('maxent_ne_chunker', download_dir='/tmp/nltk_data')
nltk.download('words', download_dir='/tmp/nltk_data')
nltk.download('averaged_perceptron_tagger', download_dir='/tmp/nltk_data')
nltk.data.path.append('/tmp/nltk_data')

def test_tokenization():
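The extra download matters because nltk.pos_tag() lazily loads the averaged_perceptron_tagger resource and raises LookupError when it is absent from nltk.data.path; the named-entity tests go through that tagger. A minimal sketch of the dependency (not from the diff):

```python
# Minimal sketch, not from the diff: pos_tag() needs the
# 'averaged_perceptron_tagger' resource; without the download added
# above, this raises LookupError.
import nltk

nltk.download('punkt', download_dir='/tmp/nltk_data')
nltk.download('averaged_perceptron_tagger', download_dir='/tmp/nltk_data')
nltk.data.path.append('/tmp/nltk_data')

tokens = nltk.word_tokenize("UCSD is in San Diego.")
print(nltk.pos_tag(tokens))  # e.g. [('UCSD', 'NNP'), ('is', 'VBZ'), ...]
```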
5 changes: 4 additions & 1 deletion images/scipy-ml-notebook/Dockerfile
@@ -30,7 +30,10 @@ RUN chmod +x /run_jupyter.sh
# Scripts setup
COPY cudatoolkit_env_vars.sh cudnn_env_vars.sh tensorrt_env_vars.sh /etc/datahub-profile.d/
COPY activate.sh /tmp/activate.sh
COPY workflow_tests /opt/workflow_tests

# Add tests
RUN mkdir -p /opt/workflow_tests
COPY workflow_tests/* /opt/workflow_tests
ADD manual_tests /opt/manual_tests

RUN chmod 777 /etc/datahub-profile.d/*.sh /tmp/activate.sh
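Note that the old COPY workflow_tests /opt/workflow_tests and the new mkdir -p plus COPY workflow_tests/* /opt/workflow_tests land the same files, since Docker's COPY copies a directory source by its contents; the glob form simply requires the destination directory to exist first. A hypothetical in-container sanity check (not part of the commit):

```python
# Hypothetical sanity check, not part of the commit: list what the
# COPY/ADD lines above placed into the image.
from pathlib import Path

for dest in ("/opt/workflow_tests", "/opt/manual_tests"):
    names = sorted(p.name for p in Path(dest).iterdir())
    print(dest, "->", names)
```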
File renamed without changes.
File renamed without changes.
8 changes: 4 additions & 4 deletions images/scipy-ml-notebook/workflow_tests/test_pytorch.py
@@ -106,7 +106,7 @@ def length_of_dataset_no_cuda():

# Download and load the training data
train_data = datasets.MNIST(
root='./data', train=True, download=True, transform=transform)
root='/tmp', train=True, download=True, transform=transform)

# Check the size of the training set
ld = len(train_data)
@@ -131,9 +131,9 @@ def mean_pixel_value_cuda():
transform = transforms.Compose(
[transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_set = datasets.MNIST(
root='./data', train=True, download=True, transform=transform)
root='/tmp', train=True, download=True, transform=transform)
test_set = datasets.MNIST(
root='./data', train=False, download=True, transform=transform)
root='/tmp', train=False, download=True, transform=transform)

# Move dataset to device
train_loader = torch.utils.data.DataLoader(
@@ -171,7 +171,7 @@ def multiply_dataset_calculate_mean_cuda():
transform = transforms.Compose(
[transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
train_dataset = datasets.MNIST(
'./data', train=True, download=True, transform=transform)
'/tmp', train=True, download=True, transform=transform)

# Create a DataLoader for the dataset
train_loader = torch.utils.data.DataLoader(
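Switching root from './data' to '/tmp' decouples the MNIST download from the test runner's working directory, which inside the container may differ per invocation or not be writable. A sketch of the pattern the diff adopts:

```python
# Sketch of the pattern the diff adopts: use an absolute, writable
# download root rather than a path relative to pytest's CWD.
from torchvision import datasets, transforms

transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])

# '/tmp' is writable regardless of where the tests are launched;
# './data' would write into the current working directory.
train_data = datasets.MNIST(
    root='/tmp', train=True, download=True, transform=transform)
print(len(train_data))  # 60,000 training images
```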
