Discussion: draft of a minimal "developer" base image #346
Anybody please feel free to jump in with any comments or advice.
terra-jupyter-dev-base/Dockerfile
Outdated
&& apt-get update && apt-get install -yq --no-install-recommends \
sudo \
&& sudo -i \
echo "deb http://security.ubuntu.com/ubuntu/ bionic main" >> /etc/apt/sources.list \
bionic is code for 18.04. Is this supposed to be Focal Fossa?
Good point: I am not sure about this. I copied it directly from terra-jupyter-base. I'm honestly not sure what the base operating system actually is in terra-jupyter-base, since it uses a google deep learning base image gcr.io/deeplearning-platform-release/tf-gpu.2-7, and I can't seem to find the details.
The only thing I know is that, if I remove these steps from the build, it doesn't work. These commands seem to be necessary for successful install of libexempi3 and libv8-3.14-dev (which in turn are only needed in order to install the python packages firecloud and terra-notebook-utils)
Oh, I didn't mean to delete these. I mean for 20.04, shouldn't there be a new name other than bionic?
Right, yeah, you could be right. I'm not sure. Somehow this works as is. Is it maybe a workaround to install bionic packages in later ubuntu releases? Or maybe it does need to be updated from bionic to focal fossa here. I would need to ask the person who wrote this part :)
I believe it's now focal replacing bionic for 20.04.
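One way to settle which release the base image actually ships would be to read /etc/os-release inside the container and look at VERSION_CODENAME (bionic for 18.04, focal for 20.04). The following is a minimal sketch under that assumption; the helper name `parse_os_release` and the sample text are illustrative and not taken from the Dockerfile.

```python
from pathlib import Path


def parse_os_release(text: str) -> dict:
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


# Illustrative, abbreviated sample of what Ubuntu 20.04 reports.
SAMPLE = 'NAME="Ubuntu"\nVERSION_ID="20.04"\nVERSION_CODENAME=focal\n'

if __name__ == "__main__":
    path = Path("/etc/os-release")
    # Fall back to the sample when the file is absent (e.g. not on Linux).
    text = path.read_text() if path.exists() else SAMPLE
    info = parse_os_release(text)
    print(info.get("VERSION_ID"), info.get("VERSION_CODENAME"))
```

Run inside the image, this prints the release and codename, which can then be compared against the codename written into sources.list.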
terra-jupyter-dev-base/Dockerfile
Outdated
# Install jupyter and some necessary python packages
&& pip3 -V \
# For gcloud alpha storage support.
&& pip3 install google-crc32c --target /usr/lib/google-cloud-sdk/lib/third_party \
I'm a little concerned about mixing pip and conda, but I know this existed before this PR.
terra-jupyter-dev-base/Dockerfile
Outdated
# When we upgraded from jupyter 5.7.8 to 6.1.1, we broke terminal button on terra-ui.
# Hence, make sure to manually test out "launch terminal" button (the button in the green bar next to start and stop buttons)
# to make sure we don't accidentally break it every time we upgrade notebook version until we figure out an automation test for this
&& pip3 install notebook \
Do we want to pin down the versions? And/or have a requirements.txt?
requirements.txt would be a nice addition!
Probably makes sense to pin them
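If a requirements.txt is added, a small check could flag entries that are not pinned to an exact version. This is only a sketch: the function name `unpinned_requirements` is hypothetical, the file content is a made-up example, and the parsing is deliberately simple (it only looks for `==` and does not handle URL requirements).

```python
def unpinned_requirements(text: str) -> list:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical requirements.txt content; the version is a placeholder.
EXAMPLE = """\
notebook==6.1.1  # pinned
firecloud
terra-notebook-utils
"""

if __name__ == "__main__":
    print(unpinned_requirements(EXAMPLE))
```

A check like this could run before a build so that an unpinned package does not silently change the image between builds.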
I think you need https://docs.google.com/document/d/1b9uXanA3uCxUJoKktczxFXpJ1o3TYeXnaM0PGvbAczM/edit?disco=AAAAdToq5Wg still (those scripts)
@Qi77Qi is that something other than these? (I didn't see any other scripts being added to |
@jdcanas Do you think I'm ready to try that out? Is that the next step? |
@jdcanas Okay I did try it out ... I created a Terra Compute Environment using this image.
The google bucket for the workspace, in the If anybody else wants to test out the image (without having to build it themselves), it is currently hosted here
although I can't promise I won't over-write it after more changes and testing. |
If what @sjfleming experimented above is allowed, then this section on the readme needs to be updated as well https://github.com/databiosphere/terra-docker#terra-base-images |
Okay so I figured, while I'm at it, let's make an equivalent base image that enables GPU functionality and has all the CUDA stuff pre-installed. This might be out of scope for this current PR, but I'm going to include the Dockerfile here in case anyone wants to look at it. The size of the image is 2.63 GB. I have hosted it here for the time being, if anyone wants to play with it
Tested:
I always build GPU-enabled images from the Nvidia docker image catalog (i.e. NOTE: the two new images in this PR are identical except for the |
Is that actually working? you don't have Regardless tho, I think we're not planning to advertise these images for other users any time soon (if at all)...so it's really just for your own use case for now..If things are working for you, I don't mind merging them as is (adding a comment like "this image is not officially supported, use at your own risk" might be good in case others start using them). Bear in mind, it's additional work to add automation for building these 2 images, so you'll need to build them manually until we figure out if this is something we want to support |
Hi @Qi77Qi , yes, the reason I don't need to have the That sounds good to me! I'm sure building manually is fine for now. If anybody else ends up using these images in the future, then maybe automation could be revisited. |
ah I see..that makes sense...whenever you're ready, feel free to put this to non draft |
Okay thanks Qí! I will do a little bit of cleanup:
|
Turns out I was not including the google cloud SDK, but I realize that's 100% necessary. (I'm guessing it's part of the tensor flow base image, so I didn't realize I was missing it at first, since it's not explicit in the "base" Dockerfile.) That will increase the size of the image a bit. |
I am a bit confused about how cloud file syncing (syncing the notebook ipynb file to the project's google bucket) was working if my image did not have |
With Feel free to experiment with it here (I've renamed it)
|
And the GPU-enabled image is 2.02 GB It is hosted here temporarily
(Of course this doesn't have |
I installed |
It seems like
But then when I actually use the image to start a Cloud Environment in Terra, I see something different. I always see the exact same
@Qi77Qi is this expected behavior? (I'm trying to fix a |
Tested the images, and they seem to work, but I might have found one issue:
import os
os.environ['WORKSPACE_BUCKET']
I see
Where / how does the WORKSPACE_BUCKET environment variable get set? Other environment variables all seem to be okay, for instance
os.environ['WORKSPACE_NAMESPACE']
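The check described above can be sketched as a small helper that reports which expected variables are unset or empty, instead of raising on the first missing key. The variable names come from this thread; the helper name `missing_env_vars` is hypothetical.

```python
import os

# Names taken from the discussion; extend the list as needed.
EXPECTED_VARS = ["WORKSPACE_BUCKET", "WORKSPACE_NAMESPACE"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(EXPECTED_VARS)
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All expected variables are set.")
```

Running this in a notebook on each candidate image would make it easy to compare which variables each image ends up with.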
This is now ready for review @Qi77Qi @jdcanas There are two new images included in this PR:
each can be built using the |
Did this development die, or is it continuing elsewhere? Being able to start with a minimal Terra install for custom images would be quite useful. |
Hey @thouis , it certainly didn’t die from my perspective. But it looks like we haven’t made any progress toward merging this PR or making the minimal base images “official”. I’d still love to see it happen. Maybe your interest will help revitalize the effort. |
If you want, feel free to try
from this PR. That’s what I’m using now as the base for images I build for Terra. (As with any custom image, I find it’s necessary to extend the default “timeout” for Cloud Environment creation to much longer than 10 mins.) The only lingering problem is this (#346 (comment)) but I’ve been able to work around it. I think I would need help from the experts to solve the issue. |
I spent some time yesterday building an updated version of the gpu base
image (cuda 11.8 base image, python 3.8 - I needed 3.8 for tensorqtl). I
was able to use it for starting up an environment without increasing the
timeout (I've never figured out why this does or doesn't happen, either).
What is still missing for making these more official? Is #346 (comment) indicative
of deeper concerns?
Ray
|
Hi Ray, I would also like to update this to cuda 11.8 and python 3.8, since python 3.8 is now needed for pytorch 2.0+, which is the ecosystem I work in. It's been quite a while since I touched this. If you have made those changes, I'd be happy to merge them into my branch if you wanted them to become part of this PR. Or I could incorporate the changes on my side if that's easier. Was it just a matter of using a different base image and then using a different miniconda version? And everything else still worked? So the comment above is, I believe, an isolated issue that doesn't indicate deeper concerns. But that's me guessing! I've tested out other stuff pretty extensively, just by using it on Terra, and everything seems to work just fine. I think what's going on is that somewhere deep in the Leonardo or Welder codebase (which I'm just not familiar with at all), the environment variables in any image run on Terra are being overwritten, so that users can access things like the workspace name and the bucket. I have no idea why it's not working as expected here... and just the bucket environment variable. The workspace name environment variable still works! It really seems to just be the bucket. It does bug me though. But I can't figure it out myself. Somebody from the Interactive Analysis DSP team please correct me if I'm wrong! @rtitle do you know how these environment variables get set? |
My changes were: Use these instead: For cromshell, I had to change to this (tags have changed?): Also, just after installing Miniconda, I install mamba (much faster drop-in replacement for conda) And added this at the end (because mamba was complaining about compatible versions) Then I was able to use this to get torch: |
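Okay thanks! I'll update this. I'm conflicted as to whether to pre-install mamba. I have heard people like it a lot. Then again, I want to stick to the philosophy of having this be as "minimal" as possible. But it probably doesn't hurt to have it.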
The reason to install right away is that it can be difficult or impossible
to install after other things get added to the conda environment. But I
understand the concern.
One could potentially use micromamba instead of conda (
https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html) if the
goal is minimalism, but I think that's a more fraught choice if the goal is
to make this a base image for other environments.
…On Thu, Jun 22, 2023 at 10:52 AM Stephen Fleming ***@***.***> wrote:
Okay thanks! I'll update this. I'm conflicted as to whether to pre-install
mamba. I have heard people like it a lot. Then again, I want to stick to
the philosophy of having this be as "minimal" as possible.
* Python 3.10 and suggested updates from Ray Jones; install mamba
* Bump to version 0.0.2
* Update google-cloud-cli version
* Install compiled crcmod
* Update package versions and install libarchive for mamba
…docker into sf_minimal_base
Alright @thouis , I have taken your advice, thank you for it. And actually I went ahead and upgraded to python 3.10 since that's what all the other Terra docker images are using now (I think). I did also install The images are currently hosted here, if anyone is interested:
sizes have increased slightly:
These images can successfully create a Terra Cloud Environment that seems to function normally (still barring the comment above about missing the environment variable WORKSPACE_BUCKET). |
Compressed sizes:
|
* Make libmamba solver default for conda
* Update google-cloud-cli version
* Update miniconda version
* Install ipykernel in conda to enable multiple jupyter kernels
* Update CUDA to 12.2.2
Any status updates on merging If anything, we could probably modify Also, it was a bit of a treasure hunt to find this PR, since this info isn't in the Terra-Bio docs, AFAIK. |
@nick-youngblut I think the official effort toward making this happen has moved to this PR: |
This is a discussion of the general strategy toward resolving #333 .
Some preliminary discussion (design doc) is here
https://docs.google.com/document/d/1b9uXanA3uCxUJoKktczxFXpJ1o3TYeXnaM0PGvbAczM/edit#
What has been done:
./terra-jupyter-dev-base to the repo that contains two files:
Dockerfile
build_docker.sh: you can run this script to build the image, which is called terra-jupyter-dev-base:0.0.1
Results so far:
ubuntu:20.04 that can successfully run a Jupyter notebook server using the command
docker run --rm -it -p 8888:8000 terra-jupyter-dev-base:0.0.1
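With the container started as in the command above, `-p 8888:8000` publishes container port 8000 on host port 8888, so the server should answer on localhost:8888. A rough way to confirm that from the host is sketched below; the helper name `server_is_up` is hypothetical, and treating any HTTP response (even an error status) as "up" is an assumption about what counts as success.

```python
import urllib.error
import urllib.request


def server_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at url, False if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status (e.g. 403).
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    # Host port 8888 is the one published by `-p 8888:8000`.
    print(server_is_up("http://localhost:8888/"))
```

This only shows that something is listening and speaking HTTP; it does not check that notebooks or the terminal actually work.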
Testing:
docker run --rm -it -p 8888:8000 terra-jupyter-dev-base:0.0.1
, then I canterra-jupyter-base
?