CI/Docker build: faster or parallel #286

Open
bertsky opened this issue Feb 5, 2022 · 2 comments

Comments

@bertsky
Collaborator

bertsky commented Feb 5, 2022

In maximum-cuda, we recently exceeded the 1-hour build time limit for CircleCI free accounts. Hence the attempt to use make -j again to reduce the build time (the VMs are multi-core, so this should work in principle).

But it seems that we have a race condition which has not been detected yet:

FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/sub-venv/headless-tf2/bin/pip' -> '/tmp/pip-0dn813kj-uninstall/usr/local/sub-venv/headless-tf2/bin/pip'

This looks like multiple pip calls clashing on the same venv. Maybe we need to synchronize these calls in the same way we already do for git calls.
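To make the idea of synchronizing pip calls per venv concrete, here is a minimal sketch in Python. It is not how the Makefile does it (the later comments in this thread show GNU parallel's sem being used for that); it only illustrates the principle of one exclusive lock per venv. The names venv_lock and locked_pip and the ".lock" file placed next to the venv directory are assumptions made up for this example.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def venv_lock(venv_dir):
    """Hold an exclusive advisory lock for one venv (hypothetical helper)."""
    # Assumption: the lock file sits next to the venv directory, not inside it,
    # so the same lock can be used before the venv exists.
    lock_path = os.path.abspath(venv_dir) + ".lock"
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process (or open file) holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def locked_pip(venv_dir, *pip_args):
    """Run the venv's own pip while holding that venv's lock."""
    python = os.path.join(venv_dir, "bin", "python3")
    with venv_lock(venv_dir):
        return subprocess.run([python, "-m", "pip", *pip_args], check=True)
```

With something like this, two concurrent calls such as locked_pip("/usr/local/sub-venv/headless-tf2", "install", "numpy") would run one after the other instead of uninstalling files underneath each other. Calls that bypass the helper are not protected, since flock is only advisory.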

@bertsky
Collaborator Author

bertsky commented Feb 23, 2022

I'm not sure #287 actually does what we want. It looks like we still have races:

sem --will-cite --fg --id ocrd_all_pip/usr/local/sub-venv/headless-tf1 python3 -m venv /usr/local/sub-venv/headless-tf1
sem --will-cite --fg --id ocrd_all_pip/usr/local/sub-venv/headless-tf1 python3 -m venv /usr/local/sub-venv/headless-tf1
The virtual environment was not created successfully because ensurepip is not
available.  On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.

    apt-get install python3-venv

You may need to use sudo with that command.  After installing the python3-venv
package, recreate your virtual environment.

Failing command: ['/usr/local/sub-venv/headless-tf1/bin/python3', '-Im', 'ensurepip', '--upgrade', '--default-pip']

Makefile:152: recipe for target '/usr/local/sub-venv/headless-tf1/bin/activate' failed
make[1]: *** [/usr/local/sub-venv/headless-tf1/bin/activate] Error 1
make[1]: Leaving directory '/build'

Or in another variant:

Successfully built ocrd-cor-asv-ann gast termcolor
Installing collected packages: numpy, protobuf, markdown, h5py, grpcio, absl-py, termcolor, tensorflow-estimator, tensorboard, scipy, python-dateutil, pyparsing, opt-einsum, kiwisolver, keras-preprocessing, keras-applications, google-pasta, gast, cycler, astor, tensorflow-gpu, matplotlib, keras, editdistance, ocrd-cor-asv-ann
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.5
    Uninstalling numpy-1.19.5:
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/usr/local/sub-venv/headless-tf1/bin/f2py'

Makefile:241: recipe for target '/usr/local/sub-venv/headless-tf1/bin/ocrd-cor-asv-ann-evaluate' failed
make[1]: Leaving directory '/build'
make[1]: *** [/usr/local/sub-venv/headless-tf1/bin/ocrd-cor-asv-ann-evaluate] Error 1
Makefile:236: recipe for target '/usr/bin/ocrd-cor-asv-ann-evaluate' failed
make: *** [/usr/bin/ocrd-cor-asv-ann-evaluate] Error 2

Or yet another:

Successfully installed ocrd-validators-2.30.0
Obtaining file:///build/core/ocrd
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /usr/local/sub-venv/headless-tf1/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/build/core/ocrd/setup.py'"'"'; __file__='"'"'/build/core/ocrd/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-7cvua7u9
       cwd: /build/core/ocrd/
  Complete output (5 lines):
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/build/core/ocrd/setup.py", line 3, in <module>
      from ocrd_utils import VERSION
  ModuleNotFoundError: No module named 'ocrd_utils'
  ----------------------------------------
WARNING: Discarding file:///build/core/ocrd. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Makefile:72: recipe for target 'install' failed
make[2]: Leaving directory '/build/core'
make[2]: *** [install] Error 1
Makefile:170: recipe for target '/usr/local/sub-venv/headless-tf1/bin/ocrd' failed
make[1]: *** [/usr/local/sub-venv/headless-tf1/bin/ocrd] Error 2
make[1]: Leaving directory '/build'
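In the first log above, the same python3 -m venv command is echoed twice for one venv. One possible reading (an assumption, not something the logs confirm) is that two recipes both decided the venv was missing and both went on to create it, so that serializing the commands alone does not help. A common remedy is to check again whether the target exists after the lock has been acquired. The sketch below illustrates that pattern in Python; ensure_venv, the bin/activate marker check, and the ".lock" file location are hypothetical choices for this example, and the standard library's venv.EnvBuilder stands in for python3 -m venv.

```python
import fcntl
import os
import venv


def ensure_venv(venv_dir, create=None):
    """Create venv_dir at most once, even if several callers ask for it.

    Returns True if this call created the venv, False if it already existed.
    """
    if create is None:
        # Assumption: the stdlib builder stands in for `python3 -m venv`.
        def create(path):
            venv.EnvBuilder(with_pip=True).create(path)

    marker = os.path.join(venv_dir, "bin", "activate")
    lock_path = os.path.abspath(venv_dir) + ".lock"  # hypothetical location
    os.makedirs(os.path.dirname(lock_path), exist_ok=True)

    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        # Re-check after acquiring the lock: another caller may have finished
        # creating the venv while this one was waiting.
        if os.path.exists(marker):
            return False
        create(venv_dir)
        return True
    # The lock is released when lock_file is closed on leaving the with block.
```

The key point is that the existence check and the creation happen inside the same critical section. Checking before taking the lock and creating afterwards would leave the same window open.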

@bertsky
Collaborator Author

bertsky commented Feb 24, 2022

The above was from three different variants of our CI. I am also seeing strange effects in native installations – if I enable -j, not all modules will be built.

Until we know what's going on, we should perhaps disable the parallel build (and hope we still stay under 1h thanks to the #287 speedups).
