Dependency cleanup #1372
Side-note: your comment about speeding up the dev cycle reminds me that I wanted to add lines to our Dockerfiles that will cache Python packages on the host system. I just had some trouble getting it to work in GH CI and then got distracted. Thanks for doing this cleanup though, I will take a deeper look tomorrow.
Doing a double check:
jinja: could be used by Flask (though I don't think we use its templating functionality), but I imagine Flask will install it anyway, so probably good to go
matplotlib: no references, good to go
pytest-check: I added this during JIT dev and the unittest->pytest migration, but it's fine to delete; can bring back when we actually use it
sas7bdat: seems like a library for loading SAS files, might be used by pandas.read_sas and not be bundled in by default, but that function isn't used anywhere in the repo so I think this is fine to go
scipy: a bulky one that would be pretty obvious if we used it somewhere, so good to go
sqlalchemy-stubs: seems like an extension for the mypy type checker, so good to go; if someone wants them back they can easily put them in the dev file
xlrd: pretty sure this is used by pandas.read_excel depending on the Excel file version and not bundled in Pandas by default; there is one use of that function in quidel.py, so I would keep this one
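A check like the one above can be partly automated. The following is only a rough sketch, not what was actually run for this PR: it walks a source tree, collects the top-level modules that are imported directly, and reports which candidate packages never show up. The function names and the `candidates` mapping are hypothetical. It deliberately cannot see indirect use (such as xlrd being loaded by pandas.read_excel), so a manual pass is still needed.

```python
import ast
from pathlib import Path


def imported_top_level_modules(root: str) -> set[str]:
    """Collect top-level module names imported by any .py file under root."""
    found: set[str] = set()
    for path in Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    found.add(alias.name.split(".")[0])
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                # level == 0 means an absolute import, not a relative one
                found.add(node.module.split(".")[0])
    return found


def unreferenced(candidates: dict[str, str], root: str) -> list[str]:
    """candidates maps a distribution name to its import name (hypothetical input)."""
    used = imported_top_level_modules(root)
    return sorted(dist for dist, module in candidates.items() if module not in used)
```

Anything this reports is only a candidate for removal; packages that are loaded as optional backends by other libraries will look unused here.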
I'd be open to experimenting with "compatible release" on a few packages to start. It's asking us to trust the maintainers of our dependencies to not make breaking changes and we can work to build that trust slowly. For safety, we could log our dependency versions (for compatible release packages specifically) somewhere internally in our code's metadata and dump it on error, so we can easily see it in a Sentry error.
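As a minimal sketch of the version-logging idea, assuming we keep a hand-maintained list of the packages that use compatible release: the standard library's `importlib.metadata` can report what is actually installed at runtime. The function name and the package list are hypothetical, not existing code in this repo.

```python
from importlib import metadata


def dependency_versions(packages):
    """Return {package name: installed version} for the given distribution names."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap instead of failing, since this runs on error paths.
            versions[name] = "not installed"
    return versions


# Hypothetical list of packages pinned with "~=".
LOOSELY_PINNED = ["pandas", "numpy"]

if __name__ == "__main__":
    print(dependency_versions(LOOSELY_PINNED))
```

The resulting dict could then be attached to error reports, for example with the Sentry SDK's `sentry_sdk.set_context("dependencies", ...)`, though how it gets wired into our error handling is left open here.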
@melange396 Yep, that's right. It's essentially done manually (sometimes handled by an Ansible playbook that governs the overall system setup), with a real human running the update.
Good call regarding
I think that could be a great debugging tool; I mentioned it in #987 (comment).
Ah yea, the main culprit for the bulky dependencies is
I tried the same with Flask and it auto-pulled in jinja, so that explains the last bit.
Removed because they weren't directly referenced anywhere in the code of this repo:
jinja, matplotlib, pytest-check, sas7bdat, scipy, sqlalchemy-stubs, xlrd
Removed from one file because it was duplicated in both:
delphi_utils, tenacity
Moved to the "dev" file because they're seemingly not actually used by the api:
more_itertools, requests
Moved to the "api" file so it's with pandas, which pulls it in as a dependency anyway (and it is explicitly used in server code, though in just one place):
numpy
I'd hoped that cleaning this stuff up would reduce the build time of docker images and thus speed up the dev cycle, as well as shrink the disk/file footprint of the images. That turned out not to be the case because many of the removed packages still get referenced somewhere else in the dependency tree, and they get included anyway. At least we can pass off the work of maintaining the version requirements to other packages. Plus, #1308 will probably have a bigger impact than this could have dreamed of.
I am also strongly considering loosening some of the versions by using the "compatible release clause" (aka the ~= operator) wherever we currently use a hard-equal 3-level version (like ==1.2.3) so we can get security updates and bug fixes for ~free without (🤞) breaking things. Discussion and commentary on that is very much welcomed.
@korlaxxalrok: Do you remember off the top of your head how package installation gets sorted out by jenkins / g-d-r / automation? Is it done manually? We run the acquisition code on bare hardware, instead of using the bundled containers that manage the dependencies, and I don't see an obvious mechanism that would do it automatically.
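On the compatible release idea: per PEP 440, `~=1.2.3` is equivalent to `>=1.2.3, ==1.2.*`, so only the last version component is allowed to float. The sketch below is illustrative only (the helper names are made up and nothing like this exists in the repo): it rewrites hard-equal 3-level pins in requirement lines and spells out what the loosened specifier allows.

```python
import re

# Matches a hard-equal pin with exactly three numeric components, e.g. "pandas==1.2.3".
_PIN = re.compile(r"^(?P<name>[A-Za-z0-9_.\-\[\]]+)==(?P<version>\d+\.\d+\.\d+)$")


def loosen_pin(line: str) -> str:
    """Rewrite 'pkg==X.Y.Z' as 'pkg~=X.Y.Z'; return any other line unchanged."""
    match = _PIN.match(line.strip())
    if match is None:
        return line
    return f"{match['name']}~={match['version']}"


def explain_compatible_release(version: str) -> str:
    """Expand '~=version' into the equivalent PEP 440 comparison pair."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("~= requires at least two version components")
    prefix = ".".join(parts[:-1])
    return f">={version}, =={prefix}.*"


if __name__ == "__main__":
    for requirement in ["pandas==1.2.3", "numpy>=1.0", "# a comment"]:
        print(loosen_pin(requirement))
    print(explain_compatible_release("1.2.3"))
```

Only exact 3-level pins are touched, so ranges, comments, and 2-level pins pass through unchanged and can be reviewed one package at a time.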