-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Adds support for lazy and optional importing of dependencies #1415
Conversation
- require scipy and matplotlib in setup.py (fixes #1361) - scipy and matplotlib are imported at top in analysis - updated all modules - removed any code that guards against scipy or matplotlib import - conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis - fixes #1159 - removed conditional skipping of tests when scipy or matplotlib are missing
- travis.yml - docs, code style in analysis and analysis tests
del msg | ||
# Optional and/or lazily imported modules | ||
from MDAnalysis.lib import lazy | ||
scipy = lazy.import_module('scipy.sparse', level='base') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will only load scipy.sparse and nothing else from scipy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be equivalent to
import scipy.sparse
in that it results on the 'scipy' name being added to the namespace.
On module access both scipy
and scipy.sparse
are fully imported.
The level='base'
keyword tells the function to return a reference to the base module (scipy
). From then on you use scipy.sparse.whatever
, just like it had been imported with import
.
The difference to the default import_module
call (without level='base'
) is that in the default case the returned reference is to the submodule. In that case you use sparse = import_module('scipy.sparse')
, as equivalent to from scipy import sparse
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fancy! The only thing that annoys me a little is that you can end up with a calculation failing at the end because something is missing and you were not warn ahead of time.
|
||
|
||
def _import_module(modname, caller_name): | ||
acquire_lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be a context manager?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context manager already exists, but inside _imp
. I'd rather keep it high-level and not use _imp
, but I already have to anyway for Python 3 support...
So I guess the answer is yes :)
@jbarnoud, yes, that is a possibility. Right now I don't think any of the analyses commits to heavy calculations before using optional dependencies. |
This introduces an odd behaviour for a python user. Maybe it should be documented somewhere; at least in the developer page of the wiki. When should we merge this? It is going to introduce conflicts in a bunch of test files and in |
I opened the PR more as a request-for-comments. I think its best to bring it in only after @utkbansal's GSoC effort, no? As to the impact to the end-user, I'm not so sure. This is internally hidden, and just as before some dependency errors/warnings only show up when executing analysis code, not at import-time. OTOH, I should definitely write up something for would-be contributors of code, as this would be a new guideline for using optional dependencies. |
There's a potentially neat aspect: This means that we can condense the calls to all those optional/lazy dependencies in a single file (say, However, a drawback is that functions cannot be preloaded: issuing |
This addresses issues #577, #1361 and #1159. I have rebased against @orbeckst's branch
issue-1159-analysis-deps
, not develop, hence the conflicts.Namely, it provides an easy way to import modules that are only really loaded on accession. This allows one to:
Mechanism
My strategy was to adapt the code from the
importing
module from the PEAK package. On lazy loading a dummy, hollow module class is created and instantiated that triggers a proper module import on accession.API
Right now two functions are exposed:
lazy.import_module
andlazy.import_function
that emulate the typical uses ofimport
andfrom import
. I'd like feedback on this.Here's how it now looks in
analysis/psa.py
, which uses several lazy/optional imports:Also, as it is now lazy modules lack the option to customize an error message, spitting out a generic
For this reason I left the netcdf4 optional dependency untouched, because the error message is quite informative installation-wise. But something to add if this PR gets accepted.
Timing
Since I was developing this alongside @orbeckst's take on #1159 (PR #1404) I took the initiative to also make matplotlib and scipy lazy imports.
I just did some crude timing (100 repeats using UNIX's
time
) and found that on my machine lazy loading of scipy saves a grand total of 5ms when importing theMDAnalysis.analysis.distances
submodule (on top of ~250ms taken to loadMDAnalysis
). This shows that point 1. above might not be that relevant.Testing
Added tests for both lazy loads and missing dependencies (making good use of @richardjgowers's
block_import
), but still innose
format. Since this adds stuff toMDAnalysisTests/lib
, which @utkbansal already took care of (#1413), it might be best to adapt topytest
directly before merging.