TorchForce missing getNumGlobalParameters() method #44

Open

dominicrufa opened this issue Aug 26, 2021 · 15 comments
Labels
enhancement New feature or request

Comments

@dominicrufa

I noticed that while TorchForce supports addGlobalParameter, it doesn't have getNumGlobalParameters or getGlobalParameterName. Is it possible these could be added? Or is there a way to wrap this TorchForce in a CustomCVForce so that I have access to these methods?

@peastman
Member

What do you mean? Those methods are there:

/**
* Get the number of global parameters that the interaction depends on.
*/
int getNumGlobalParameters() const;

And also exposed in the Python wrapper:

int getNumGlobalParameters() const;

@dominicrufa
Author

You're right. I must be running into a different issue downstream. Sorry.

@dominicrufa
Author

Correction: it seems the getNumGlobalParameters method is missing after the Force is added to a System object.

dominicrufa reopened this Aug 26, 2021
@peastman
Member

Since TorchForce isn't part of the main OpenMM package, getForce() doesn't know what Python class to create for it, so you just get a generic Force object. You can use TorchForce.isinstance() to check if it's really a TorchForce, and TorchForce.cast() to get a proper Python object for it.
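For example, a minimal sketch using those two methods (assuming `system` is an existing openmm.System that already contains a TorchForce):

from openmmtorch import TorchForce

for force in system.getForces():
    # Plugin-defined forces come back as generic Force objects here
    if TorchForce.isinstance(force):
        torch_force = TorchForce.cast(force)  # re-wrap as a real TorchForce
        print(torch_force.getNumGlobalParameters())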

@dominicrufa
Author

Is there a way to make this consistent within OpenMM so that we don't have to be hacky with this object, and so it works seamlessly with repos like openmmtools? @jchodera, I think we might be keen on this functionality.

@peastman
Member

peastman commented Sep 3, 2021

Unfortunately, no. This is a result of how SWIG works. If a C++ function returns an abstract supertype, and you want it to create a Python object of the appropriate subtype, you have to give it a complete list of all allowed subtypes at build time.

@jchodera
Member

jchodera commented Sep 4, 2021

Since TorchForce isn't part of the main OpenMM package, getForce() doesn't know what Python class to create for it, so you just get a generic Force object. You can use TorchForce.isinstance() to check if it's really a TorchForce, and TorchForce.cast() to get a proper Python object for it.

Oh no, this is terrible. This is going to cause all manner of problems for Python applications using OpenMM and these new ML options.

@peastman : What are our options here? There must be some way to make this work without requiring all users to do crazy and highly-surprising non-idiomatic python gymnastics.

  1. We could pull this into the OpenMM codebase (which could be an anchor feature of OpenMM 8, focusing on QML support?)
  2. We could add a Python version of the System.getForce() method in extend.i that loops through all in-scope subclasses of Force (Python 3 has a .__subclasses__() method that does this for anything that inherits from object, the default in Python 3), checks each one, and constructs and returns the appropriate subclass instead.
  3. We could tell Swig about some of the subclasses it may encounter from plugins at compile time?
  4. We could ditch Swig in favor of a more modern C++-to-Python API

@peastman
Member

peastman commented Sep 5, 2021

We could pull this into the OpenMM codebase (which could be an anchor feature of OpenMM 8, focusing on QML support?)

That would require making PyTorch into a dependency of OpenMM. Not going to happen!

We could tell Swig about some of the subclasses it may encounter from plugins at compile time?

That's impossible. It can only handle subclasses that are defined at compile time. If you try to include ones from external plugins, you'll get a compile error.

We could ditch Swig in favor of a more modern C++-to-Python API

Reimplementing the Python wrappers from scratch is also not going to happen. That would be a huge task. And why do you think any other wrapper generator wouldn't have the same problem? This is intrinsic in the disconnect between C++ classes and Python classes. How those Python classes were created isn't the issue.

We could add a Python version of the System.getForce() method in extend.i that will loop through all in-scope subclasses of Force

This is the closest to a viable solution, but it wouldn't be robust. Let me demonstrate.

>>> import openmm
>>> openmm.Force.__subclasses__()
[<class 'openmm.openmm.RPMDMonteCarloBarostat'>, <class 'openmm.openmm.CustomTorsionForce'>, <class 'openmm.openmm.AmoebaWcaDispersionForce'>, <class 'openmm.openmm.CustomCVForce'>, <class 'openmm.openmm.AmoebaGeneralizedKirkwoodForce'>, <class 'openmm.openmm.DrudeForce'>, <class 'openmm.openmm.CustomCentroidBondForce'>, <class 'openmm.openmm.RBTorsionForce'>, <class 'openmm.openmm.AmoebaMultipoleForce'>, <class 'openmm.openmm.AmoebaVdwForce'>, <class 'openmm.openmm.RMSDForce'>, <class 'openmm.openmm.CustomExternalForce'>, <class 'openmm.openmm.CMAPTorsionForce'>, <class 'openmm.openmm.PeriodicTorsionForce'>, <class 'openmm.openmm.CustomHbondForce'>, <class 'openmm.openmm.CustomManyParticleForce'>, <class 'openmm.openmm.CustomGBForce'>, <class 'openmm.openmm.NonbondedForce'>, <class 'openmm.openmm.AndersenThermostat'>, <class 'openmm.openmm.CustomCompoundBondForce'>, <class 'openmm.openmm.CMMotionRemover'>, <class 'openmm.openmm.MonteCarloAnisotropicBarostat'>, <class 'openmm.openmm.HarmonicAngleForce'>, <class 'openmm.openmm.AmoebaTorsionTorsionForce'>, <class 'openmm.openmm.HarmonicBondForce'>, <class 'openmm.openmm.CustomNonbondedForce'>, <class 'openmm.openmm.MonteCarloMembraneBarostat'>, <class 'openmm.openmm.CustomBondForce'>, <class 'openmm.openmm.MonteCarloBarostat'>, <class 'openmm.openmm.HippoNonbondedForce'>, <class 'openmm.openmm.GayBerneForce'>, <class 'openmm.openmm.CustomAngleForce'>, <class 'openmm.openmm.GBSAOBCForce'>]
>>> import openmmtorch
Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
>>> openmm.Force.__subclasses__()
[<class 'openmm.openmm.RPMDMonteCarloBarostat'>, <class 'openmm.openmm.CustomTorsionForce'>, <class 'openmm.openmm.AmoebaWcaDispersionForce'>, <class 'openmm.openmm.CustomCVForce'>, <class 'openmm.openmm.AmoebaGeneralizedKirkwoodForce'>, <class 'openmm.openmm.DrudeForce'>, <class 'openmm.openmm.CustomCentroidBondForce'>, <class 'openmm.openmm.RBTorsionForce'>, <class 'openmm.openmm.AmoebaMultipoleForce'>, <class 'openmm.openmm.AmoebaVdwForce'>, <class 'openmm.openmm.RMSDForce'>, <class 'openmm.openmm.CustomExternalForce'>, <class 'openmm.openmm.CMAPTorsionForce'>, <class 'openmm.openmm.PeriodicTorsionForce'>, <class 'openmm.openmm.CustomHbondForce'>, <class 'openmm.openmm.CustomManyParticleForce'>, <class 'openmm.openmm.CustomGBForce'>, <class 'openmm.openmm.NonbondedForce'>, <class 'openmm.openmm.AndersenThermostat'>, <class 'openmm.openmm.CustomCompoundBondForce'>, <class 'openmm.openmm.CMMotionRemover'>, <class 'openmm.openmm.MonteCarloAnisotropicBarostat'>, <class 'openmm.openmm.HarmonicAngleForce'>, <class 'openmm.openmm.AmoebaTorsionTorsionForce'>, <class 'openmm.openmm.HarmonicBondForce'>, <class 'openmm.openmm.CustomNonbondedForce'>, <class 'openmm.openmm.MonteCarloMembraneBarostat'>, <class 'openmm.openmm.CustomBondForce'>, <class 'openmm.openmm.MonteCarloBarostat'>, <class 'openmm.openmm.HippoNonbondedForce'>, <class 'openmm.openmm.GayBerneForce'>, <class 'openmm.openmm.CustomAngleForce'>, <class 'openmm.openmm.GBSAOBCForce'>, <class 'openmmtorch.TorchForce'>]

Scroll all the way to the right to see the difference. Until you import the Python module, Python doesn't know about the TorchForce class. If you load an XML file containing a TorchForce and call getForce(), Python still won't know what to do with it. This again is the disconnect between C++ and Python.

@jchodera
Member

jchodera commented Sep 5, 2021

Thanks for the quick response! I don't think these are the blockers they might seem to be.

That would require making PyTorch into a dependency of OpenMM. Not going to happen!

PyTorch could be an optional dependency: you have to elect to install it in order to use the TorchForce plugin.
We already do this with CUDA: we link against it, but don't require the CUDA drivers to be present, and (until recently) didn't require the CUDA toolkit or libraries either. If the plugin fails to load, it is simply unavailable as a platform.

I don't see a downside at all to this approach, since this is exactly the philosophy we've been using since the inception of OpenMM.

That's impossible. It can only handle subclasses that are defined at compile time. If you try to include ones from external plugins, you'll get a compile error.

I probably didn't explain this well, but I am suggesting adding Python code to extend.i that runs when system.getForces() is called and uses pure Python to re-wrap the result as the appropriate subclass. While Swig needs the C++ classes around at compile time, the Python code that executes after the Swig wrapper returns a generic Force object can do exactly what you suggested @dominicrufa do by hand: first, get a list of all registered subclasses of Force in scope (using pure Python 3 features, via .__subclasses__()); then check whether the force is an instance of any of them; and if it is, do the wrapping for you.

In essence, I'm suggesting we can just glue in the Python code you suggested @dominicrufa use so it's done transparently under the hood at the Python level in our Swig wrappers.
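A rough sketch of that glue code, assuming plugin classes expose the isinstance()/cast() static methods described above (the helper name and the monkey-patching are illustrative only, not the actual extend.i implementation):

import openmm

def _wrap_force(force):
    # Re-wrap a generic Force as the most specific Python subclass in scope.
    for cls in openmm.Force.__subclasses__():
        if hasattr(cls, 'isinstance') and cls.isinstance(force):
            return cls.cast(force)
    return force

_original_getForces = openmm.System.getForces

def _getForces(self):
    return [_wrap_force(f) for f in _original_getForces(self)]

openmm.System.getForces = _getForces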

Scroll all the way to the right to see the difference. Until you import the Python module, Python doesn't know about the TorchForce class. If you load an XML file containing a TorchForce and call getForce(), Python still won't know what to do with it. This again is the disconnect between C++ and Python.

That's right: you'd have to import the Python module before it would appear in scope. That's still much better than having to craft a bunch of code on your own to do this.

There may be a more clever way to figure out which plugins are present: recall that we use Python's plugin discovery features to find installed ffxml files. We could do something similar to discover which OpenMM plugins are installed in the same environment, and import them to discover all Force subclasses, making proper Python subclass wrapping totally transparent to the user. (A sketch of one possible mechanism follows.)
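For instance, a hedged sketch using standard entry-point discovery; the 'openmm.plugins' group name is invented here, and nothing registers under it today:

from importlib.metadata import entry_points

# Import every advertised plugin wrapper so its Force subclasses are in scope.
# Selecting by group keyword requires Python 3.10 or later.
for ep in entry_points(group='openmm.plugins'):
    ep.load()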

@peastman
Member

peastman commented Sep 5, 2021

To make that work, we somehow need to bridge the C++ and Python elements of each plugin. When a C++ plugin is loaded, it needs to force the corresponding Python plugin to get loaded as well.

I've been trying to work out a clean way of doing it. I think something along these lines might work. Have a static registry of Python module names defined within the C++ library. When a plugin is loaded, it can optionally add the name of a Python module to the registry. Plugins get loaded by the __init__.py file for the openmm module. Immediately after loading them, it could check the contents of the registry and load any Python modules listed in it. That should ensure that any Force subclasses defined by plugins are discoverable by Python code.
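In Python pseudo-form, the __init__.py side of that might look like this sketch; Platform.getRegisteredPythonModules() is a hypothetical accessor that does not exist in OpenMM today:

import importlib
from openmm import Platform

# Load the C++ plugins first, as __init__.py already does.
Platform.loadPluginsFromDirectory(Platform.getDefaultPluginsDirectory())

# Then import any Python module names the C++ plugins registered.
for name in Platform.getRegisteredPythonModules():  # hypothetical accessor
    try:
        importlib.import_module(name)  # defines the plugin's Force subclasses
    except ImportError:
        pass  # wrapper not installed; such forces stay as generic Force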

@jchodera
Member

jchodera commented Sep 5, 2021

To make that work

Which proposal, specifically?

My understanding was that migrating the plugin into the OpenMM codebase as part of OpenMM 8 would not require anything special: just gracefully failing with an appropriate exception if someone tried to use TorchForce and the plugin library failed to load.

The proposal to use Python plugin discovery is clearly more complex, but sounds potentially workable. Given its complexity, though, perhaps it makes more sense to first consider migrating the plugin into OpenMM, if it can be done in a clean way such that it is clearly an optional dependency? This would just need:

  • Throw an appropriate exception (rather than segfault) if the library cannot be found
  • Have a mechanism to query the plugin load failures for more debugging info

@peastman
Member

peastman commented Sep 5, 2021

PyTorch would become a compile-time dependency at the very least. That would be a maintenance nightmare. Dependency management in Python is a disaster, and I go to great lengths to keep the number of dependencies for OpenMM to the absolute minimum. Requiring PyTorch even at build time is a non-starter.

If you haven't been paying attention to the chaos in the ML ecosystem, you may think I'm overreacting. I'm not. For several months it has been impossible to create a working Python environment containing current versions of both PyTorch and TensorFlow. PyTorch requires NumPy 1.20 or later. TensorFlow only works with 1.19 or earlier. Try to install both and fire comes out. This has been going on for months with no indication of when it will be fixed. It has required convoluted workarounds for packages like DeepChem or HuggingFace that require both PyTorch and TensorFlow. That in turn has required convoluted workarounds for every downstream package that uses them.

I wish I could say this was an exception, but it isn't. Weird breakages like this happen constantly. The only viable long term solution is to keep the dependencies for the core OpenMM package to only absolute essentials. That includes build time dependencies as well as runtime ones.

@jchodera
Member

jchodera commented Sep 5, 2021

Oh, I've been paying attention, and it seems not (much) worse than the CUDA nightmares. We're struggling with some of those issues right now as part of the https://github.com/openkinome ecosystem, but things are slowly resolving as more of the conda-forge issues get worked out.

Despite the issues you note, it appears to be readily manageable if we wanted to pull the openmm-torch plugin into OpenMM:

  • Pull in the torch plugin, like the AMOEBA plugin
  • Control whether it's built with a CMake flag, just like the AMOEBA or CUDA plugins
  • Migrate CI to use conda-forge for our dependencies, like all our other projects
  • Release builds on conda-forge can use the conda-forge pytorch packages for build-time dependencies without requiring pytorch as a runtime dependency; we could add build variants that include it as a dependency if we want
  • Local development/testing need not switch on the CMake flag
  • We won't need to build OpenMM or its conda packages with both TensorFlow and PyTorch for now (or possibly ever), but could offer build variants (with pytorch, or with tensorflow) if we do want to prebundle one of them at a time with the conda-forge packages for OpenMM 8.

I'm still focused here on the question of "how can we eliminate friction in the OpenMM user experience", so this doesn't sound insane to me as a tradeoff for a conda-installable, pytorch-compatible OpenMM. That doesn't mean this is the best way to do it, but it's important that we consider all the alternatives from the user perspective.

It sounds like we should also continue to explore the plugin idea as well.

@peastman
Member

peastman commented Sep 6, 2021

You'll have to trust my judgement on this: the plugin approach will be far less work in the long run. "Not much worse than the CUDA nightmares" is a very low bar to beat! And once it's implemented, it will continue to work with no further effort. We won't have to worry about it breaking because of a change to some other package in the future.

@jchodera
Member

jchodera commented Sep 6, 2021

It does seem like it would be a broadly useful feature to better support a plugin ecosystem!
