Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rosetta Evolutionary Ligand (REvoLd) #303

Open
wants to merge 42 commits into
base: main
Choose a base branch
from

Conversation

Hackefleisch
Copy link
Member

@Hackefleisch Hackefleisch commented Jan 20, 2025

This PR aims to include the newly developed application REvoLd into Rosetta.

REvoLd preprint: https://arxiv.org/abs/2404.17329

The algorithm optimizes small molecule ligands within combinatorial libraries like Enamine REAL. Most code changes are self enclosed and do not affect any functionality outside of the application. The only exception are changes to cmake compile directives for mpi which were broken.

A guide on how to run REvoLd is available here: https://docs.rosettacommons.org/docs/wiki/revold

Unit and integration test run locally and passed all.

Paul Eisenhuth added 30 commits October 7, 2024 23:39
…ne and get rid of evolution option file if not needed
@Hackefleisch Hackefleisch added ready_for_review This PR is ready to be reviewed and merged. 90 standard tests labels Jan 20, 2025
@@ -0,0 +1,69 @@
// -*- mode:c++;tab-width:2;indent-tabs-mode:t;show-trailing-whitespace:t;rm-trailing-spaces:t -*-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For organizational purposes, it might be worth putting the app in a subdirectory, rather than in the root applications dir. (Probably something suitably generic, like ligand_design.)

Copy link
Member

@roccomoretti roccomoretti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks rather good -- Some minor comments, but should be more-or-less good to go once testing fully passes.

Comment on lines 40 to 48
#ifdef USEMPI
MPI_Init (&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &rank);
MPI_Comm_size( MPI_COMM_WORLD, &size );

if ( size < 10 ) {
TR.Warning << "Running REvoLd less than 10 CPUs causes very large runtimes. Consider using more CPUs." << std::endl;
}
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautifier doesn't necessarily run on #ifdef blocks -- you may need to adjust formatting manually. (Here use tabs instead of spaces.)


if ( !basic::options::option[ basic::options::OptionKeys::ligand_evolution::options ].active() ) {
TR << "Path to evolutionary options file not set. Default settings will be used." << std::endl;
evoopt = EvolutionOptionsOP(new EvolutionOptions());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Minor) Using utility::pointer::make_shared() is preferred to directly calling new.

#endif
}

std::string EvolutionManager::print_scores() const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly misleading name, as it doesn't actually print the score, just returns them as a string.

}
} catch( ... ) {
// print message and continue scoring
TR.Error << "Unable to score " << library_.identifier_to_smiles( { ii, 0 } ) << std::endl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the internal score annotation left in a reasonable state?


/// @brief Maps selector names to types and parameter list
std::map< std::string, std::pair< std::string, std::map< std::string, core::Real > > > selector_options_ {
{ "remove_elitist", { "elitist", {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initializer lists don't have to be left-aligned. With such heavily nested data structures, you can (and probably should) indent them to be better readable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was indented, but the beautifier squashed everything left. Is there a way to prevent it from doing so?

}

Reagent::~Reagent() {
delete fingerprint_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to use a std::shared_ptr (or std::unique_ptr) versus using manual memory management.

// rdkit requires some weird vector and vector of vector shenanigans
RDKit::MOL_SPTR_VECT react;
for ( core::Size ii( 1 ); ii <= n_reaction_positions; ++ii ) {
react.push_back( RDKit::RWMOL_SPTR( reagents_[ reagent_indices[ ii ] ]->mol_ ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reagents_[ reagent_indices[ ii ] ]->mol_ should already be a RDKit::RWMOL_SPTR, right?

TR.Debug << "Len of outer vector: " << products.size() << std::endl;
TR.Debug << "Len of inner vector: " << products[ 0 ].size() << std::endl;

return RDKit::MolToSmiles( RDKit::RWMol( *products[ 0 ][ 0 ] ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need to make a copy of the RWMol here? MolToSmiles should be able to work on the dereferenced ROMol directly.

RDKit::ChemicalReaction* reac_;
core::Size possible_molecules_ = 1;

friend FragmentLibrary;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to avoid friend if at all possible. It took me way to long to understand where/how reagents_ was getting set -- simply having an "add_reagent()" function might better help encapsulate things and make the flow clearer.

#include <protocols/ligand_evolution/offspring_factory/OffspringFactory.hh>


namespace protocols {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note that you don't need to have a cc file if there's nothing to put in it. (Especially pure virtual classes can be header-only.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
90 standard tests ready_for_review This PR is ready to be reviewed and merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants