
Main code chunks used for models in the publication "Exploring the Potential of Adaptive, Local Machine Learning (ML) in Comparison to the Prediction Performance of Global Models: A Case Study from Bayer's Caco-2 Permeability Database"


ffstghc/caco2ml


"Exploring the Potential of Adaptive, Local Machine Learning (ML) in Comparison to the Prediction Performance of Global Models: A Case Study from Bayer's Caco-2 Permeability Database"

American Chemical Society (ACS): Journal of Chemical Information and Modeling (JCIM)

Frank Filip Steinbauer, Thorsten Lehr, Andreas Reichel

Repository for archiving the main code chunks used for the local and global machine learning models in the publication "Exploring the Potential of Adaptive, Local Machine Learning (ML) in Comparison to the Prediction Performance of Global Models: A Case Study from Bayer's Caco-2 Permeability Database", published in 2024 in the ACS Journal of Chemical Information and Modeling (JCIM) as the first publication of my doctoral studies at Bayer.

The five included files contain the main code chunks for:

  1. Data preparation (SMILES/molecule object standardization; PaDEL descriptor calculation)
  2. Global models (including additional descriptor calculations, recursive feature elimination with cross-validation, and external TDC benchmarking)
  3. Local model (training data selection via fixed Tanimoto similarity criteria)
  4. Local model (training data selection via fixed amounts of most similar structures)
  5. Local model (training data selection via kNN as control/proof of superiority of the chosen Tanimoto similarity approach)
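
The two similarity-based selection strategies in items 3 and 4 can be illustrated with a minimal sketch. This is not the code from the repository: fingerprints are represented here as plain Python sets of "on" bit indices, and the function names, the toy compounds, and the threshold of 0.6 are all hypothetical choices made for illustration. In practice the fingerprints would come from a cheminformatics toolkit.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a fingerprint is modeled as the set of its "on" bit indices.
Fingerprint = Set[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient |A & B| / |A | B| of two bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(
    query: Fingerprint, pool: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs, most similar to the query first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in pool.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def select_by_threshold(
    query: Fingerprint, pool: Dict[str, Fingerprint], threshold: float
) -> List[str]:
    """Item 3: keep every compound whose similarity reaches a fixed cutoff."""
    return [name for name, sim in rank_by_similarity(query, pool) if sim >= threshold]


def select_top_n(
    query: Fingerprint, pool: Dict[str, Fingerprint], n: int
) -> List[str]:
    """Item 4: keep a fixed number of the most similar compounds."""
    return [name for name, _ in rank_by_similarity(query, pool)[:n]]


# Hypothetical toy data: three training compounds and one query compound.
pool = {
    "cpd_a": {1, 2, 3, 4},
    "cpd_b": {1, 2, 3, 9},
    "cpd_c": {7, 8, 9},
}
query = {1, 2, 3, 4, 5}

by_threshold = select_by_threshold(query, pool, threshold=0.6)
top_two = select_top_n(query, pool, n=2)
```

With a fixed similarity cutoff the size of the local training set varies from query to query, whereas a fixed count always yields the same number of compounds regardless of how similar they actually are. A local model would then be trained on the selected subset for each query compound.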

If you have further questions or need additional parts of the utilized code for your own studies, feel free to contact [email protected].
