Commit
slight rewording in the notebook
Blazej Banaszewski committed Aug 5, 2024
1 parent 8db312e commit dd0b2b2
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions notebooks/downstream_adaptation.ipynb
@@ -123,7 +123,7 @@
"source": [
"from minimol import Minimol\n",
"\n",
-"featuriser = Minimol(batch_size=50)"
+"featuriser = Minimol()"
]
},
{
@@ -539,7 +539,7 @@
"\n",
"- Rather than choosing the model at the last epoch, we will use best validation loss to decide which one to choose.\n",
"\n",
-"We already implemented a `dataloader_factory()` method that creates a new training and validation dataloader for each fold. Now, we will also build a method for ensemble-based evaluation, that takes a list of models and where probabilities for each sample from all of them are averaged, creating an ensemble. "
+"We already implemented a `dataloader_factory()` method that creates a new training and validation dataloader for each fold. Now, we will also build a method for ensemble-based evaluation that uses a list of models to calculate the average logits for the prediction."
]
},
{
@@ -763,7 +763,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"As we can see, in less than 14.1s the ensemble was trained and reached performance of 0.9975 in AUROC on the test set. This is slightly better than the performance we achieved with a single model, but more importantly, the ensemble is not sensitive to which part of the data is used for validation (because we train n models on n folds), and is less sensitive to the initialisation because we initialise n models getting somewhere close to an average performance."
+"In about 15 s, an ensemble was built that reaches an AUROC of 0.9975 on the test set. This is slightly better than the performance we achieved with a single model, but more importantly, the ensemble is not sensitive to which part of the data is used for validation, and is less sensitive to the initialisation, because we initialise n models and so end up close to the average performance.\n",
+"\n",
+"This score is better than the SoTA, showcasing how powerful MiniMol is in featurising molecules for downstream biological tasks."
]
}
],
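
The reworded cell at line 539 describes ensemble evaluation as averaging logits across the per-fold models. Here is a minimal sketch of that idea, using only the standard library so that it runs without the notebook's dependencies. The function names `sigmoid` and `ensemble_predict` and the toy logits are assumptions for illustration, not code from the notebook, which presumably works with tensors and its own model objects.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def ensemble_predict(per_model_logits: list[list[float]]) -> list[float]:
    """Average logits over models for each sample, then apply a sigmoid.

    per_model_logits[m][i] is the (hypothetical) logit that model m
    produced for sample i.
    """
    if not per_model_logits:
        raise ValueError("need logits from at least one model")
    n_samples = len(per_model_logits[0])
    if any(len(logits) != n_samples for logits in per_model_logits):
        raise ValueError("every model must score the same number of samples")

    n_models = len(per_model_logits)
    # zip(*...) regroups the values so each tuple holds one sample's logits.
    mean_logits = [sum(column) / n_models for column in zip(*per_model_logits)]
    return [sigmoid(z) for z in mean_logits]


# Toy example: three fold models scoring two samples.
fold_logits = [
    [2.0, -1.0],
    [0.0, -3.0],
    [1.0, -2.0],
]
probs = ensemble_predict(fold_logits)
```

Averaging logits before the sigmoid is not the same as averaging the per-model probabilities, which is what the earlier wording of the cell described. The two rankings usually agree closely, but the numeric outputs differ.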
