validation split #8

arbelhizmi · 2024-12-17T12:54:27Z

In main_dino.py you apply a validation split. Why is it needed? Self-supervised DINO training uses no labels, so how can validation be done? And if validation serves no purpose, why split the dataset into train and validation sets?
Also, what is self.target_transform for in the Multichannel_dataset class? I understand this attribute is built in, but why use it when training in a self-supervised manner?


pfaendler · 2025-01-06T08:23:07Z

We did this so that we always have a subset of the data that was not part of training available for downstream analyses (e.g. plotting a UMAP to see how cells cluster). Even though labels are not used in the self-supervised setting, we think this is cleaner: it lets us investigate how the model handles, or 'views', images of cells it has never seen before.
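To make the idea of a held-out subset concrete, here is a minimal sketch of a reproducible index split, using only the Python standard library. The function name `split_indices` and its parameters `val_fraction` and `seed` are hypothetical and chosen for illustration; this is not necessarily how main_dino.py implements its split. In a PyTorch pipeline, `torch.utils.data.random_split` or `torch.utils.data.Subset` would typically play the same role.

```python
import random


def split_indices(n_samples, val_fraction=0.1, seed=0):
    """Return (train_idx, val_idx): two disjoint, sorted index lists.

    Hypothetical helper for illustration only. The held-out indices are
    never shown to the model during self-supervised training and can be
    embedded afterwards for downstream analyses such as a UMAP plot.
    """
    if not 0.0 <= val_fraction <= 1.0:
        raise ValueError("val_fraction must be between 0 and 1")

    indices = list(range(n_samples))
    # A dedicated, seeded generator keeps the split identical across runs,
    # so the same images stay held out when the analysis is repeated.
    random.Random(seed).shuffle(indices)

    n_val = int(round(n_samples * val_fraction))
    val_idx = sorted(indices[:n_val])
    train_idx = sorted(indices[n_val:])
    return train_idx, val_idx


# Example: hold out 10% of 1000 images.
train_idx, val_idx = split_indices(1000, val_fraction=0.1, seed=42)
```

Fixing the seed matters here: if the split changed between training and analysis, images the model had already seen could end up in the supposedly unseen subset.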
Regarding self.target_transform: we tried to implement the custom dataset the way it is usually done in PyTorch, so we kept it. Additionally, when label information is available, we use the labels in the downstream analyses after self-supervised training, and keeping this attribute lets us use the same dataloader for that.
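As a rough illustration of that convention, the sketch below shows a dataset class with optional `transform` and `target_transform` hooks. The class name `SketchMultichannelDataset`, the placeholder label `-1`, and the example labels are assumptions made for this example and are not taken from the repository's Multichannel_dataset. It is written in plain Python so it runs without dependencies; in practice such a class would subclass `torch.utils.data.Dataset`.

```python
class SketchMultichannelDataset:
    """Hypothetical map-style dataset with optional label handling."""

    def __init__(self, images, labels=None, transform=None, target_transform=None):
        if labels is not None and len(labels) != len(images):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels
        self.transform = transform                # applied to each image
        self.target_transform = target_transform  # applied to each label

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        if self.transform is not None:
            image = self.transform(image)

        if self.labels is None:
            # No labels available: return a placeholder (an assumption of
            # this sketch) that self-supervised training simply ignores.
            return image, -1

        label = self.labels[idx]
        if self.target_transform is not None:
            label = self.target_transform(label)
        return image, label


# Self-supervised training: no labels, target_transform is never called.
ssl_data = SketchMultichannelDataset(images=[[0.1, 0.2], [0.3, 0.4]])

# Downstream analysis: same class, now with labels mapped to integers.
class_to_index = {"control": 0, "treated": 1}
labelled_data = SketchMultichannelDataset(
    images=[[0.1, 0.2], [0.3, 0.4]],
    labels=["control", "treated"],
    target_transform=class_to_index.get,
)
```

With this layout the training loop can discard the second element of each sample, while downstream code reads it, so one dataset class serves both uses.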
I hope that clarified things!
