
corrected order of valid and train data arguments in DomainDataModule #82

Merged
merged 2 commits into from
Sep 19, 2024

Conversation

NicolasKuske
Collaborator

Here, the order of the validation and training arguments was mixed up. This error propagates all the way to the trainer, i.e. the training data and validation data get swapped!

Maybe adapt the code to use keyword arguments, so that the order does not matter?
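As a minimal sketch of why keyword arguments would prevent this class of bug (the constructor below is a made-up stand-in, not the real `DomainDataModule` signature):

```python
class DomainDataModule:
    # Hypothetical signature for illustration: train first, then val.
    def __init__(self, train_dataset, val_dataset):
        self.train_dataset = train_dataset
        self.val_dataset = val_dataset


# Positional call: if the caller passes val first by mistake,
# the splits are silently swapped and nothing errors out.
dm_buggy = DomainDataModule(list(range(256)), list(range(128)))

# Keyword call: argument order no longer matters, so the swap cannot happen.
dm_fixed = DomainDataModule(
    val_dataset=list(range(256)),
    train_dataset=list(range(128)),
)

print(len(dm_fixed.train_dataset))  # 128
print(len(dm_fixed.val_dataset))    # 256
```

With keyword arguments, a mismatch between the call site and the signature becomes impossible rather than merely unlikely.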

PS:

If you want to double-check the lengths of the training and validation data used in the trainer, add the following to the `DomainDataModule` class:

def get_train_length(self):
    return len(self.train_dataset)

def get_val_length(self):
    return len(self.val_dataset)

Then, before the unimodal module, add the following class:

from lightning.pytorch.callbacks import Callback


class DatasetLengthLogger(Callback):
    """Print the train/val dataset lengths once training finishes."""

    def on_train_end(self, trainer, pl_module):
        train_loader = trainer.datamodule.train_dataloader()
        val_loader = trainer.datamodule.val_dataloader()

        train_length = len(train_loader.dataset)
        val_length = len(val_loader.dataset)

        print(f"Training Data Length: {train_length}")
        print(f"Validation Data Length: {val_length}")

Finally, in the `train` function, before calling the trainer, add `dataset_length_logger = DatasetLengthLogger()`.

Then, in the trainer, add `dataset_length_logger` to the callbacks:

callbacks=[
            ModelCheckpoint(
                dirpath="checkpoints",
                filename=module_name,
                monitor="val_loss",
                mode="min",
                save_top_k=1,
            ),
            dataset_length_logger
        ],
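The full snippet above needs an actual Lightning run, but the callback's logic can be sanity-checked in isolation with stub objects standing in for the trainer and datamodule (the stubs below are made up purely for this check):

```python
from types import SimpleNamespace


class DatasetLengthLogger:
    # Same body as the Lightning callback above, minus the Callback base class.
    def on_train_end(self, trainer, pl_module):
        train_loader = trainer.datamodule.train_dataloader()
        val_loader = trainer.datamodule.val_dataloader()
        print(f"Training Data Length: {len(train_loader.dataset)}")
        print(f"Validation Data Length: {len(val_loader.dataset)}")


# Stub datamodule whose loaders expose a .dataset attribute, like a DataLoader.
stub_dm = SimpleNamespace(
    train_dataloader=lambda: SimpleNamespace(dataset=list(range(128))),
    val_dataloader=lambda: SimpleNamespace(dataset=list(range(256))),
)
stub_trainer = SimpleNamespace(datamodule=stub_dm)

DatasetLengthLogger().on_train_end(stub_trainer, pl_module=None)
# Prints:
# Training Data Length: 128
# Validation Data Length: 256
```

This reproduces the same printed output as the real run, so the callback itself can be verified without training anything.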

Result:

Training Data Length: 128
Validation Data Length: 256

NicolasKuske changed the title from "corrected order of valid and train data in DomainDataModule" to "corrected order of valid and train data arguments in DomainDataModule" on May 19, 2024
@bdvllrs (Collaborator) left a comment
Thanks! Can you update the code in the examples as well?

Also, GWDataModule uses the same order (val, then train); can you change it too, so it is consistent (always train first)?
(You will also have to change the code for loading the model.)

@bdvllrs bdvllrs merged commit 48dc6ca into main Sep 19, 2024
1 check passed
@bdvllrs bdvllrs deleted the NicolasKuske-patch-6 branch September 19, 2024 09:52