Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate Image Training Endpoint to Django #1011

Closed
wants to merge 22 commits into from
Closed

Conversation

codingwithsurya
Copy link
Contributor

@codingwithsurya codingwithsurya commented Oct 7, 2023

Migrate Image Training Endpoint to Django

Github Issue Number Here: Feature#993

What user problem are we solving?

With the ongoing migration from Flask Blueprint to Django for better scalability and maintainability, we need to ensure that the training endpoints for image datasets continue to function as expected. Users need a reliable endpoint to submit image training requests and this migration aims to improve the overall infrastructure.

What solution does this PR provide?

This PR introduces the /image route within the Django framework, replacing the old Flask Blueprint route. The new route handles image training requests, particularly focusing on classification tasks as regression is not supported. The structure of this endpoint is mirrored from the specified reference, with adjustments made to fit the Django framework. The old image train code served as inspiration for ensuring that the necessary features and behaviors are retained.

Testing Methodology

Testing yet to be done but to verify the functionality and integrity of the new /image route, but when testing begins, we can submit training requests using torch built-in datasets (MNIST, FashionMNIST) to ensure that models train as expected and the endpoint returns accurate results.

Also use postman for testing .

Any other considerations

@codingwithsurya codingwithsurya added the backend backend tasks label Oct 7, 2023
@codingwithsurya codingwithsurya self-assigned this Oct 7, 2023
training/training/core/dataset.py Show resolved Hide resolved
training/training/core/dataset.py Outdated Show resolved Hide resolved
training/training/routes/image/image.py Show resolved Hide resolved
training/training/routes/image/image.py Show resolved Hide resolved
training/training/routes/image/image.py Outdated Show resolved Hide resolved
training/training/routes/image/image.py Show resolved Hide resolved
training/training/routes/image/image.py Show resolved Hide resolved
training/training/routes/image/schemas.py Show resolved Hide resolved
@codingwithsurya codingwithsurya linked an issue Oct 7, 2023 that may be closed by this pull request
@codingwithsurya
Copy link
Contributor Author

codingwithsurya commented Oct 7, 2023

@karkir0003 @dwu359

took a stab at this. a couple questions i had was.

  1. What should the correct path be in ImageDefaultDatasetCreator constructor (dataset.py)? Here is the code I am referring to:

self.train_set = datasets.__dict__[dataset_name](root='./backend/image_data_uploads', train=True, download=True, transform=self.train_transform)

self.test_set = datasets.__dict__[dataset_name](root='./backend/image_data_uploads', train=False, download=True, transform=self.test_transform)

With transition to django, is the path for data uploads still gonna be in backend/data_uploads?

  1. Does my image.py look good in /routes/image? Is there any particular i have to add for image training?

  2. Any additional schemas for image classification? I think i got the right ones.

If everything looks good here (no red flags in code), i can begin testing with our default datsets to make sure our trainspace flow is working as intended.

@karkir0003
Copy link
Member

for image data uploads, please dont use backend/ dir.

@codingwithsurya
Copy link
Contributor Author

for image data uploads, please dont use backend/ dir.

sure, then what directory do i use? I dont see a similar directory in /training

Copy link
Member

@karkir0003 karkir0003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left comments

training/training/core/dataset.py Show resolved Hide resolved
training/training/core/dataset.py Outdated Show resolved Hide resolved
training/training/core/dataset.py Outdated Show resolved Hide resolved
training/training/core/dataset.py Show resolved Hide resolved
training/training/core/dataset.py Outdated Show resolved Hide resolved
training/training/routes/image/image.py Show resolved Hide resolved
training/training/routes/image/image.py Show resolved Hide resolved
training/training/routes/image/image.py Outdated Show resolved Hide resolved
Copy link
Member

@karkir0003 karkir0003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more comments @codingwithsurya

training/training/core/dataset.py Outdated Show resolved Hide resolved
@karkir0003
Copy link
Member

i think u can test the endpoint out @codingwithsurya

@karkir0003
Copy link
Member

for image data uploads, please dont use backend/ dir.

sure, then what directory do i use? I dont see a similar directory in /training

should be clarified

@codingwithsurya
Copy link
Contributor Author

for image data uploads, please dont use backend/ dir.

sure, then what directory do i use? I dont see a similar directory in /training

should be clarified

yea looks good. i just commited what we talked about. and i'll test on colab as well soon when i get a chance.

)
for epoch_result in trainer:
print(epoch_result)
print(trainer.labels_last_epoch, trainer.y_pred_last_epoch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these print statements necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saw this in the original tabular.py as well. if they're not necessary, i can remove them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@farisdurrani @karkir0003 any clarification on what to do here. Are all the print statements necessary in grand scheme of things?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we resolve this @codingwithsurya

Copy link
Member

@karkir0003 karkir0003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added minor comments on static typing python

@DSGT-DLP DSGT-DLP deleted a comment from karkir0003 Oct 13, 2023
@codingwithsurya
Copy link
Contributor Author

added minor comments on static typing python

addressed all these

@codingwithsurya
Copy link
Contributor Author

codingwithsurya commented Oct 13, 2023

TO-DO: find viable nnSequential PyTorch MNIST Architecture for testing

@codingwithsurya
Copy link
Contributor Author

TODO: Cross-Entropy Loss not computing correctly. Figure out correct param for FLATTEN in current architecture.

Otherwise, we can make our own

for module in my_model.model.modules()
if not isinstance(module, nn.Sequential)
] == input_list
import pytest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "pytest" could not be resolved (reportMissingImports)

@@ -21,0 +1,20 @@
import pytest
import torch.nn as nn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "torch.nn" could not be resolved (reportMissingImports)

)
def test_parse_user_architecture(user_model, expected):
assert [i == j for i, j in zip(parse_deep_user_architecture(user_model), expected)]
import pytest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "pytest" could not be resolved (reportMissingImports)

@@ -30,0 +1,29 @@
import pytest
import torch.nn as nn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "torch.nn" could not be resolved (reportMissingImports)

loss_function_name = "BCELOSS"
computed_loss = compute_loss(loss_function_name, output, labels)
assert pytest.approx(expected_number) == computed_loss
import pytest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "pytest" could not be resolved (reportMissingImports)

@@ -1,8 +1,11 @@
from typing import Any, Callable
from ninja import Schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "ninja" could not be resolved (reportMissingImports)

for epoch_result in trainer:
print(epoch_result)
from typing import Literal, Optional
from django.http import HttpRequest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "django.http" could not be resolved (reportMissingImports)

@@ -66,0 +1,65 @@
from typing import Literal, Optional
from django.http import HttpRequest
from ninja import Router, Schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "ninja" could not be resolved (reportMissingImports)

from training.core.criterion import getCriterionHandler
from training.core.dataset import SklearnDatasetCreator
from training.core.dl_model import DLModel
from torch.utils.data import DataLoader
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Import "torch.utils.data" could not be resolved (reportMissingImports)

drop_last=True,
)

model = DLModel.fromLayerParamsList(tabularParams.user_arch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [pyright] reported by reviewdog 🐶
Argument of type "list[LayerParams]" cannot be assigned to parameter "layer_params_list" of type "list[LayerParams]" in function "fromLayerParamsList"
  "builtins.list" is incompatible with "builtins.list"
    Type parameter "_T@list" is invariant, but "training.training.routes.tabular.schemas.LayerParams" is not the same as "training.training.routes.image.schemas.LayerParams"
    Consider switching from "list" to "Sequence" which is covariant (reportGeneralTypeIssues)

@codingwithsurya
Copy link
Contributor Author

codingwithsurya commented Nov 19, 2023

Update: MNIST dataset testing works!

TODO: Train/test transforms as parameter to endpoint. CIFAR10 Testing.

@codingwithsurya codingwithsurya marked this pull request as ready for review November 19, 2023 20:48
@codingwithsurya codingwithsurya requested a review from a team as a code owner November 19, 2023 20:48
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 6 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend backend tasks
Development

Successfully merging this pull request may close these issues.

[FEATURE]: Migrate Image Training Endpoint
3 participants