File Sizes cleanup #1752

Open: wants to merge 3 commits into base: develop

ihsaan-ullah (Collaborator) commented Feb 18, 2025

@ mention of reviewers

@Didayolo

A brief description of the purpose of the changes contained in this PR.

File sizes were not handled uniformly across the platform: in some places the size was stored in bytes and in others in KiB. This is now fixed: we store bytes in the db and use KB/MB/GB instead of KiB/MiB/GiB. Different size formatters were also used in different places. Now there is a single formatter named pretty_bytes, declared in both JavaScript and Python.
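To illustrate the decimal-unit formatting described above, here is a minimal Python sketch. Only the name `pretty_bytes` comes from this PR. The signature, the rounding, the unit list, and the assumption that 1 KB means 1000 bytes are illustrative and may differ from the actual implementation.

```python
def pretty_bytes(num_bytes, decimals=1):
    """Format a byte count with decimal units, assuming 1 KB = 1000 bytes.

    Hypothetical sketch for illustration; not the implementation from this PR.
    """
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    unit_index = 0
    # Divide by 1000 until the value fits the current unit
    while abs(size) >= 1000 and unit_index < len(units) - 1:
        size /= 1000
        unit_index += 1
    if unit_index == 0:
        # Plain byte counts are shown without a fractional part
        return f"{int(size)} B"
    return f"{size:.{decimals}f} {units[unit_index]}"
```

For example, `pretty_bytes(1500)` gives `"1.5 KB"` under these assumptions, whereas a binary (KiB) formatter would report roughly 1.46 KiB for the same value.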

Issues this PR resolves

Important Note

I have left some size unit conversions untouched in the following files because it is unclear what is going on in these files. There is currently no data in the analytics to compare against the code, but once analytics start working I will check these.

Important Todos for deployment:

This PR contains some critical changes, so before deployment we should run the following 3 blocks of code to get the last ids of Data, Submission and SubmissionDetails.

# Get the maximum ID for Data
from datasets.models import Data
latest_id_data = Data.objects.latest('id').id
print("Data Last ID: ", latest_id_data)

# Get the maximum ID for Submission
from competitions.models import Submission
latest_id_submission = Submission.objects.latest('id').id
print("Submission Last ID: ", latest_id_submission)

# Get the maximum ID for SubmissionDetails
from competitions.models import SubmissionDetails
latest_id_submission_detail = SubmissionDetails.objects.latest('id').id
print("SubmissionDetail Last ID: ", latest_id_submission_detail)

Once we have the latest ids, we should deploy and then run the 3 blocks of code below to fix the sizes, i.e. to convert all KiB values to bytes so that everything is consistent. For files uploaded after the deployment, sizes are saved in bytes automatically, which is why the following code needs to run for older files only.

# Run the conversion only for records with id <= latest_id
# <latest_data_id> is a placeholder: replace it with the "Data Last ID" printed before deployment
from datasets.models import Data
for data in Data.objects.filter(id__lte=<latest_data_id>):
    if data.file_size:
        data.file_size = data.file_size * 1024  # Convert from KiB to bytes
        data.save()

# Run the conversion only for records with id <= latest_id
# <latest_sub_id> is a placeholder: replace it with the "Submission Last ID" printed before deployment
from competitions.models import Submission
for sub in Submission.objects.filter(id__lte=<latest_sub_id>):
    updated = False  # Track if any field is updated
    if sub.file_size:
        sub.file_size = sub.file_size * 1024  # Convert from KiB to bytes
        updated = True

    if sub.prediction_result_file_size:
        sub.prediction_result_file_size = sub.prediction_result_file_size * 1024  # Convert from KiB to bytes
        updated = True

    if sub.scoring_result_file_size:
        sub.scoring_result_file_size = sub.scoring_result_file_size * 1024  # Convert from KiB to bytes
        updated = True

    if sub.detailed_result_file_size:
        sub.detailed_result_file_size = sub.detailed_result_file_size * 1024  # Convert from KiB to bytes
        updated = True

    # Save only when at least one size field was converted
    if updated:
        sub.save()

# Run the conversion only for records with id <= latest_id
# <latest_sub_det_id> is a placeholder: replace it with the "SubmissionDetail Last ID" printed before deployment
from competitions.models import SubmissionDetails
for sub_det in SubmissionDetails.objects.filter(id__lte=<latest_sub_det_id>):
    if sub_det.file_size:
        sub_det.file_size = sub_det.file_size * 1024  # Convert from KiB to bytes
        sub_det.save()
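The cutoff logic used by the conversion scripts above can be sketched without a database. In this minimal example, plain dicts stand in for model rows, and the helper name `convert_legacy_sizes` and its parameters are hypothetical; the sketch only shows why records with an id above the recorded last id must be left alone.

```python
KIB = 1024  # bytes per KiB


def convert_legacy_sizes(records, last_legacy_id, size_fields=("file_size",)):
    """Multiply KiB sizes by 1024 for records with id <= last_legacy_id.

    Records are plain dicts standing in for model rows (an assumption made
    for illustration). Returns the number of records that were changed.
    """
    changed = 0
    for record in records:
        if record["id"] > last_legacy_id:
            continue  # created after deployment, already stored in bytes
        updated = False
        for field in size_fields:
            value = record.get(field)
            if value:  # skip missing, None and 0 sizes
                record[field] = value * KIB
                updated = True
        if updated:
            changed += 1
    return changed


rows = [
    {"id": 1, "file_size": 2},     # legacy row, 2 KiB
    {"id": 2, "file_size": None},  # legacy row, no size recorded
    {"id": 3, "file_size": 4096},  # new row, already in bytes
]
changed = convert_legacy_sizes(rows, last_legacy_id=2)
```

Note that the conversion is not idempotent: running it a second time over the same rows would multiply the sizes by 1024 again, so each block should be run exactly once.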

Checklist

  • Code review by me
  • Hand tested by me
  • I'm proud of my work
  • Code review by reviewer
  • Hand tested by reviewer
  • CircleCi tests are passing
  • Ready to merge
