
Quota related issues #1748

Open
3 of 5 tasks
ihsaan-ullah opened this issue Feb 10, 2025 · 14 comments
Labels
Bug Enhancement Feature suggestions and improvements P1 High priority, but NOT a current blocker

Comments

@ihsaan-ullah
Collaborator

ihsaan-ullah commented Feb 10, 2025

TODOS:

  • Reset User Quota from bytes to GB
    Solved by User quota is updated to GB from Bytes #1749

  • Find the use of GiB/MiB/KiB in storage analytics and other places and replace with GB/MB/KB
    Solved by: File Sizes cleanup #1752

  • Size of files is formatted at different places with different functions, use only one size formatter everywhere
    Solved by: File Sizes cleanup #1752

  • Check why the submission size in used quota doubles when submission finishes
    Maybe because of submission file saved in the prediction output

  • Find and fix the NaN file size issue

For more details check the comments section of this PR: #1738
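Regarding the TODO about using a single size formatter everywhere, a minimal sketch of what such a shared helper could look like, assuming sizes are handled in bytes and that decimal units (KB/MB/GB) are wanted rather than KiB/MiB/GiB. The name format_file_size matches the function mentioned later in this thread, but the exact behaviour here is an assumption:

```python
def format_file_size(num_bytes):
    """Format a raw byte count with decimal units (KB/MB/GB), not KiB/MiB/GiB."""
    if num_bytes is None or num_bytes < 0:
        return "NA"  # explicit sentinel for unknown sizes instead of NaN
    units = ["Bytes", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1000 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1000
```

Returning "NA" for missing or negative sizes would also make the NaN issue below impossible to hit in the UI.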

@Didayolo Didayolo added Bug Enhancement Feature suggestions and improvements labels Feb 10, 2025
@ObadaS
Collaborator

ObadaS commented Feb 11, 2025

Concerning the quota units, the value shown in the Django Admin Interface is in bits (or bytes, not sure), making it hard to tell at a quick glance how much quota is actually assigned.

[Image: screenshot of the quota field in the Django admin]
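If the raw byte value has to stay in the database, one option is to convert only for display. A hedged sketch of such a conversion helper (the name quota_display_gb and the decimal 10**9 divisor are assumptions; wiring it into list_display on the relevant ModelAdmin is the usual Django pattern):

```python
def quota_display_gb(quota_bytes):
    """Render a raw byte quota as a GB string for quick reading in the admin."""
    return f"{quota_bytes / 10**9:.2f} GB"

# e.g. a 15 GB quota stored as 15_000_000_000 bytes renders as "15.00 GB"
```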

@Didayolo
Member

NaN values for size on production server for old files (older than ~30 days)

[Image: screenshot of NaN file sizes on the production server]

@Didayolo Didayolo added the P1 High priority, but NOT a current blocker label Feb 11, 2025
@ihsaan-ullah
Collaborator Author

ihsaan-ullah commented Feb 11, 2025

Check the format_file_size function in src/static/riot/submissions/resource_submissions.tag

Edit:

No need to check this for the NaN size issue; the cause is explained in the comments below.

@ihsaan-ullah
Collaborator Author

ihsaan-ullah commented Feb 12, 2025

@ObadaS I read somewhere that MinIO file metadata can expire, and file_size is metadata that may be affected if there is an expiry rule. If this is true, it may be the cause of the NaN sizes.

Edit:

This was not the problem; the actual cause is explained in the comments below.

@ObadaS
Collaborator

ObadaS commented Feb 12, 2025

Interesting, I will check that tomorrow.
However, there is no NaN size problem on the codabench-test website, which is connected to a different MinIO instance that might have a different configuration.

Codalab.lisn.upsaclay.fr also shows dataset sizes and does not seem to be affected by the bug, even though both are connected to the same MinIO, which points to either a bucket problem on the MinIO side or a problem in the platform's code.

@ihsaan-ullah
Collaborator Author

The only problem right now is reproducibility. If we can reproduce the bug locally, then solving it will not be a problem.

I am fairly convinced that there is no problem in the code, because we do see sizes for some files, but I will investigate the code a bit more to be sure.

@ihsaan-ullah
Collaborator Author

We found that the file sizes are emptied by the reset_computed_storage_analytics function. Rerunning the storage analytics should fix the NaN sizes.

@Didayolo
Member

We found that the file sizes are emptied by reset_computed_storage_analytics function. Rerunning storage analytics should fix the NA sizes

I started the storage analytics on production.

What starts the reset_computed_storage_analytics in the first place?

@ihsaan-ullah
Collaborator Author

I confirmed using the code below that accessing the file size from MinIO takes longer than accessing file_size from the DB. We can run it on codabench-test to be sure.

Accessing from DB

import time
from datasets.models import Data

start_time = time.time()
sizes = sum(item.file_size for item in Data.objects.all())  
end_time = time.time()

execution_time = end_time - start_time
print(f"Execution Time: {execution_time:.4f} seconds")

Accessing from Minio

import time
from datasets.models import Data

start_time = time.time()
sizes = sum(item.data_file.size for item in Data.objects.all())  
end_time = time.time()

execution_time = end_time - start_time
print(f"Execution Time: {execution_time:.4f} seconds")
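As a side note, when only the total is needed, the per-object loop over file_size can also be pushed into the database entirely. A sketch under the same assumptions (same Data model, run in the Django shell), using Django's aggregate/Sum:

```python
from django.db.models import Sum
from datasets.models import Data

# Let the database compute the total file_size in a single query,
# instead of iterating over every Data row in Python.
total = Data.objects.aggregate(total_size=Sum("file_size"))["total_size"]
print(f"Total file_size: {total}")
```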

@Didayolo
Member

Didayolo commented Feb 18, 2025

We need to separate the computing of the file sizes from the analytics task:

from decimal import Decimal
from django.db.models import Q
from datasets.models import Data

def create_storage_analytics_snapshot():

    # Measure all files with unset size
    for dataset in Data.objects.filter(Q(file_size__isnull=True) | Q(file_size__lt=0)):
        try:
            dataset.file_size = Decimal(
                dataset.data_file.size / 1024
            )  # file_size is in KiB
        except Exception:
            dataset.file_size = Decimal(-1)
        finally:
            dataset.save()

@Didayolo
Member

Didayolo commented Feb 18, 2025

New version of the script to compute file sizes:

from datasets.models import Data
from decimal import Decimal

print("Total objects: ", Data.objects.all().count())

datasets = Data.objects.all().order_by("id")
print("Processing now: ", datasets.count())

for dataset in datasets:
    if dataset.data_file and hasattr(dataset.data_file, 'size'):
        try:
            file_size = dataset.data_file.size
            if file_size is None or file_size <= 0:
                file_size = Decimal(0)
            else:
                file_size = Decimal(file_size) / 1024
            dataset.file_size = file_size
            dataset.save()
        except Exception as e:
            print(f"Skipping dataset ID {dataset.id} - Error: {e}")
    else:
        print(f"File size problem, Data ID {dataset.id}")
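Since the script above issues one UPDATE per dataset, a variant using bulk_update could cut the number of database round trips. A sketch under the same assumptions (same Data model, file_size stored in KiB, Django >= 2.2), to be run in the Django shell:

```python
from decimal import Decimal
from datasets.models import Data

updated = []
for dataset in Data.objects.all().order_by("id"):
    try:
        size = dataset.data_file.size
    except Exception as e:
        print(f"Skipping dataset ID {dataset.id} - Error: {e}")
        continue
    # file_size is stored in KiB; treat missing/negative sizes as 0
    dataset.file_size = Decimal(size) / 1024 if size and size > 0 else Decimal(0)
    updated.append(dataset)

# One UPDATE statement per batch instead of one per row
Data.objects.bulk_update(updated, ["file_size"], batch_size=500)
```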

@Didayolo
Member

The file sizes are back on production.

@ihsaan-ullah
Collaborator Author

@Didayolo
Member

List of data files that failed:
Errors.txt
