Procedure for the clean up of the database #2844

Open
chills-eclipse opened this issue Aug 14, 2024 · 12 comments
@chills-eclipse

Hi John. We need a procedure for cleaning up the database from time to time, to prevent its size from becoming uncontrollable. Can you please provide the steps and tell us what we can safely clean? Thanks.

@denisroy

@kineticsquid
Contributor

@chills-eclipse Yep, something we need to get back to. @amvanbaren Thoughts?

@amvanbaren
Contributor

@kineticsquid You want to move forward on eclipse/openvsx#888?

@kineticsquid
Contributor

@amvanbaren MS doesn't limit extension versions. I'd like to see what other options we have.

The biggest offender is the file table, and there appear to be two API calls that could reference the table:

  • /api/.../logo/...
  • /api/.../file/...

Are there other API calls that reference the table?

I'm wondering whether, if we then consider where these API calls are coming from, we might be able to limit the size of the table. Where are calls to this API originating?

  • open-vsx.org UI
  • IDEs like Theia
  • npx ovsx commands like get

In this GitLab issue, https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/issues/4797, @denisroy provided a dump of the access logs and I took a look at the use of the API.
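For anyone wanting to repeat that kind of analysis, here is a minimal sketch in Python of tallying access-log requests by endpoint family and client. It assumes a combined-log-style line in which the request appears as a quoted "METHOD PATH PROTOCOL" string and the user agent is the last quoted field. The real log format in the helpdesk dump may differ, and the function names are hypothetical. The elided path segments of the two endpoints are matched generically rather than guessed.

```python
import re
from collections import Counter

# Assumed line shape: ... "GET /some/path HTTP/1.1" ... "user agent"
LOG_LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*" .*"(?P<agent>[^"]*)"\s*$'
)

# The two endpoint families named above; anything between /api/ and the
# /logo/ or /file/ segment is accepted without assuming its structure.
ENDPOINT_RES = {
    "logo": re.compile(r"^/api/.+/logo/"),
    "file": re.compile(r"^/api/.+/file/"),
}


def classify(path):
    """Return 'logo', 'file', or 'other' for a request path."""
    for name, pattern in ENDPOINT_RES.items():
        if pattern.search(path):
            return name
    return "other"


def tally(lines):
    """Count requests per (endpoint family, user agent); skip unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE_RE.search(line)
        if match:
            counts[(classify(match.group("path")), match.group("agent"))] += 1
    return counts
```

Running `tally(open("access.log"))` on a dump (the file name is a placeholder) and printing `counts.most_common(20)` would give a first impression of which clients hit the file-backed endpoints most often.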

@kineticsquid
Contributor

@amvanbaren Now that we've completed eclipse/openvsx#888, I'm thinking we need a script to scrub and reduce the size of the file resources table. For safety's sake, we should probably take a backup first. Given that, I'm thinking we need a read-only mode, or at least a mode that disables publishing while we work on the table. Thoughts?

@amvanbaren
Contributor

Yes, now that eclipse/openvsx#1045 has been completed we can remove resource files from the file_resource table. This can be done through a migration similar to https://github.com/eclipse/openvsx/blob/master/server/src/main/java/org/eclipse/openvsx/migration/SetPreReleaseJobRequestHandler.java

@kineticsquid
Contributor

@amvanbaren I see the code, but I'm not sure I understand how the pieces fit together. Are you saying that we run this for each extension? If so, does that remove the need for R/O mode or to pause publication?

@kineticsquid
Contributor

@tfroment FYI

@amvanbaren
Contributor

@kineticsquid It's an example of a migration job that runs in the background. It is for when extra processing is needed that can't be done in a SQL migration script. We can run a similar migration job for each FileResource that is of type resource where it first deletes the file in storage and then deletes the FileResource entity in the database.

Yes, there's no need for a read-only mode or to pause publication.
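As a rough illustration of the per-FileResource job described here, the following Python sketch models the two steps against in-memory stand-ins. It is not the actual OpenVSX migration code, which is Java. The `storage_key` field, the `InMemoryStorage` class, the function names, and the `"download"` type used as a contrast are all hypothetical; only the `resource` type and the delete order come from the comment above.

```python
from dataclasses import dataclass


@dataclass
class FileResource:
    id: int
    type: str         # "resource" is the type targeted for cleanup
    storage_key: str  # hypothetical: location of the file in storage


class InMemoryStorage:
    """Stand-in for the real file storage backend."""

    def __init__(self, keys):
        self.files = set(keys)

    def delete(self, key):
        # Deleting a missing file is a no-op, so a retried job is harmless.
        self.files.discard(key)


def schedule_cleanup(db):
    """Return the ids of all rows that should get a cleanup job."""
    return [row.id for row in db.values() if row.type == "resource"]


def remove_resource(db, storage, resource_id):
    """Handle one job: delete the stored file first, then the database row."""
    row = db.get(resource_id)
    if row is None or row.type != "resource":
        return False  # already cleaned up, or not a file we should touch
    storage.delete(row.storage_key)  # step 1: file in storage
    del db[resource_id]              # step 2: entity in the database
    return True
```

One plausible benefit of deleting from storage first is that a job failing between the two steps leaves the database row in place, so a retry can still find the file to remove. Because each job touches only one row, new publications can continue while the jobs run.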

@kineticsquid
Contributor

@amvanbaren Understood. Do we need to first change how we process published extensions to not create entries for all of these files, or have we already made that improvement?

@amvanbaren
Contributor

@kineticsquid In eclipse/openvsx#1100 resources are no longer processed and existing resources are deleted.

@kineticsquid
Contributor

@amvanbaren Got it. Do we have any more work to do on this one then after eclipse/openvsx#1100 is deployed?
