feat: support vacuum leaked table data #17022
This doesn't seem like a safe operation. Could it cause newly created tables to be cleaned up incorrectly?
Creating a table writes its metadata first; at that point there are no corresponding files in storage. The table's directory is created only after a DML operation runs. So if a file belongs to a valid table, there must be a corresponding record in the metadata, and files without a corresponding metadata record can be safely deleted.
I noticed that the list path comes before the list table, so that should be fine.
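The ordering argument in this thread can be sketched as follows. This is an illustrative Python model, not the Rust code in this PR: `find_leaked_table_dirs` and the two listing callbacks are hypothetical names, and table directories are assumed to be named by table ID, as in the example path in the summary. The point it shows is that listing storage paths before listing tables means a table created in between is either missing from the path snapshot or present in the metadata, so it is never flagged.

```python
def find_leaked_table_dirs(list_storage_dirs, list_metadata_table_ids):
    """Return directory names that exist in storage but have no metadata record.

    Both arguments are hypothetical callables returning iterables of table IDs
    (as strings). Storage must be listed BEFORE metadata.
    """
    # Step 1: snapshot the storage listing first.
    storage_ids = set(list_storage_dirs())
    # Step 2: only afterwards read the table IDs recorded in metadata.
    # Any table created after step 1 is absent from storage_ids, and any
    # table whose directory was already in the snapshot wrote its metadata
    # before its first DML, so it appears here.
    known_ids = set(list_metadata_table_ids())
    return sorted(storage_ids - known_ids)


# Hypothetical state: 121467 has data but no metadata record (leaked),
# 121470 is a valid table.
storage = {"121467", "121470"}
metadata = {"121470"}


def list_storage():
    snapshot = set(storage)
    # Simulate a table being created and written to right after the
    # path listing: metadata first, then the data directory.
    metadata.add("121480")
    storage.add("121480")
    return snapshot


leaked = find_leaked_table_dirs(list_storage, lambda: set(metadata))
print(leaked)
```

Reversing the two steps in this model would make the concurrently created table look leaked, which is the unsafe case the review comment asks about.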
This reverts commit 3a9f404.
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
A leaked table is table data that is not recorded in the metadata but still occupies storage space, possibly because of a bug in the vacuum process. This PR adds support for vacuum drop table [from database] force, which cleans up leaked tables.
For example:
Construct leaked data
Execute vacuum without force
leaked data is not vacuumed:
sky@hp:~/databend$ ls .databend/stateless_test_data/121460/
121467
Execute vacuum with force
leaked data is vacuumed:
Tests
Type of change