
Delete objects by path from s3 object storage #87

Merged
MedvedewEM merged 1 commit into main from object_storage_delete_path
Dec 13, 2023

Conversation

MedvedewEM
Contributor

Sometimes we need to delete data from S3 object storage manually, when the replicas holding the local metadata have already been deleted and no backups refer to the data.
In that case there is no other way to clean up S3 than to explicitly iterate over all objects under a prefix and delete them.

Now it works like:

# chadmin object-storage list
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/ees/ewcpleokxjcsldqkgtiqpxbvpwlsp", "size": 1, "last_modified": "2023-12-12 16:49:00.117000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/metadata_version.txt"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/jeu/patimrwqaqnaojatfzjmzgesojlur", "size": 100, "last_modified": "2023-12-12 16:49:00.113000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/columns.txt"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/lzh/sjydftjppnoghiqdmelftexlubswr", "size": 4, "last_modified": "2023-12-12 16:49:00.120000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/minmax_pickup_date.idx"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/tuy/rootgostdyrqdkcmgshbrduzpdzfn", "size": 88, "last_modified": "2023-12-12 16:49:00.112000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/data.bin"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/tvl/oisrcewacvpvkwhrqvvyhhitbqxav", "size": 62, "last_modified": "2023-12-12 16:49:00.078000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/data.cmrk3"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/vms/ljpfduakpcwyvbzcihluncyvcvkzn", "size": 4, "last_modified": "2023-12-12 16:49:00.078000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/partition.dat"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/vwa/qprmeaqtxhskarsmupbnhpfbkwgmz", "size": 10, "last_modified": "2023-12-12 16:49:00.078000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/default_compression_codec.txt"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/wmw/upddckttflhqvcdictycubzdutdlo", "size": 331, "last_modified": "2023-12-12 16:49:00.083000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/checksums.txt"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/wpg/xujlxchstrwnmnhzfnwemtsceuety", "size": 235, "last_modified": "2023-12-12 16:49:00.108000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/serialization.json"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/xbo/dsiqwlsxghdlsxbuahcnnblvwsksc", "size": 42, "last_modified": "2023-12-12 16:49:00.119000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/primary.cidx"]}
{"object": {"key": "cloud_storage/e4u21vnmivo70tppvttj/shard1/xfg/impnnnrxtqkskgkweaoarvlvadpsn", "size": 1, "last_modified": "2023-12-12 16:49:00.110000+00:00"}, "files": ["/var/lib/clickhouse/disks/object_storage/store/9a0/9a0182da-9530-44f5-9c70-6dd595e31f9d/202312_2_2_0/count.txt"]}

# chadmin object-storage clean --prefix=cloud_storage/e4u21vnmivo70tppvttj/shard1
Deleted 11 objects from bucket [cloud-storage-e4u21vnmivo70tppvttj]
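
The prefix-based cleanup shown above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `delete_by_prefix` and the `batch_size` parameter are hypothetical, and `bucket` is assumed to behave like a boto3 S3 `Bucket` resource (`objects.filter(Prefix=...)` and `delete_objects(Delete=...)`). Keys are sent in batches because S3 accepts at most 1000 keys per `DeleteObjects` request.

```python
from typing import Any, Dict, List

# S3 DeleteObjects accepts at most 1000 keys per request.
S3_DELETE_BATCH_LIMIT = 1000


def delete_by_prefix(bucket: Any, prefix: str, batch_size: int = S3_DELETE_BATCH_LIMIT) -> int:
    """Delete every object whose key starts with `prefix`.

    `bucket` is assumed to expose the boto3 Bucket resource interface.
    Returns the number of keys submitted for deletion (hypothetical helper).
    """
    submitted = 0
    batch: List[Dict[str, str]] = []

    for obj in bucket.objects.filter(Prefix=prefix):
        batch.append({"Key": obj.key})
        if len(batch) == batch_size:
            # Quiet mode makes S3 report only the keys that failed;
            # this sketch does not inspect the response.
            bucket.delete_objects(Delete={"Objects": batch, "Quiet": True})
            submitted += len(batch)
            batch = []

    # Flush the last, possibly smaller, batch.
    if batch:
        bucket.delete_objects(Delete={"Objects": batch, "Quiet": True})
        submitted += len(batch)

    return submitted
```

With boto3 installed, a bucket object of the expected shape can be obtained with `boto3.resource("s3").Bucket(name)`. A production version would also check the `Errors` field of each `delete_objects` response.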

P.S. This PR also fixes the chunked utility method, because the current code does not work (it loops forever).
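
For reference, here is one common way to write a chunking helper that is guaranteed to terminate. The helper itself is not shown in this conversation, so this is only a sketch under assumptions and not the actual fix: the signature `chunked(iterable, n)` is a guess. The key point is that a single shared iterator is consumed by every `islice` call, so the loop ends as soon as an empty chunk is produced.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(iterable: Iterable[T], n: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `n` items from `iterable`."""
    if n < 1:
        raise ValueError("n must be at least 1")

    # Create the iterator once; each islice call then advances it.
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            # The source is exhausted, so stop instead of looping forever.
            return
        yield chunk
```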

@MedvedewEM MedvedewEM self-assigned this Dec 12, 2023
path_prefix = disk.prefix

for obj in bucket.objects.filter(Prefix=path_prefix + object_name_prefix):
    name: str = obj.key[len(path_prefix) :]

    if _is_ignored(name):
Member


The clean command ignores objects whose names match IGNORED_OBJECT_NAME_PREFIXES. But when removing data from S3 for deleted shards, I don't think we need any exceptions; we would rather delete everything under the specified prefix.

Contributor Author


Done, via a skip_ignoring parameter.

@MedvedewEM MedvedewEM force-pushed the object_storage_delete_path branch from 7fe8b66 to 68d0e27 Compare December 13, 2023 07:46
@MedvedewEM MedvedewEM force-pushed the object_storage_delete_path branch from 68d0e27 to 38e2ae7 Compare December 13, 2023 07:47
@MedvedewEM MedvedewEM merged commit 86cca4e into main Dec 13, 2023
18 checks passed
@MedvedewEM MedvedewEM deleted the object_storage_delete_path branch December 13, 2023 13:22