arki-check problems after odimh5 data deletion #279

Open
brancomat opened this issue Nov 18, 2021 · 3 comments

@brancomat
Member

I recently deleted some odimh5 from a dataset:

arki-query 'reftime:>=2021-11-04 14:00, <=2021-11-07' /arkivio/arkimet/dataset/odimSPC/ > /tmp/delete.md
arki-check --fix --remove /tmp/delete.md /arkivio/arkimet/dataset/odimSPC/

Now running arki-check with the --fix option apparently doesn't fix anything:

[arkimet@arkiope8 ~]$ arki-check -f /arkivio/arkimet/dataset/odimSPC/
odimSPC:2021/11-17.odimh5: item at offset 60 is wrongly ordered before item at offset 63
odimSPC:2021/11-17.odimh5: segment contains 4 file(s) that the index does now know about
odimSPC: check 22 files ok
[arkimet@arkiope8 ~]$ arki-check -f /arkivio/arkimet/dataset/odimSPC/
odimSPC:2021/11-17.odimh5: item at offset 60 is wrongly ordered before item at offset 63
odimSPC:2021/11-17.odimh5: segment contains 4 file(s) that the index does now know about
odimSPC: check 22 files ok

Since arkiope still has arkimet 1.33, I tried isolating the 11-17.odimh5 directory on a more recent server (arkimet 1.38). The output is different, but still nothing gets fixed:

[branco@kinotto arkitest]$ arki-check -f odimSPC/
odimSPC:2021/11-17.odimh5: removed from the index
odimSPC: check 0 files ok, 1 file removed from index
[branco@kinotto arkitest]$ arki-check -f odimSPC/
odimSPC:2021/11-17.odimh5: removed from the index
odimSPC: check 0 files ok, 1 file removed from index

The test data is about 4.2 GB. I'll add a link shortly; alternatively, it is available at [email protected]:/arkivio/arkimet/dataset/odimSPC/:2021/11-17.odimh5

@spanezz
Contributor

spanezz commented Nov 18, 2021

I'm attaching trimmed test data, unpackable with arki-testtar -x, which reproduces the issue: odimSPC.tar.gz

@spanezz
Contributor

spanezz commented Nov 22, 2021

In theory the issue is solved: the problem came from a missing .sequence file in the 11-17.odimh5 directory. Previously, the directory was inconsistently detected as a directory segment; I have now fixed this by ignoring directories that don't contain a .sequence file.

The idea is that in the future we might want to implement directory-based segments in some other format, and I don't want to claim any directory with the right name as a directory segment using the current format.

arki-check now runs fine on the test data, but the contents of a directory segment without a .sequence file will be ignored and not, for example, deleted by a repack. Ideally, there is no reason for a .sequence file to disappear; if it does, it can be considered dataset corruption to be fixed manually.
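
Since such directories are now silently ignored, it may help to list them before fixing things by hand. The following is only a rough sketch, not part of arkimet: the function name `find_dirs_missing_sequence` is hypothetical, and it assumes that directory segments are directories whose name ends in the format extension (here `.odimh5`) and that the `.sequence` file sits directly inside them.

```python
from pathlib import Path


def find_dirs_missing_sequence(dataset_root, suffix=".odimh5"):
    """Return directories under dataset_root whose name ends in `suffix`
    and that do not contain a `.sequence` file.

    Hypothetical helper: the layout (format extension on the directory name,
    `.sequence` directly inside) is an assumption based on this issue.
    """
    root = Path(dataset_root)
    missing = []
    # rglob matches both files and directories, so filter on is_dir()
    for path in sorted(root.rglob("*" + suffix)):
        if path.is_dir() and not (path / ".sequence").is_file():
            missing.append(path)
    return missing
```

Calling `find_dirs_missing_sequence("/arkivio/arkimet/dataset/odimSPC/")` would return the candidate directories to inspect manually; the sketch only reads the filesystem and changes nothing.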

I notice however that on the dataset at arkiope8 the .sequence file is present, and I'm a bit puzzled because I could swear that it didn't exist in the test tarball.

Can you try the fixed arkimet to see if it solves the issue?
