add option to disable checksum verification #153

einsteinx2 · 2024-03-17T15:54:47Z

closes #118

There is an earlier PR from a few years ago that addressed this but was never merged. This is an updated solution based on the current tip of main, with a simpler implementation and fewer changes.

Tested on my Debian 12 home NAS server and it behaves as expected.

The drive I was checking has over 16 TiB of usable space and over 500k files. Without checksumming (only checking names/sizes, then following up with first/last bytes), the run took all night. With checksumming on, it would still be running, most likely for days. The output without the checksum was still very useful, as I'll explain below.

To the maintainer(s), I think there is a real use case for this:

In my example, I have a NAS with dozens of TB of storage. Many of the files are very large (dozens to hundreds of GB). I've been doing some file reorganization, and my goal is just to get a short list of files that might be duplicates so I can process them later.

In my case, if these large files have the same name, size, and first/last bytes, I can be basically 100% sure they're the same file. It would take far longer to checksum every file at discovery time only to find out what I already know: they're the same.
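To illustrate the kind of cheap pre-filter described above, here is a minimal Python sketch that groups files by size plus their first and last bytes, without reading anything in between. It is not rdfind's code: the function names and the `EDGE_BYTES` sample size are hypothetical and chosen only for illustration.

```python
import os
from collections import defaultdict

EDGE_BYTES = 64  # hypothetical sample size, not taken from rdfind


def edge_signature(path, edge_bytes=EDGE_BYTES):
    """Return (size, first bytes, last bytes) without reading the whole file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(edge_bytes)
        # Seek to the final edge_bytes (or to the start, for short files).
        f.seek(max(0, size - edge_bytes))
        tail = f.read(edge_bytes)
    return size, head, tail


def candidate_duplicates(paths):
    """Group paths whose size and edge bytes match; keep groups of two or more."""
    groups = defaultdict(list)
    for path in paths:
        groups[edge_signature(path)].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

The cost per file is two small reads regardless of file size, which is why this stage stays fast on multi-hundred-GB files. The trade-off is that two files differing only somewhere in the middle still land in the same group, so the result is a list of candidates rather than proven duplicates.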

Later, if needed, I can manually checksum files in the results list, or even test/checksum random byte ranges in the files rather than the whole thing to save a ton of time. In my case (and I assume others' as well) I already know they're the same, so even that isn't needed.
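The random-byte-range idea could look roughly like the following Python sketch. It is an assumption-laden illustration, not part of this PR or of rdfind: `sampled_digest` and its `samples`, `chunk`, and `seed` parameters are hypothetical names. Offsets are derived from the seed and the file size, so two files of equal size are always sampled at the same positions.

```python
import hashlib
import os
import random


def sampled_digest(path, samples=16, chunk=1 << 20, seed=0):
    """Hash `samples` chunks at pseudo-random offsets instead of the whole file."""
    size = os.path.getsize(path)
    # Same seed and same size give the same offsets, so digests are comparable.
    rng = random.Random(f"{seed}:{size}")
    digest = hashlib.sha256()
    digest.update(str(size).encode())
    if size <= samples * chunk:
        # Small file: sampling saves nothing, so hash every chunk.
        offsets = range(0, size, chunk)
    else:
        offsets = sorted(rng.randrange(0, size - chunk + 1) for _ in range(samples))
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            digest.update(f.read(chunk))
    return digest.hexdigest()
```

With the defaults this reads at most 16 MiB per large file. A mismatch proves the files differ; a match only raises confidence, since bytes outside the sampled ranges are never read.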

To be clear, I think the default should stay the same (or maybe even go up to sha256/512), but at least having the option to disable checksumming can be very useful and the change is only a few lines.

maicardi · 2024-11-17T11:26:36Z

Thanks @einsteinx2 for your option.

I have patched rdfind 1.6 with your suggested changes and it works like a charm.

Thanks.

add option to disable checksum verification

1b591d0

einsteinx2 mentioned this pull request Mar 17, 2024

allow for '-checksum none' to disable checksum filtering. #118 #119

Open
