[WIP] Add support for writing to mmapped TIFFs #40
base: master
Codecov Report
@@            Coverage Diff             @@
##           master      #40      +/-   ##
==========================================
- Coverage   88.79%   83.16%   -5.63%
==========================================
  Files          12       12
  Lines         562      600      +38
==========================================
  Hits          499      499
- Misses         63      101      +38
It's probably because of […]. To show you what I mean, here's a screenshot from Cthulhu with optimizations on, but with verbose turned off to highlight just the red lines:
[screenshot: Cthulhu output, not preserved]
Don't worry about the apparent inference failures for […]. Asserting the type before the field access, as in (ifd[STRIPBYTECOUNTS]::SomeType).data, is to be preferred to asserting it afterwards, as in (ifd[STRIPBYTECOUNTS].data)::SomeOtherType. In #36 I didn't feel confident I knew the types, so I asserted at a later stage.
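A minimal sketch of the difference between the two assert positions. The `AbstractTag`, `Tag`, and `Dict`-based `ifd` below are hypothetical stand-ins for illustration, not the package's actual types; only the tag ID 279 (StripByteCounts) comes from the TIFF specification.

```julia
# Hypothetical stand-ins for the package's tag types (assumption, not the real definitions).
abstract type AbstractTag end

struct Tag{T} <: AbstractTag
    data::T
end

const STRIPBYTECOUNTS = 279  # TIFF tag ID for StripByteCounts

# The value type is abstract, so the compiler cannot know which Tag{T} a lookup returns.
ifd = Dict{Int, AbstractTag}(STRIPBYTECOUNTS => Tag(UInt32[8192, 8192, 4096]))

# Late assert: `.data` is accessed on an abstractly typed value (a dynamic lookup),
# and only the result is asserted.
late(ifd) = (ifd[STRIPBYTECOUNTS].data)::Vector{UInt32}

# Early assert: the tag's concrete type is fixed before `.data` is accessed,
# so the field access itself is inferrable.
early(ifd) = (ifd[STRIPBYTECOUNTS]::Tag{Vector{UInt32}}).data
```

Both functions return the same vector; the difference only shows up in inference, which can be compared in a REPL with `@code_warntype late(ifd)` and `@code_warntype early(ifd)`.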
Ah, cool. Adding the earlier asserts gets rid of almost all allocations (no improvement in speed, but we're probably I/O limited here):
julia> @time img[:, :, 1] .= 1.0;
  0.383113 seconds (9 allocations: 320 bytes)
That said, the flexibility of TIFFs is pretty difficult here because there aren't a lot of guarantees. I guess what I could do is parameterize the mmapped TIFF type on the on-disk structure of the image? Because then I don't need to rely on if/else statements in […]
Yeah, it's a tricky design space. You're going to have to have something be non-inferrable here---Julia can't predict what's on disk. The key issue comes down to two questions:
When the two points above are in conflict with one another, often one of your best strategies is to coerce to a standard. I don't know much about this package or about the structure of TIFF files, but if they are essentially metadata then if you can standardize your metadata on just a few types (e.g., […]
BTW, it's possible you could at least make the code cleaner with a variant of https://timholy.github.io/SnoopCompile.jl/stable/snoopr/#Inferrable-field-access-for-abstract-types. Basically you could apply the type-assert for particular tag keys in a […]
Do you have any more information on this? I think I know what you're getting at, but I'm not 100% sure. As for the TIFF diversity, the problem stems from the fact that each tag can have its own datatype and there are quite a few native types:
Lines 123 to 140 in 3d1d5c3
Background
My original solution was to store all tags using the same memory layout (as a Vector of raw bytes) and reinterpret them into the correct datatype on access. I changed this in #22 to the current design where I parameterize the […]
Potential solution
This sounds like a good middle road. I only regularly use a minimal core set of tags for performance-critical code, things like width, length, data offsets, etc. I could coerce those into a limited set of expected datatypes on […]
EDIT: I think on read could be useful here. I can force the datatype for these limited tags and then assert their type in […]
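For context, the raw-bytes design described under Background might look roughly like the sketch below. `RawTag` and `tagdata` are hypothetical names, only three of the TIFF field types are handled, and the bytes are assumed to already be in the host's byte order. The point is that the return type depends on a runtime value (the type code), which is why access of this kind cannot be inferred.

```julia
# Hypothetical raw tag: a TIFF field-type code plus the undecoded payload bytes.
struct RawTag
    typecode::UInt16
    bytes::Vector{UInt8}
end

# Decode the payload on access. The type codes follow the TIFF specification
# (3 = SHORT, 4 = LONG, 12 = DOUBLE); other types are omitted for brevity.
function tagdata(t::RawTag)
    if t.typecode == 3
        return collect(reinterpret(UInt16, t.bytes))
    elseif t.typecode == 4
        return collect(reinterpret(UInt32, t.bytes))
    elseif t.typecode == 12
        return collect(reinterpret(Float64, t.bytes))
    else
        error("unsupported TIFF type code $(t.typecode)")
    end
end

# Round trip: encode two SHORT values as bytes, then decode them again.
payload = collect(reinterpret(UInt8, UInt16[512, 256]))
shorts = tagdata(RawTag(3, payload))
```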
You may know about these already but:
That sounds like a good idea. I am guessing that this is quite a small amount of data, right? Could you also read it when you open the file and copy it in coerced form to a single vector in RAM? Or do that image-by-image when you first read a particular frame?
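One way the coerce-on-read idea could be sketched, assuming the tags arrive in some loosely typed dictionary keyed by tag ID. The `CoreTags` struct, its field names, and the `Dict{Int,Any}` input are all hypothetical; the tag IDs are the standard TIFF ones. The strip tags are assumed to be stored as vectors.

```julia
# Standard TIFF tag IDs for the small set of tags used in hot loops.
const IMAGEWIDTH = 256
const IMAGELENGTH = 257
const STRIPOFFSETS = 273
const STRIPBYTECOUNTS = 279

# Hypothetical container: every field is concretely typed, so access is inferrable.
struct CoreTags
    width::Int
    height::Int
    stripoffsets::Vector{Int}
    stripbytecounts::Vector{Int}
end

# Coerce once, when the file (or frame) is first read. This one call is not
# inferrable, but everything that later uses the resulting CoreTags is.
function CoreTags(raw::AbstractDict)
    return CoreTags(
        Int(first(raw[IMAGEWIDTH])),
        Int(first(raw[IMAGELENGTH])),
        convert(Vector{Int}, raw[STRIPOFFSETS]),
        convert(Vector{Int}, raw[STRIPBYTECOUNTS]),
    )
end

# On disk, the same tag may be SHORT in one file and LONG in another.
raw = Dict{Int, Any}(
    IMAGEWIDTH => UInt16(512),
    IMAGELENGTH => UInt32(256),
    STRIPOFFSETS => UInt32[8, 1032],
    STRIPBYTECOUNTS => UInt16[1024, 1024],
)
tags = CoreTags(raw)
```

The cost of the dynamic lookups is paid once per file or frame instead of on every indexing operation.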
This will complete TiffImages.jl support for mmapped TIFFs. There are still some features that are needed before this is ready to merge:
Allocations
Despite my best efforts, setindex! is still not zero allocation, even though no allocations are reported when running julia in --track-allocation=all mode.
To reproduce:
Thoughts @timholy, @IanButterworth?