
[Canary] Transcoding Support #3489

Merged · 46 commits merged into canary from feature/transcoding on Jan 4, 2025

Conversation

majora2007 (Member) commented Jan 4, 2025

Canary testers, I need your help! This is a massive PR that swaps out the image system in Kavita and brings transcoding support (thanks @maxpiva). This build is based on the nightly. Please give it a serious stress test and report back on the nightly thread. There is also a GitHub issue for tracking longer-term issues here.

Added

  • Added: Added support for JPEG-XL, JPEG 2000, and HEIF image formats
  • Added: Kavita now has transcoding support. This means Kavita respects Accept headers: if the target browser/device doesn't support an image format, Kavita converts the image on the fly. In testing the conversion was not noticeable and came without a quality drop.
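
A rough sketch of what this negotiation can look like; the controller, route, and helper names below are illustrative assumptions, not Kavita's actual code:

```csharp
using Microsoft.AspNetCore.Mvc;

// Illustrative only; names and routes are assumptions, not Kavita's real API.
[ApiController]
public class ImageController : ControllerBase
{
    [HttpGet("api/image/{chapterId}/cover")]
    public IActionResult GetCover(int chapterId)
    {
        var coverPath = $"covers/{chapterId}.jxl";                 // placeholder lookup
        var accept = Request.Headers["Accept"].ToString();

        // If the stored cover is JPEG-XL but the client does not advertise image/jxl,
        // transcode it on the fly to a broadly supported format such as WebP.
        if (coverPath.EndsWith(".jxl") && !accept.Contains("image/jxl"))
        {
            var transcodedPath = TranscodeToWebP(coverPath);       // assumed conversion provider
            return PhysicalFile(transcodedPath, "image/webp");
        }

        // Otherwise the client supports the format and the image is returned as-is.
        return PhysicalFile(coverPath, "image/jxl");
    }

    private static string TranscodeToWebP(string path) => path;   // placeholder for the real converter
}
```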

Changed

  • Changed: Kavita should no longer require SSE4.2 CPU instructions and should be able to run on older hardware.
  • Changed: Due to the massive overhaul of the image manipulation library, smart cropping (mainly visible on oddly-shaped cover images) was changed from NetVips to a SmartCrop.js-based C# library. Only minor differences were seen in testing.

Developer

  • Swapped out NetVips for ImageMagick (Magick.NET)
  • Updated Flurl.Http with a new configuration
  • Removed NetVips and SixLabors.ImageSharp

maxpiva and others added 30 commits August 15, 2024 08:25
Fixed test and benchmark DI.
Point docker-build to another hub
Edit dockerfile and include jxl package.
1. Supported image formats come from the browser's Accept header.
2. If JXL (or a future format the browser does not support, e.g. AVIF, HEIF, etc.) is not supported, Kavita will convert it to a suitable display format via conversion providers.
3. If the format is supported, the image is returned as-is, no conversion needed.
4. Thumbnails are always converted.
Fix GetCoverImage trying to create a temporary image in the current directory, and rename the parameter field name that led to a wrong assumption.
Optimize Cover Creation
Removing ImageConvertors (No Longer Needed)
Refactored mime types and extensions constants
Enable back docker build/push
maxpiva and others added 16 commits October 4, 2024 15:33
Added a per-image lock on conversion, to prevent multiple threads from processing the same image at the same time (a minimal sketch follows this commit list).
Added Quality From EncodedFormat Extension
Q8 is a lot faster. New DefineConstants Q8, Q16, and Q16HDRI can be passed to the build if you want a variation; the default is Q8.
Q16 and Q16HDRI are desirable when the output screens are more than 8-bit and the source format supports more than 8 bits, i.e. AVIF, HEIF, or JPEG-XL.
Fix merge bugs.
Added/Updated NuGet packages, to remove transitive security vulnerabilities from other packages.
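
For context on the per-image lock mentioned above, here is a minimal sketch of one way such a lock can be implemented with a keyed SemaphoreSlim; the ImageConversionLock helper is an assumed name for illustration, not Kavita's actual type:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: one SemaphoreSlim per image path, so only one thread
// converts a given image at a time while different images still convert in parallel.
public static class ImageConversionLock
{
    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Gates = new();

    public static async Task<string> RunExclusiveAsync(string imagePath, Func<Task<string>> convert)
    {
        var gate = Gates.GetOrAdd(imagePath, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            return await convert();   // e.g. the on-the-fly transcode for this image
        }
        finally
        {
            gate.Release();
        }
    }
}
```
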
@majora2007 majora2007 changed the base branch from develop to canary January 4, 2025 13:44
@majora2007 majora2007 added the enhancement New feature or request label Jan 4, 2025
@majora2007 majora2007 merged commit 6cba7d1 into canary Jan 4, 2025
6 checks passed
@majora2007 majora2007 deleted the feature/transcoding branch January 4, 2025 13:58
majora2007 (Member, Author) commented:

@maxpiva Here is some follow-up from the user testing. Overall, from the transcoding standpoint, it is working without much delay.

What we found, though, is that ImageMagick is MUCH slower at generating images than NetVips, so much so that we likely can't keep it as-is.

DieselTech:
And mine was 90 seconds on netvips and 150 seconds on canary
Also note this is 588 chapters, so about 250ms per file

Dup:
I ran the same series as Diesel. 2 min 30s on netvips, 13 min on canary to refresh covers.

Specs of Dup:
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 10
On-line CPU(s) list: 0-9
Vendor ID: ARM
BIOS Vendor ID: QEMU
Model name: Neoverse-N1

Cropping testing was limited (canary code on the right): [comparison screenshot omitted]

Testing has been pretty limited due to the time of year. I'm curious about your thoughts on next steps. The drop-off in cover generation (which is already the most expensive process during a scan) is concerning to me.

maxpiva commented Jan 10, 2025

Let me sit on this a little.

According to the netvips developer, it's possible to add JPEG-XL and other formats. I was not able to do it, but I didn't go all the way down the rabbit hole. The netvips developer also said it's possible to use netvips without requiring the CPU extensions if it is installed natively instead of through his native NuGet package. Finally, netvips doesn't have the capability to read image dimensions without reading the whole image, or at least I cannot find the method; I think that functionality originally used ImageSharp in Kavita, or read the whole image.
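
For comparison, on the Magick.NET side this kind of lightweight identification is available through MagickImageInfo, which reads header metadata without decoding the full image. A minimal sketch (the path is a placeholder, not Kavita code):

```csharp
using System;
using ImageMagick;

// Reads only the image header (dimensions, format) rather than decoding pixel data.
var info = new MagickImageInfo("cover.jxl");   // placeholder path
Console.WriteLine($"{info.Width}x{info.Height} ({info.Format})");
```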

From the code standpoint, I abstracted all the image functionality into an interface and an implementation class, so adding netvips back should be super easy.

Work to DO

  • Add netvips back as an implementation.

  • Investigate the feasibility of adding netvips natively with JPEG-XL in Docker, making sure there is NO CPU-extension limitation, plus a custom toolchain/script on Windows that downloads native x64 builds with JPEG-XL, JPEG 2000, et al., instead of using the netvips native NuGet package.

  • Check whether it is possible to get image identification/dimensions in netvips without loading the whole image.

Finally, ask the developer if he maintains a native NuGet package with all the bells and whistles.

Later we can choose which implementation to use, via a define or similar.
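
A minimal sketch of what choosing the implementation "via a define or similar" could look like; IImageConverter, the two converter classes, and the USE_NETVIPS symbol are assumptions for illustration, not the PR's actual abstraction:

```csharp
using ImageMagick;

// Assumed abstraction over the image backends; the real interface in the PR
// covers far more operations (covers, thumbnails, cropping, transcoding).
public interface IImageConverter
{
    void Convert(string sourcePath, string destinationPath);
}

public sealed class MagickImageConverter : IImageConverter
{
    public void Convert(string sourcePath, string destinationPath)
    {
        // Magick.NET infers the output format from the destination extension.
        using var image = new MagickImage(sourcePath);
        image.Write(destinationPath);
    }
}

public sealed class NetVipsImageConverter : IImageConverter
{
    public void Convert(string sourcePath, string destinationPath)
    {
        // NetVips likewise picks the encoder from the destination extension.
        using var image = NetVips.Image.NewFromFile(sourcePath);
        image.WriteToFile(destinationPath);
    }
}

public static class ImageConverterFactory
{
    public static IImageConverter Create()
    {
#if USE_NETVIPS   // hypothetical DefineConstants switch, in the spirit of the Q8/Q16 ones above
        return new NetVipsImageConverter();
#else
        return new MagickImageConverter();
#endif
    }
}
```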

majora2007 (Member, Author) commented:

Let me add that the SSE4.2 limitation is not a deal breaker for me; all CPUs from the last 15 years have it. Dropping it was a nice-to-have if we were moving to ImageMagick, but with the difference in speed, I think this will become a major issue for end users, especially those with lower-powered machines.

Let me know if I can help in any regard.

DieselTech (Collaborator) commented Feb 18, 2025

I wanted to loop back to this and give some clarification that hopefully sheds some light on the performance problem.

What we found, though, is that ImageMagick is MUCH slower at generating images than NetVips, so much so that we likely can't keep it as-is.

Dup and I both tested the exact same series (identical files, even) to ensure consistency. The series has 588 chapters and is 2.96 GB.

DieselTech:
90 seconds on netvips - ~150ms per file
150 seconds on canary - ~250ms per file

Dup:
250 seconds on netvips - ~425ms per file
780 seconds on canary - ~1320ms per file

The "issue" as it stands here is the cover generation is single threaded. I am using a 5900x which has a single core score of 3470 as measured by cpubenchmark.net. That is fairly high considering the top end of consumer CPU's top out about 4800-5000 points. (And you'll be paying big $ for them)

I'd wager most people aren't running on something that powerful these days, especially when it comes to low-power servers that are designed to be left on 24/7. I'd ballpark the average single-core performance at around 1600 to 2500. Here is where we get to my point: the CPU Dup is using follows the new trend of low-power (watts, not performance) CPUs.

Specs:
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 10
On-line CPU(s) list: 0-9
Vendor ID: ARM
BIOS Vendor ID: QEMU
Model name: Neoverse-N1

The single-core performance of this chip is kind of abysmal at only 755. That is on par with a Pentium 4 that came out in 2008. But... that is why it has 10 threads. Multi-core performance is where this CPU shines, because this isn't 2008 anymore and we expect workloads to be threaded in order to get work done in a reasonable amount of time.

This is especially true in a case like this, where parallelizing just means having each thread handle one image at a time; there is no competing data dependency that would have to wait for the results of the work to be used somewhere else. If we had run the same 588-file cover gen on all 10 cores, the approximate time would have been ~25 seconds.
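
For illustration, a minimal sketch of running cover generation across all cores with Parallel.ForEach; the library path and the GenerateCover routine are placeholders, not Kavita's actual scanner code:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical: run cover generation across all cores instead of one file at a time.
var chapterFiles = Directory.GetFiles("/library/series", "*.cbz");   // placeholder path

Parallel.ForEach(
    chapterFiles,
    new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
    file => GenerateCover(file));

static void GenerateCover(string file)
{
    // Placeholder for the existing single-file cover extraction/resize logic.
}
```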
