Cuda accelerated tonemap Filter #5
Comments
Major bodge, but the new version of /cuda_filter/vf_scale_cuda.cu should be able to do Reinhard tonemapping. Below is the relevant extract from the ffmpeg-devel mailing list:
For ease of development, I've kept everything the same, including the name of the filter, and only changed the function within the file. This is very much a bodge to facilitate development. As such, for testing, this file should replace ffmpeg/libavfilter/vf_scale_cuda.cu. FFmpeg should then be compiled as standard for CUDA filters, and the filter should be called as you would call the standard vf_scale_cuda filter.
The above should decode in hardware, tonemap the frame on the GPU, and re-encode in hardware at a given bitrate. |
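For checking the kernel's output, a CPU-side reference can help. The following is a minimal sketch in C of the basic Reinhard curve, L / (1 + L), applied to interleaved linear-light RGB. It is not the code from vf_scale_cuda.cu: the function names `reinhard` and `tonemap_rgb`, the float buffer layout, and the use of BT.2020 luma weights are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Basic Reinhard curve: maps linear luminance in [0, inf) into [0, 1). */
static float reinhard(float l)
{
    assert(l >= 0.0f);
    return l / (1.0f + l);
}

/* Tonemap an interleaved linear RGB buffer in place (hypothetical layout:
 * 3 floats per pixel). Each pixel is scaled so that its luminance follows
 * the Reinhard curve, which keeps the ratio between channels unchanged. */
static void tonemap_rgb(float *rgb, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; i++) {
        float *p = rgb + 3 * i;
        /* BT.2020 luma weights, assumed here because the source is HDR. */
        float luma = 0.2627f * p[0] + 0.6780f * p[1] + 0.0593f * p[2];
        if (luma <= 0.0f)
            continue; /* black pixel: nothing to scale */
        float scale = reinhard(luma) / luma;
        p[0] *= scale;
        p[1] *= scale;
        p[2] *= scale;
    }
}
```

Running the same frame through this reference and through the GPU filter, then comparing the two buffers with a small tolerance, would be one way to confirm the kernel behaves as intended.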
Used the overlay filter as a base instead of scale; it seems much better for my purposes. Reaching out to @znmeb to ask for a test on his side. Syntax will be:
This is a bodge for now, since it only modifies the output of the CUDA kernel itself. To use it, replace ffmpeg/libavfilter/vf_overlay_cuda.cu with this file: https://github.com/Camofelix/Jetson_ffmpeg_trancode_cluster/blob/master/cuda_filter/vf_overlay_cuda.cu
It compiles fine, but I can't test actual usage without a Nano. It requires a self-build:
1. Fetch the FFmpeg source code with git
2. make clean
3. ./configure --enable-nonfree --enable-cuda
4. mv ~/path/to/new/file ~/path/to/ffmpeg/libavfilter/vf_overlay_cuda.cu
5. make -j
6. Get coffee
7. ffmpeg -i INPUT -i INPUT -filter_complex 'hwupload_cuda,overlay_cuda' OUTPUT
8. ffplay output
Is the output different? |
Further update: |
In my tests, OpenCL is necessary. |
you just need to
|
I haven't looked at this project in a while; I've been working on other GPGPU-related things. Have they ported the CUDA filters to work on the Nano? |
I don't know. I just tested it and finally got it to build successfully. |
Command: sudo ffmpeg -init_hw_device cuda=gpu:0.0 -filter_hw_device gpu -c:v h264_nvmpi -i /mnt/source/Paprika.2006.JAPANESE.1080p.BluRay.x264.DTS-FGT.mkv -vf "format=yuv420p,hwupload,scale_cuda=1280:720,hwdownload,format=yuv420p" -c:v hevc_nvmpi -b:v 6000k -preset medium -profile:v high -acodec ac3 output.mp4 |
I should also say that my device is throttled due to low current and voltage. 😅 The power is utter garbage. |
cuvid is unusable because libnvcuvid.so.1 is missing. |
The initial idea of using POCL as a CUDA translation layer isn't viable, because POCL does not work with image formats on CUDA.
Currently reaching out to Yaroslav Pogrebnyak, the developer of the vf_overlay_cuda FFmpeg filter.
Also reaching out to nyanmisaka, who seems to have a lot of experience working on FFmpeg filters and frankly knows more than I do.
In addition, collaborating with Ed Borasky to confirm that it works on Jetson platforms.
vf_tonemap_cuda.txt
(renamed from .c to .txt to make GitHub happy)
Missing: tonemap.cu with the proper kernel-side code. This is easy once I know how to properly call the CUDA kernel side from the FFmpeg side.
Standard strided blocks should work, with the total number of blocks defined from the frame height. Most resolutions will be 16:9, so using the height parameter gives a better chance of being cleanly divisible by 3, which lets us take advantage of CUDA's data structures.
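As a sketch of the block-count arithmetic, the usual host-side approach is a ceiling division of the frame extent by the block size. The helper name `blocks_for` and the block size of 16 are assumptions for illustration, not values taken from the filter.

```c
#include <assert.h>

/* Number of blocks needed to cover `extent` rows (or columns) when each
 * block handles `block` of them. Ceiling division means a partially
 * filled last block still covers the edge of the frame. */
static unsigned blocks_for(unsigned extent, unsigned block)
{
    assert(block > 0);
    return (extent + block - 1) / block;
}
```

For a 1080-row frame and 16 rows per block, `blocks_for(1080, 16)` gives 68. With this rounding, plus a bounds check inside the kernel so that threads past the last row do nothing, the height would not need to divide evenly by the block size.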
The other option is to work on the R, G and B values of a given pixel, which are guaranteed to come in multiples of 3. This might also help other tone mapping algorithms that use the relative offset from the local peak luma as input for the tonemapping output.
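To illustrate the per-pixel triplet idea, here is a minimal C sketch that walks an interleaved RGB buffer three values at a time and returns the peak luma of whatever window of pixels is passed in. The name `peak_luma`, the float layout, and the BT.2020 luma weights are assumptions; a GPU version would need a parallel reduction rather than this serial loop.

```c
#include <assert.h>
#include <stddef.h>

/* Peak luma over a window of interleaved RGB pixels (3 floats per pixel).
 * Pixel i starts at index 3 * i, so the buffer length is always a
 * multiple of 3. */
static float peak_luma(const float *rgb, size_t n_pixels)
{
    assert(rgb != NULL || n_pixels == 0);
    float peak = 0.0f;
    for (size_t i = 0; i < n_pixels; i++) {
        const float *p = rgb + 3 * i;
        /* BT.2020 luma weights (assumed, as for HDR input). */
        float luma = 0.2627f * p[0] + 0.6780f * p[1] + 0.0593f * p[2];
        if (luma > peak)
            peak = luma;
    }
    return peak;
}
```

A tone mapping operator that depends on local peak luma could call this on a neighbourhood around each pixel and feed the result into its curve.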