Perceptual JPEG encoder, optimized with CUDA&OpenCL, full JPEG format support.

doterax/guetzli-cuda-opencl

 
 

Guetzli

Introduction

Guetzli is a JPEG encoder that aims for excellent compression density at high visual quality. Guetzli-generated images are typically 20-30% smaller than images of equivalent quality generated by libjpeg. Guetzli generates only sequential (nonprogressive) JPEGs because of the faster decompression speeds they offer.

About this Repo

Guetzli is an excellent JPEG encoder, but it is slow. To speed it up, we added CUDA and OpenCL support, optimized some procedures, and added full JPEG format support. We tested it on our GPU server (a single Tesla M40 GPU); the results for one of our sample pictures (750*400 pixels) are shown below.

Method              | Usage                                 | Cost
Original            | guetzli <in file> <out file>          | 14.7s
Procedure Optimized | guetzli --c <in file> <out file>      | 8.2s
Using OpenCL        | guetzli --opencl <in file> <out file> | 1.5s
Using CUDA          | guetzli --cuda <in file> <out file>   | 0.8s

Check the 'Extra features' section to see how to enable CUDA or OpenCL.
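
The speedups implied by the timings above can be worked out by dividing the baseline time by each accelerated time. The sketch below does only that; `speedup` is a helper name made up for this example, and the figures describe a single 750*400 test image on one Tesla M40, so other images and GPUs may behave differently.

```shell
# speedup BASE NEW: prints BASE / NEW rounded to one decimal place.
# `speedup` is a hypothetical helper for this example, not part of guetzli.
speedup() {
  awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", base / new }'
}

speedup 14.7 8.2   # --c versus the original encoder
speedup 14.7 1.5   # --opencl versus the original encoder
speedup 14.7 0.8   # --cuda versus the original encoder
```

For this one sample, that works out to roughly 1.8x for --c, 9.8x for --opencl, and 18.4x for --cuda.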

Building

On POSIX systems (outdated and not tested yet)

  1. Get a copy of the source code, either by cloning this repository, or by downloading an archive and unpacking it.
  2. Install libpng. If using your operating system package manager, install development versions of the packages if the distinction exists.
    • On Ubuntu, do apt-get install libpng-dev.
    • On Fedora, do dnf install libpng-devel.
    • On Arch Linux, do pacman -S libpng.
    • On Alpine Linux, do apk add libpng-dev.
  3. Run make and expect the binary to be created in bin/Release/guetzli.
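
As a rough illustration of steps 2 and 3, the sketch below maps a distribution name to the libpng install command listed above. The helper name `libpng_install_cmd` and the idea of keying on the `ID` field of /etc/os-release are assumptions made for this example, not something the repository provides.

```shell
# libpng_install_cmd DISTRO: prints the libpng install command listed above.
# The keys (ubuntu, fedora, arch, alpine) are the ID values those systems
# report in /etc/os-release; the helper itself is hypothetical.
libpng_install_cmd() {
  case "$1" in
    ubuntu) echo "apt-get install libpng-dev" ;;
    fedora) echo "dnf install libpng-devel" ;;
    arch)   echo "pacman -S libpng" ;;
    alpine) echo "apk add libpng-dev" ;;
    *)      echo "unsupported distro: $1" >&2; return 1 ;;
  esac
}

# Possible use, run as root from the top of the source tree:
#   . /etc/os-release && $(libpng_install_cmd "$ID") && make
libpng_install_cmd ubuntu
```

The function only prints the command, so it can be inspected before anything is installed.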

On Windows

  1. Get a copy of the source code, either by cloning this repository, or by downloading an archive and unpacking it.
  2. Install Visual Studio 2015 and vcpkg.
  3. Install libpng using vcpkg: .\vcpkg install libpng:x64-windows-static.
  4. Install libtiff using vcpkg: .\vcpkg install tiff:x64-windows-static.
  5. Cause the installed packages to be available system-wide: .\vcpkg integrate install. If you prefer not to do this, refer to vcpkg's documentation.
  6. Open the Visual Studio project enclosed in the repository and build it.

On macOS (outdated and not tested yet)

To install using Homebrew:

  1. Install Homebrew
  2. brew install guetzli

To install using the repository:

  1. Get a copy of the source code, either by cloning this repository, or by downloading an archive and unpacking it.
  2. Install Homebrew or MacPorts
  3. Install libpng
    • Using Homebrew: brew install libpng.
    • Using MacPorts: port install libpng (You may need to use sudo).
  4. Run the following command to build the binary in bin/Release/guetzli.
    • If you installed using Homebrew, simply use make.
    • If you installed using MacPorts, use CFLAGS='-I/opt/local/include' LDFLAGS='-L/opt/local/lib' make.

With Bazel (outdated and not tested yet)

There's also a Bazel build configuration provided. If you have Bazel installed, you can also compile Guetzli by running bazel build -c opt //:guetzli.

Using

Note: Guetzli uses a large amount of memory. You should provide 300MB of memory per 1MPix of the input image.

Note: Guetzli uses a significant amount of CPU time. You should count on using about 1 minute of CPU per 1 MPix of input image.
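
To get a feel for these two rules of thumb, the sketch below applies them to an image's pixel dimensions. `estimate_resources` is a hypothetical helper, not a guetzli option, and the results are only as accurate as the 300MB and 1 minute per MPix guidance itself; integer division rounds the figures down.

```shell
# estimate_resources WIDTH HEIGHT: prints approximate memory (MB) and
# CPU time (seconds) from the rules of thumb above:
#   300 MB per megapixel, 60 CPU-seconds per megapixel.
# `estimate_resources` is a hypothetical helper, not part of guetzli.
estimate_resources() {
  pixels=$(( $1 * $2 ))
  mem_mb=$(( pixels * 300 / 1000000 ))
  cpu_s=$(( pixels * 60 / 1000000 ))
  echo "${mem_mb} MB, ${cpu_s} s"
}

estimate_resources 750 400     # the 0.3 MPix sample image: 90 MB, 18 s
estimate_resources 4000 3000   # a 12 MPix photo: 3600 MB, 720 s
```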

Note: Guetzli assumes that input is in sRGB profile with a gamma of 2.2. Guetzli will ignore any color-profile metadata in the image.

To try out Guetzli you need to build or download the Guetzli binary. The binary reads a PNG, JPEG, or TIFF image and creates an optimized JPEG image:

guetzli [--quality Q] [--verbose] original.png output.jpg
guetzli [--quality Q] [--verbose] original.jpg output.jpg
guetzli [--quality Q] [--verbose] original.tiff output.jpg

Note that Guetzli is designed to work on high quality images. You should always prefer providing uncompressed input images (i.e. images that have not already been compressed with any JPEG encoder, including Guetzli). While it will work on other images too, results will be poorer. You can try compressing an enclosed sample high quality image.

You can pass a --quality Q parameter to set quality in units equivalent to libjpeg quality. You can also pass a --verbose flag to see a trace of encoding attempts made.
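
A folder of PNG files can be converted with a small loop around the command line shown above. This is only a sketch: `jpg_name` and `convert_dir` are hypothetical helpers, the directory name and the quality value 90 are placeholders, and `guetzli` is assumed to be on the PATH.

```shell
# jpg_name FILE: prints FILE with its extension replaced by .jpg.
jpg_name() {
  echo "${1%.*}.jpg"
}

# convert_dir DIR QUALITY: runs guetzli on every PNG in DIR and writes
# the JPEG next to its source. Both helpers are hypothetical.
convert_dir() {
  for src in "$1"/*.png; do
    [ -e "$src" ] || continue          # the glob matched no PNG files
    guetzli --quality "$2" --verbose "$src" "$(jpg_name "$src")"
  done
}

# Example call (not executed here): convert_dir ./photos 90
jpg_name photos/original.png           # prints photos/original.jpg
```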

Please note that JPEG images do not support an alpha channel (transparency). If the input is a PNG with an alpha channel, it will be overlaid on a black background before encoding.
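
Overlaying on black means each colour channel is scaled by the pixel's alpha value. The sketch below shows that arithmetic for a single 8-bit channel; `blend_on_black` is a hypothetical helper, and the round-to-nearest step is an assumption, so the encoder's own rounding may differ by one level.

```shell
# blend_on_black VALUE ALPHA: composites one 8-bit channel value over black.
# Over a black background the result is VALUE * ALPHA / 255; adding 127
# before the integer division rounds to the nearest level (an assumption).
blend_on_black() {
  echo $(( ($1 * $2 + 127) / 255 ))
}

blend_on_black 200 255   # fully opaque: prints 200
blend_on_black 200 128   # about half transparent: prints 100
blend_on_black 200 0     # fully transparent: prints 0
```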

Extra features

Note: Please make sure that you can build guetzli successfully before adding the following features.

Enable CUDA support

Note: Before adding CUDA support, please check whether your GPU supports CUDA.

Note: If you don't have an NVIDIA card that supports CUDA, you can try OpenCL instead. You can install any of the OpenCL SDKs, such as the Intel OpenCL SDK or the AMD OpenCL SDK.

Note: The steps for adding OpenCL support are very similar to those for adding CUDA support, so this section covers only CUDA.

On POSIX systems

  1. Follow the Installation Guide for Linux to set up the CUDA Toolkit.
  2. Edit premake5.lua, add $(CUDA_PATH)\include to includedirs under workspace "guetzli", add defines { "__USE_CUDA__" } and links { "cuda" } under filter "action:gmake". Then run premake5 --os=linux gmake to update the makefile.
  3. Edit clguetzli/clguetzli.cl and add #define __USE_CUDA__ as the first line.
  4. Run make and wait for the binary to be created in bin/Release/guetzli.
  5. Run ./compile.sh 64 or ./compile.sh 32 to build the 64-bit or 32-bit PTX file; the PTX file will be copied to bin/Release/clguetzli.
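
Once premake5.lua and clguetzli/clguetzli.cl have been edited by hand, the remaining steps are three commands. The sketch below only prints them for a chosen bit width so they can be reviewed first; `cuda_build_steps` is a hypothetical helper, and it assumes it is used from the top of the source tree.

```shell
# cuda_build_steps BITS: prints, without running them, the commands from
# the steps above for a 64- or 32-bit PTX build.
# `cuda_build_steps` is a hypothetical helper, not part of the repository.
cuda_build_steps() {
  case "$1" in
    64|32) ;;
    *) echo "BITS must be 64 or 32" >&2; return 1 ;;
  esac
  echo "premake5 --os=linux gmake"   # regenerate the makefile
  echo "make"                        # build bin/Release/guetzli
  echo "./compile.sh $1"             # build and copy the PTX file
}

cuda_build_steps 64
```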

On Windows

  1. Follow the Installation Guide for Microsoft Windows to set up the CUDA Toolkit.
  2. Copy <vs2015 dir>\VC\bin\amd64\vcvars64.bat to <guetzli dir>\vcvars64.bat.
  3. Open the Visual Studio project and edit the project Property Pages as follows:
    • Add __USE_CUDA__ to preprocessor definitions.
    • Add cuda.lib to additional dependencies.
    • Add $(CUDA_PATH)\include to include directories.
    • Add $(CUDA_PATH)\lib\Win32 or $(CUDA_PATH)\lib\x64 to library directories.
  4. Edit clguetzli/clguetzli.cl and add #define __USE_CUDA__ as the first line.
  5. Build.

Enable OpenCL support

On POSIX systems

  1. Follow the Installation Guide for Linux to set up the Intel OpenCL SDK.
  2. Edit premake5.lua, add $(OPENCL_SDK_PATH)\include to includedirs under workspace "guetzli", add defines { "__USE_OPENCL__" } and links { "**" } under filter "action:gmake". Then run premake5 --os=linux gmake to update the makefile.
  3. Edit clguetzli/clguetzli.cl and add #define __USE_OPENCL__ as the first line.
  4. Run make and wait for the binary to be created in bin/Release/guetzli.
  5. Copy clguetzli/clguetzli.cl to bin/Release/clguetzli before running.

On Windows

  1. Follow the Installation Guide for Microsoft Windows to set up the Intel OpenCL SDK.
  2. Copy <vs2015 dir>\VC\bin\amd64\vcvars64.bat to <guetzli dir>\vcvars64.bat.
  3. Open the Visual Studio project and edit the project Property Pages as follows:
    • Add __USE_OPENCL__ to preprocessor definitions.
    • Add OpenCL.lib to additional dependencies.
    • Add $(OPENCL_SDK_PATH)\include to include directories.
    • Add $(OPENCL_SDK_PATH)\lib\x86 or $(OPENCL_SDK_PATH)\lib\x64 to library directories.
  4. Edit clguetzli/clguetzli.cl and add #define __USE_OPENCL__ as the first line.
  5. Edit Property Pages to turn on the Excluded From Build property of clguetzli/clguetzli.cu.
  6. Build.

Enable 'tcmalloc' support to speed up memory access. ('tcmalloc' can save 3% encoding time.)

On POSIX systems

  1. Download google-perftools from https://github.com/gperftools/gperftools/releases
  2. Install google-perftools: ./configure && make && make install
  3. Add '/usr/local/lib' to the library path if it is not already there:
    • echo "/usr/local/lib" > /etc/ld.so.conf.d/usr_local_lib.conf
    • ldconfig
  4. Edit 'premake5.lua', add 'tcmalloc' to links { ** } under filter "action:gmake". Then run premake5 --os=linux gmake to update the makefile.
  5. On 64-bit systems, libunwind is required.
  6. Run 'make'.

On Windows

  1. Follow the README file [https://github.com/gperftools/gperftools/blob/master/README_windows.txt] to use tcmalloc_minimal.
  2. Build.

Usage

guetzli [--c|--cuda|--opencl] [other options] original.png output.jpg
guetzli [--c|--cuda|--opencl] [other options] original.jpg output.jpg
guetzli [--c|--cuda|--opencl] [other options] original.tiff output.jpg

You can pass --c to enable the procedure optimization, --cuda to use CUDA acceleration, or --opencl to use OpenCL acceleration.
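
A wrapper script might choose among the three flags based on what the machine offers. The sketch below prefers CUDA, then OpenCL, then the CPU-only optimization, following the order of the timings reported earlier; `pick_backend_flag` and its yes/no arguments are hypothetical, and detecting whether CUDA or OpenCL is actually usable is left to the caller. The chosen flag only works if the binary was built with the matching support.

```shell
# pick_backend_flag HAS_CUDA HAS_OPENCL: prints the guetzli flag to use.
# Each argument is "yes" or "no"; how the caller detects CUDA or OpenCL
# is outside this sketch. The helper is hypothetical.
pick_backend_flag() {
  if [ "$1" = "yes" ]; then
    echo "--cuda"
  elif [ "$2" = "yes" ]; then
    echo "--opencl"
  else
    echo "--c"
  fi
}

# Example call (not executed here):
#   guetzli "$(pick_backend_flag yes no)" original.png output.jpg
pick_backend_flag no yes   # prints --opencl
```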

If you have any questions about CUDA/OpenCL support, please contact the repository maintainers.

Enable full JPEG format support

On POSIX systems

  1. Install libjpeg. If using your operating system package manager, install development versions of the packages if the distinction exists.
    • On Ubuntu, do apt-get install libjpeg8-dev.
    • On Fedora, do dnf install libjpeg-devel.
    • On Arch Linux, do pacman -S libjpeg.
    • On Alpine Linux, do apk add libjpeg.
  2. Edit premake5.lua, add defines {"__SUPPORT_FULL_JPEG__"} and links { "jpeg" } under filter "action:gmake". Then run premake5 --os=linux gmake to update the makefile.
  3. Run make and wait for the binary to be created in bin/Release/guetzli.

On Windows

  1. Install libjpeg-turbo using vcpkg: .\vcpkg install libjpeg-turbo:x64-windows-static
  2. Open the Visual Studio project and add __SUPPORT_FULL_JPEG__ to preprocessor definitions in the project Property Pages.
  3. Build.
