Junhao Tan, Songwen Pei, Wei Qin, Bo Fu, Ximing Li and Libo Huang
Abstract: Frequency information (e.g., the Discrete Wavelet Transform and the Fast Fourier Transform) has been widely applied to Low-Light Image Enhancement (LLIE). However, existing frequency-based models operate only in the plain wavelet or Fourier space of an image and fail to exploit the valid global and local information in each space. We observe that wavelet frequency information is more sensitive to global brightness due to its low-frequency component, while Fourier frequency information is more sensitive to local details due to its phase component. To achieve superior preliminary brightness enhancement by optimally integrating spatial channel information with the low-frequency component of the wavelet transform, we introduce channel-wise Mamba, which compensates for the limited long-range dependencies of CNNs and has lower complexity than Diffusion and Transformer models. In this work, we therefore propose WalMaFa, a novel Wavelet-based Mamba with Fourier Adjustment model consisting of a Wavelet-based Mamba Block (WMB) and a Fast Fourier Adjustment Block (FFAB). We employ an Encoder-Latent-Decoder structure for the end-to-end transformation: WMB is adopted in the Encoder and Decoder to enhance global brightness, while FFAB is adopted in the Latent to fine-tune local texture details and alleviate ambiguity. Extensive experiments demonstrate that WalMaFa achieves state-of-the-art performance with fewer computational resources and faster speed.
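The wavelet/Fourier observation above can be illustrated in a few lines. Below is a minimal PyTorch sketch (not the paper's code; the function names are ours) that extracts the Haar low-frequency (LL) band, a proxy for global brightness, and the Fourier amplitude/phase, where phase carries local structure:

```python
# Minimal sketch of the two frequency views WalMaFa builds on.
import torch

def haar_dwt_ll(x):
    """One-level Haar DWT; returns the low-frequency (LL) band of a BCHW tensor."""
    a = x[..., 0::2, 0::2]  # top-left pixels of each 2x2 patch
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    return (a + b + c + d) / 2.0  # LL band: local averages -> global brightness

def fourier_amp_phase(x):
    """2D FFT of a BCHW tensor, split into amplitude and phase."""
    f = torch.fft.fft2(x, norm="ortho")
    return f.abs(), f.angle()  # phase encodes local structure and details

img = torch.rand(1, 3, 256, 256)     # stand-in low-light image
ll = haar_dwt_ll(img)                # (1, 3, 128, 128) brightness proxy
amp, phase = fourier_amp_phase(img)  # both shaped like img
```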
- Sep 21, 2024: Our paper has been accepted by ACCV 2024! 💥 💥 💥
- Jun 8, 2024: Pre-trained models are released!
- Jun 8, 2024: Code is released!
- Jun 8, 2024: Homepage is released!
Overview of the WalMaFa architecture. Our model adopts an Encoder-Latent-Decoder structure: wavelet-based WMB adjusts global brightness in the Encoder and Decoder, while Fourier-based FFAB adjusts local details in the Latent.
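For orientation, here is a minimal PyTorch sketch of that Encoder-Latent-Decoder layout. The `WMB`/`FFAB` bodies are plain-convolution placeholders and the skip/residual connections are our assumptions, not the released implementation; see the repository code for the real blocks.

```python
import torch
import torch.nn as nn

class WMB(nn.Module):
    """Placeholder for the Wavelet-based Mamba Block (global brightness)."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Conv2d(dim, dim, 3, padding=1)  # stand-in for the Mamba branch

    def forward(self, x):
        return x + self.body(x)

class FFAB(nn.Module):
    """Placeholder for the Fast Fourier Adjustment Block (local details)."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Conv2d(dim, dim, 3, padding=1)  # stand-in for the FFT branch

    def forward(self, x):
        return x + self.body(x)

class WalMaFaSketch(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, 3, padding=1)
        self.encoder = WMB(dim)                      # WMB enhances global brightness
        self.down = nn.Conv2d(dim, dim * 2, 2, stride=2)
        self.latent = FFAB(dim * 2)                  # FFAB fine-tunes local details
        self.up = nn.ConvTranspose2d(dim * 2, dim, 2, stride=2)
        self.decoder = WMB(dim)
        self.out = nn.Conv2d(dim, 3, 3, padding=1)

    def forward(self, x):
        e = self.encoder(self.embed(x))
        l = self.latent(self.down(e))
        d = self.decoder(self.up(l) + e)             # skip connection (assumed)
        return self.out(d) + x                       # residual enhancement (assumed)

y = WalMaFaSketch()(torch.rand(1, 3, 256, 256))      # -> (1, 3, 256, 256)
```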
- Create Conda Environment

```bash
conda create -n WalMaFa python=3.8
conda activate WalMaFa
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install matplotlib scikit-image opencv-python yacs joblib natsort h5py tqdm einops tensorboard pyyaml mamba_ssm
```
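As an optional sanity check (our addition, not part of the repo), you can verify that PyTorch sees the GPU and that `mamba_ssm` was built correctly:

```python
import torch
from mamba_ssm import Mamba  # fails here if mamba_ssm did not build correctly

assert torch.cuda.is_available(), "mamba_ssm requires a CUDA-capable GPU"
block = Mamba(d_model=64).cuda()                    # a basic Mamba building block
out = block(torch.rand(1, 128, 64, device="cuda"))  # input: (batch, length, d_model)
print(out.shape)                                    # torch.Size([1, 128, 64])
```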
- Clone Repo

```bash
git clone https://github.com/mcpaulgeorge/WalMaFa.git
```
We provide the pre-trained models:
- WalMaFa trained on LOL [Google drive | Baidu drive]
You can directly test the pre-trained model as follows:
- Modify the paths to the dataset and the pre-trained model.

```python
# Testing parameters
input_dir   # the path of the input data
result_dir  # the save path of the results
weights     # the weight path of the pre-trained model
```
- Test the model on the LOL dataset

Specify the data path `input_dir`, the save path `result_dir`, and the pre-trained weight path `weights`, then run:

```bash
python test.py --input_dir your_data_path --result_dir your_save_path --weights weight_path
```
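If you also want to score the saved results, a hedged sketch using the `scikit-image` metrics already installed above might look as follows; the folder paths (including `your_gt_path`) are placeholders for your own result and ground-truth directories:

```python
import cv2
from glob import glob
from natsort import natsorted
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

results = natsorted(glob("your_save_path/*.png"))
targets = natsorted(glob("your_gt_path/*.png"))

psnr_sum = ssim_sum = 0.0
for res_path, gt_path in zip(results, targets):
    res = cv2.imread(res_path)  # uint8 BGR image
    gt = cv2.imread(gt_path)
    psnr_sum += peak_signal_noise_ratio(gt, res, data_range=255)
    ssim_sum += structural_similarity(gt, res, channel_axis=2, data_range=255)

print(f"PSNR {psnr_sum / len(results):.2f}  SSIM {ssim_sum / len(results):.4f}")
```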
- Download the training and testing datasets.
- To train WalMaFa, run:

```bash
python train.py -yml_path your_config_path
```
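To inspect a config before training, a minimal sketch using `yacs` (installed above) could be the following; the printed keys are whatever `your_config_path` defines, not names we know:

```python
from yacs.config import CfgNode

with open("your_config_path") as f:
    cfg = CfgNode.load_cfg(f)  # parse the YAML file into a yacs config node
print(cfg)                     # inspect dataset paths, epochs, learning rate, etc.
```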
This implementation is based on / inspired by:
- LLFormer: https://github.com/TaoWangzj/LLFormer
- RetinexFormer: https://github.com/caiyuanhao1998/Retinexformer
- SNR: https://github.com/dvlab-research/SNR-Aware-Low-Light-Enhance
- IAT: https://github.com/cuiziteng/Illumination-Adaptive-Transformer
If you find WalMaFa helpful, please cite our paper:
```bibtex
@InProceedings{Tan_2024_ACCV,
    author    = {Tan, Junhao and Pei, Songwen and Qin, Wei and Fu, Bo and Li, Ximing and Huang, Libo},
    title     = {Wavelet-based Mamba with Fourier Adjustment for Low-light Image Enhancement},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {3449-3464}
}
```