Texture enhancement without HQ event signals #23

Open
zhangyb opened this issue Oct 3, 2024 · 5 comments
zhangyb commented Oct 3, 2024

Hi @DachunKai ,

Congratulations on your impressive work, and thank you for sharing it!

I have a question about using HQ event data with LQ sources for texture enhancement. In my tests, I achieved great results using HQ events derived from the city HQ source (704x576) to perform VSR on the city LQ source (176x144).

Comparison with Topaz VSR software using the LQ source (video attachment): city-compare-topaz-evtexture.mov

However, when applying the same HQ events to the HQ source itself (704x576), the benefits were less pronounced.

Is this outcome expected, or could I be overlooking something in the process? I’d appreciate your insights.

Thanks again for your work!

zhangyb changed the title from "Texture enhancement without HR event signals" to "Texture enhancement without HQ event signals" on Oct 3, 2024
DachunKai (Owner) commented

@zhangyb Hi, I'm glad to hear about your interest in our work! It's exciting to see your experiment comparing our EvTexture with the Topaz video enhancer in your demo.

Before I address your question, I have one of my own about the resolution requirements for our network's input events and images. According to EvTexture_arch.py, our model needs both inputs to have the same resolution, either 176x144 or 704x576. How did you manage to combine high-quality (HQ) events at 704x576 with low-quality (LQ) images at 176x144? Our test data only includes LQ events at 176x144. Could you explain how you obtained HQ events at 704x576?

For your second question, when you mention using the same HQ events with the HQ source (704x576), are you referring to upscaling 704x576 to a 2816x2304 video format?


zhangyb commented Oct 3, 2024

Hi @DachunKai ,

Thank you for your response.

To clarify, I obtained the HQ events (704x576) by passing the HQ images (704x576) to the event simulator (esim_py), and then downsampled the events to match the LQ resolution (176x144). Both the LQ images and the downsampled event data at 176x144 were passed to the model for inference.
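
For reference, here is a minimal sketch of that pipeline, assuming esim_py's EventSimulator interface; the contrast thresholds, file paths, and folder layout are illustrative, not the exact values I used:

```python
# Sketch: simulate HQ events from 704x576 frames with esim_py, then spatially
# downsample the event coordinates to the 176x144 LQ grid.
import numpy as np
import esim_py

Cp, Cn = 0.2, 0.2          # positive/negative contrast thresholds (assumed values)
refractory_period = 1e-4   # minimum waiting period per pixel, in seconds
log_eps = 1e-3             # epsilon for numerical stability in the log intensity
use_log = True

esim = esim_py.EventSimulator(Cp, Cn, refractory_period, log_eps, use_log)

# Simulate events from the HQ (704x576) frame folder; 'timestamps.txt' lists one
# timestamp (in seconds) per frame. Returns an N x 4 array with columns [x, y, t, p].
events_hq = esim.generateFromFolder('city_hq_frames/', 'city_hq_frames/timestamps.txt')

# Spatial downsampling: 704x576 -> 176x144 is a factor of 4 in each dimension,
# so divide the pixel coordinates by 4 (events mapping to the same LQ pixel pile up).
scale = 4
events_lq = events_hq.copy()
events_lq[:, 0] = np.floor(events_hq[:, 0] / scale)   # x
events_lq[:, 1] = np.floor(events_hq[:, 1] / scale)   # y

# events_lq is then voxelized on the 176x144 grid and fed to EvTexture
# together with the 176x144 LQ frames.
```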

Regarding my second question, yes, I was talking about using the same HQ events and HQ images at 704x576 to upscale the video to 2816x2304.

Thanks again for your time and insights!

DachunKai (Owner) commented

Hi @zhangyb ,

I tested it just now, and yes, our EvTexture can super-resolve the city clip from 704x576 to 2816x2304. Your process is likely correct, and my test results also didn't meet my expectations 🥹. I've uploaded the inference results (704x576 to 2816x2304) to Baidu Cloud (n8hg), and since no ground truth is available, I calculated the NIQE metric instead.
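
For anyone who wants to reproduce the no-reference evaluation, here is a minimal sketch assuming BasicSR's calculate_niqe (EvTexture is built on BasicSR); the results folder name is illustrative:

```python
# Sketch: average NIQE over the super-resolved frames (lower is better).
import glob
import cv2
from basicsr.metrics import calculate_niqe

scores = []
for path in sorted(glob.glob('results/city_2816x2304/*.png')):
    img = cv2.imread(path)  # BGR, uint8, range [0, 255], HWC
    scores.append(calculate_niqe(img, crop_border=0, input_order='HWC', convert_to='y'))

print(f'Average NIQE over {len(scores)} frames: {sum(scores) / len(scores):.4f}')
```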

I suspect there are two main reasons why our EvTexture is not performing satisfactorily:

  1. Super-resolving the original city clip from 704x576 to 2816x2304 is akin to a real-world VSR task, like the one addressed by RealBasicVSR. We followed BasicVSR and BasicVSR++ and trained our model assuming only bicubic degradation. I suspect that if we included more types of degradation in the training data, as real-world VSR models do, our model would perform better in this scenario.

  2. During training, our EvTexture upscaled 64x64 patches to 256x256, so there is a resolution gap when testing from 704x576 to 2816x2304. I suspect that introducing larger-resolution patches during training could significantly help with testing on such high-resolution videos, but this would be GPU-intensive.

Regardless, thank you very much for identifying this limitation. We hope to adjust our training data in the future, for example by introducing more degradation types or using larger patches, to improve the model.


zhangyb commented Oct 4, 2024

Hi @DachunKai ,

Thank you for sharing your test results on the 704x576 clip! Unfortunately, I can’t access Baidu Cloud. Would it be possible for you to upload them to Google Drive or another service?

Regarding your second point, I experimented with slicing the 704x576 source into 64x64 patches before running them through the model and then reconstructing the upscaled output from the patches. I found no significant difference between the sliced approach and the non-sliced source, except for edge artefacts observed in a separate experiment using the 176x144 source. These artefacts were mitigated by overlapping patches, which also reduced LPIPS slightly, though the effect was minor.
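
For clarity, a rough sketch of the overlapping tiling I mean is below; `run_patch` is a placeholder for however EvTexture is actually invoked on a (patch, events) pair, and the patch size, overlap, and scale factor are just the values from my experiment:

```python
# Sketch: tile a frame into 64x64 patches with a small overlap, super-resolve
# each patch by 4x, and blend overlapping regions by averaging.
import torch

def tiled_sr(frame, run_patch, patch=64, overlap=8, scale=4):
    """frame: (C, H, W) tensor; run_patch: callable mapping a (C, patch, patch)
    tensor to a (C, patch*scale, patch*scale) tensor.
    Returns the blended (C, H*scale, W*scale) output."""
    c, h, w = frame.shape
    out = torch.zeros(c, h * scale, w * scale)
    weight = torch.zeros_like(out)
    step = patch - overlap
    for y in range(0, h - overlap, step):
        for x in range(0, w - overlap, step):
            # Clamp the last tile in each direction so it ends exactly at the border.
            y0, x0 = min(y, h - patch), min(x, w - patch)
            sr = run_patch(frame[:, y0:y0 + patch, x0:x0 + patch])
            ys, xs = y0 * scale, x0 * scale
            out[:, ys:ys + patch * scale, xs:xs + patch * scale] += sr
            weight[:, ys:ys + patch * scale, xs:xs + patch * scale] += 1
    return out / weight  # average wherever tiles overlap
```

With overlap=0 this reduces to the non-overlapping case; increasing the overlap trades extra compute for smoother seams, which matches the small LPIPS change I observed.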

In case you are interested, I’ve uploaded my inference results for the 704x576 source sliced into 64x64 patches, upscaled to 256x256, and then reconstructed to 2816x2304 here: Google Drive link.

Out of curiosity, I also compared Topaz's results with EvTexture's and juxtaposed the image differences against the event voxels.
Difference between Topaz VSR and EvTexture (704x576): image diff vs. voxels (video attachment)

Thanks again for all your work on EvTexture. I’m glad to contribute feedback and am excited to see how the model evolves with further training and improvements.

DachunKai (Owner) commented

Hi, @zhangyb

You can access the VSR results of the Vid4 dataset, upscaled from 704x576 to 2816x2304, in the following Google Drive link.

Furthermore, I find your observations quite interesting. Utilizing patches may indeed be a viable approach, and your experiments with non-overlapping and overlapping patches are insightful. There is likely room for further exploration of spatial patch processing and of how training-data augmentation affects such cases.

Thank you for your highly valuable feedback. We look forward to further exploration and improvements in our VSR model.
