Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for non-ASCII filenames under Windows #136

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Cvelth
Copy link

@Cvelth Cvelth commented Aug 23, 2020

This library uses const char* (std::string for C++ function) as an input type for filenames.
This works great under *nix systems, but fails to access files with non-ANSI characters in their paths when used under Windows (MSVC). This is caused by the fact that Windows uses utf-16 internally (instead of *nix's utf-8), characters of which cannot be stored in 8 bytes (char). To fix this issue, wchar_t needs to be used.

So, I implemented overloads for every function accepting filename, which are only available with MSVC (if _MSC_VER is defined in the environment) toolkit under Windows (if _WIN32 is defined), if LODEPNG_NO_COMPILE_DISK is not defined.

These overloads can, additionally be disabled by defining LODEPNG_NO_WIDE_CHAR_OVERLOADS.

I'm open to any suggestions and will try to implement additional fixes to the PR if proposed.

Looking forward to the reply.

P. S. I've additionally took liberty to fix (just run regex-powered auto-replace) an inconsistency in #endif commentaries where some of which were defined as #endif /* LODEPNG_COMPILE_DISK */ even though most of them were formatted as #endif /*LODEPNG_COMPILE_DISK*/.

@Cvelth Cvelth changed the title Add support for non-ANSI filenames under Windows Add support for non-ASCII filenames under Windows Aug 31, 2020
@mmp
Copy link

mmp commented Jun 2, 2021

Nice! Is there any chance of merging this, @lvandeve?

I was about to prepare a PR to support UTF8 encoded filenames on Windows for similar reasons but was going to follow the approach taken in stb_image, where char * filenames are optionally assumed to point to UTF8-encoded strings that are then converted to UTF16 before being passed to the corresponding Windows APIs that take UTF16 for filenames...

(I'm happy to go ahead and do that, if that would be preferable.)

@lvandeve
Copy link
Owner

Filenames and opening of files are platform dependent: a non-ANSI-C compatible function _wfopen is required for this.

lodepng uses pure ANSI C (except for the optional C++ parts) and tries to be as platform-independent as possible, even avoiding platform dependent #ifdefs where possible to avoid potential breakages (ANSI C is the biggest reliable constant here)

Perhaps it was a mistake to have filename-based functions at all inside of the library, as its main focus is to decode PNG images. It seemed like a good idea because C looks like it has the required functionality, except as it turns out not all platforms are compatible with it.

At this point, I'd rather add a warning to the ###_file functions that they don't work correctly with systems that use 16-bit filename characters, and eventually deprecate those functions, than widen the scope of the file loading for this.

lodepng can be used without the ###_file functions: Windows' API provides better functions for loading files in Windows, and then lodepng can be used to decode from the memory buffer. lodepng doesn't actually do anything efficient with files itself either: it also loads it into a memory buffer and then decodes from that

Would that work?

@Cvelth
Copy link
Author

Cvelth commented Jun 15, 2021

Thanks for your answer.

The thing is, for me personally, merging of this PR is not essential anymore, I used my forked version for a while, after which the project in question was moved to newer Windows version and as are result its code page could be converted to utf-8.

For those who are interested and/or encountered the same problem, you can read about it here. Take notice, it's only possible for applications built for Windows 10 Version 1903 (May 2019 Update) and newer.

P. S. Explicit warning are always a good idea, they will probably save someone painful debugging in the future.

@mmp
Copy link

mmp commented Jun 15, 2021 via email

@MarkCallow
Copy link

If you want to do anything about this, I recommend the approach, apparently taken in stb_image, described in an earlier comment by @mmp. I.e document that filenames are utf-8 and on Windows convert before calling the open functions. I have used this approach in my own application and library. It is far easier and cleaner than function overloads and dealing with Windows text-related macros.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants