GHC-supplied hsc2hs executable fails if non-ISO/IEC 8859-1 (Latin-1) code points are in the path #96

Open
mpilgrem opened this issue Dec 13, 2024 · 2 comments

mpilgrem commented Dec 13, 2024

This issue applies to the executable provided with, at least, GHC 9.4.8, 9.6.6, 9.8.4 and 9.10.1.

On Windows 11 in Windows Terminal, with Hebrew characters (a right-to-left script) in the path:

❯ D:\שזדס\sr-test\programs\x86_64-windows\ghc-9.8.4\bin\hsc2hs.exe --verbose --cc=D:\שזדס\sr-test\programs\x86_64-windows\ghc-9.8.4\mingw\bin\clang.exe --ld=D:\שזדס\sr-test\programs\x86_64-windows\ghc-9.8.4\mingw\bin\clang.exe -o Dummy.hs Dummy.hsc
Executing: (@./\hsc7C24.rsp) D:\\שזדס\\sr-test\\programs\\x86_64-windows\\ghc-9.8.4\\mingw\\bin\\clang.exe -c ./Dummy_hsc_make.c -o ./Dummy_hsc_make.o
hsc2hs-ghc-9.8.4.exe: fd:3: hGetContents: invalid argument (cannot decode byte sequence starting from 233)

Dummy_hsc_make.c is created and starts:

#include "D:\����\sr-test\programs\x86_64-windows\ghc-9.8.4\lib\template-hsc.h"

With Cyrillic characters (a left-to-right script) in the path:

❯ D:\Майк\sr-test\programs\x86_64-windows\ghc-9.8.4\bin\hsc2hs.exe --verbose --cc=D:\Майк\sr-test\programs\x86_64-windows\ghc-9.8.4\mingw\bin\clang.exe --ld=D:\Майк\sr-test\programs\x86_64-windows\ghc-9.8.4\mingw\bin\clang.exe -o Dummy.hs Dummy.hsc
Executing: (@./\hsc2E30.rsp) D:\\Майк\\sr-test\\programs\\x86_64-windows\\ghc-9.8.4\\mingw\\bin\\clang.exe -c ./Dummy_hsc_make.c -o ./Dummy_hsc_make.o
compiling ./Dummy_hsc_make.c failed (exit code 1)
rsp file was: "./\\hsc2E30.rsp"
command was: D:\\Майк\\sr-test\\programs\\x86_64-windows\\ghc-9.8.4\\mingw\\bin\\clang.exe -c ./Dummy_hsc_make.c -o ./Dummy_hsc_make.o
error: ./Dummy_hsc_make.c:1:10: fatal error: 'D:\09:\sr-test\programs\x86_64-windows\ghc-9.8.4\lib\template-hsc.h' file not found
#include "D:\<U+001C>09:\sr-test\programs\x86_64-windows\ghc-9.8.4\lib\template-hsc.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
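
The mangled directory name in the clang error is consistent with each character of the path being reduced to the low 8 bits of its code point. The sketch below only illustrates that assumed model; it is not code from hsc2hs, and `lowByte` is a hypothetical helper.

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord)

-- Hypothetical helper: keep only the low 8 bits of a code point,
-- i.e. the code point modulo 256 (assumed model of the corruption).
lowByte :: Char -> Char
lowByte c = chr (ord c .&. 0xFF)

main :: IO ()
main = do
  -- Cyrillic directory name from the report
  print (map (ord . lowByte) "Майк")   -- [28,48,57,58]
  -- Hebrew directory name from the report
  print (map (ord . lowByte) "שזדס")   -- [233,214,211,225]
```

The first list is U+001C followed by '0', '9' and ':', which matches the `<U+001C>09:` shown by clang. The second list starts with 233, the byte named in the hGetContents error for the Hebrew path.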

Dummy.hsc is simply:

module Dummy where

dummy :: IO ()
dummy = pure ()

The expected behaviour is that hsc2hs handles all valid paths on platforms supported by GHC.

(The context is that a Stack user reported this as an issue.)

mpilgrem commented Dec 14, 2024

I think the general issue is not Windows-specific. With Ubuntu 24.04.1 LTS (via WSL2) in Windows Terminal:

$ /home/mpilgrem/.stack/שזדס/programs/x86_64-linux/ghc-tinfo6-9.8.4/bin/hsc2hs --verbose --cc=/usr/bin/gcc --ld=/usr/bin/gcc -o Dummy.hs Dummy.hsc
Executing: (@./hsc2hscall56604-0.rsp) /usr/bin/gcc -c ./Dummy_hsc_make.c -o ./Dummy_hsc_make.o -I/home/mpilgrem/.stack/שזדס/programs/x86_64-linux/ghc-tinfo6-9.8.4/include/include/
hsc2hs-ghc-9.8.4: fd:6: hGetContents: invalid argument (cannot decode byte sequence starting from 233)

Dummy_hsc_make.c is created and starts:

#include "/home/mpilgrem/.stack/����/programs/x86_64-linux/ghc-tinfo6-9.8.4/lib/ghc-9.8.4/lib/template-hsc.h"

mpilgrem commented Dec 14, 2024

I think part of the problem could be as simple as this: DirectCodegen.outputDirect uses Common.writeBinaryFile which, in turn, uses System.IO.withBinaryFile. A binary-mode handle writes each Char as a single byte, which effectively restricts the String being written to ASCII and the ISO/IEC 8859-1 (Latin-1) extension.
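
A minimal sketch of that behaviour, assuming only the documented System.IO semantics: a binary-mode handle behaves like the `char8` encoding, which takes each code point modulo 256. The helper names `writeBinary`, `writeUtf8` and `bytesWritten` and the demo file names are hypothetical. The UTF-8 variant is one possible direction, not necessarily the fix hsc2hs should adopt.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO

-- Write a String through a binary-mode handle, as withBinaryFile does.
writeBinary :: FilePath -> String -> IO ()
writeBinary path s = withBinaryFile path WriteMode (\h -> hPutStr h s)

-- Alternative (assumption): a text handle with an explicit UTF-8 encoding
-- and no newline translation, so only the character encoding differs.
writeUtf8 :: FilePath -> String -> IO ()
writeUtf8 path s = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hSetNewlineMode h noNewlineTranslation
  hPutStr h s

-- Write with the given writer, then read back the raw bytes on disk.
bytesWritten :: (FilePath -> String -> IO ()) -> FilePath -> String -> IO [Word8]
bytesWritten write path s = do
  write path s
  B.unpack <$> B.readFile path

main :: IO ()
main = do
  let dir = "Майк"
  bytesWritten writeBinary "binary-demo.txt" dir >>= print
  -- [28,48,57,58]
  bytesWritten writeUtf8 "utf8-demo.txt" dir >>= print
  -- [208,156,208,176,208,185,208,186]
```

The binary-mode output is four bytes that no longer identify the original characters, while the UTF-8 output keeps two bytes per Cyrillic letter and can be decoded back to the same path.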

mpilgrem changed the title from "GHC-supplied hsc2hs executable fails if non-ASCII characters are in the path" to "GHC-supplied hsc2hs executable fails if non-ISO/IEC 8859-1 (Latin-1) code points are in the path" on Dec 14, 2024