Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Happy-generated code fails with Internal Happy error with -XStrictData #273

Closed
BjoernLoetters opened this issue Mar 14, 2024 · 11 comments · Fixed by #289
Closed

Happy-generated code fails with Internal Happy error with -XStrictData #273

BjoernLoetters opened this issue Mar 14, 2024 · 11 comments · Fixed by #289

Comments

@BjoernLoetters
Copy link

BjoernLoetters commented Mar 14, 2024

I am new to Happy and want to use it for an (old school) introductory course on compiler engineering. Because I repetitively got the very uninformative Internal Happy error error, I tested my whole setup with a minimal example. Surprisingly, it still fails with Internal Happy error.

Here is my setup:

module Main where

import Token
import Parser

main :: IO ()
main = do
  result <- parse [ VAR ]  
  print result -- should print 0
  return ()
module Token where

data Token = VAR deriving (Show, Eq)
{
module Parser where 

import System.IO
import Token
}

%name parse
%tokentype { Token }
%error { parseError }
%monad { IO } { >>= } { return }

%token
  VAR     { VAR }

%%

program : VAR { 0 }

{
parseError :: [Token] -> IO a
parseError _ = fail "something went wrong" -- I never see this error message ...
}

After diving a bit into the code I saw that parse invokes happyParse start_state = happyNewToken start_state notHappyAtAll notHappyAtAll with start_state = 0#. It seems like that somewhere deep inside the machinery the second notHappyAtAll is actually evaluated by Haskell, since it is accessed as a HappyStk here:

happyFail explist 0# tk old_st _ stk@(x `HappyStk` _) =
     let i = (case Happy_GHC_Exts.unsafeCoerce# x of { (Happy_GHC_Exts.I# (i)) -> i }) in
--      trace "failing" $ 
        happyError_ explist i tk

As far as I understand, this means that Happy fails with an empty stack - probably right before even reading the first token.

Here is some information about my environment:

  • Late 2023 Mac with the M3 apple silicon
  • ghc --version prints 9.4.8
  • cabal --version prints 3.10.2.1
  • happy --version prints 1.20.1.1

My build.cabal:

cabal-version:      3.0
build-type:         Simple

...

executable HopefullyHappySoon:
  hs-source-dirs:     src
  main-is:            Main.hs
  other-modules:      Token,
                      Parser
  build-depends:      base ^>= 4.17.2.1, 
                      array >= 0.5.4.0
  build-tool-depends: happy:happy >= 1.20.1.1 -- Just to be sure ...
  default-language:   GHC2021

I appreciate any kind of help in this situation. This issue is also related to #241.

@BjoernLoetters
Copy link
Author

Just a thought that came to my mind right now: May this be an issue with some low level code failing? I run Happy on a Mac with Apple Silicon ...

@andreasabel
Copy link
Member

andreasabel commented Mar 14, 2024

I am assuming you are using the latest release of happy, 1.20.1.1.

$ happy --version
Happy Version 1.20.1.1 

On my old-school Mac, your example prints 0 just like expected, no internal error.

My package.yaml for this project:

executable:
  source-dirs: .
  main: Main

dependencies:
- base
- array

build-tools:
- happy

Also just checking: what is your GHC version?
See: https://www.haskell.org/ghc/blog/20210309-apple-m1-story.html

@andreasabel andreasabel added the info needed More information needed from the reporter label Mar 14, 2024
@BjoernLoetters
Copy link
Author

Thanks for your quick response and pardon the missing informations:

  • Late 2023 Mac with the M3 apple silicon
  • ghc --version prints 9.4.8
  • cabal --version prints 3.10.2.1
  • happy --version prints 1.20.1.1

My build.cabal:

cabal-version:      3.0
build-type:         Simple

...

executable HopefullyHappySoon:
  hs-source-dirs:     src
  main-is:            Main.hs
  other-modules:      Token,
                      Parser
  build-depends:      base ^>= 4.17.2.1, 
                      array >= 0.5.4.0
  build-tool-depends: happy:happy >= 1.20.1.1 -- Just to be sure ...
  default-language:   GHC2021

Please let me know if there is still information missing.

@BjoernLoetters
Copy link
Author

As the example works for you, I suppose it is somehow related to my environment. So, I tried the following steps in addition:

  • I installed LLVM@15 (llc --version prints 15.0.7) as recommended by ghcup ("On Darwin M1 you might also need a working llvm installed (e.g. via brew) and have the toolchain exposed in PATH.")
  • I reinstalled ghc-9.4.8
  • I switched to the slightly newer ghc-9.8.2

@andreasabel
Copy link
Member

Thanks for the extra information. This looks good in general, so I do not see an obvious point of attack.

Unfortunately, since I cannot produce it locally, I cannot go hunting for the problem myself.
I suggest you shrink the MWE by inlining the happy generated Haskell and then reducing it is further and further, making sure the same crash remains. Once it is down to a few lines of Haskell, maybe an issue could be reported to GHC, or it turns out that Happy is generating some code that breaks on the new M3 architecture.
Anyway, if you get such a minimal example, please post it here as well to keep me in the loop.

@BjoernLoetters
Copy link
Author

BjoernLoetters commented Mar 15, 2024

I followed your advice and tracked down the error by reducing the code in a call-by-name fashion. This was quite an adventure and the outcome is somewhat frustrating. TL;DR: I should have copied my whole .cabal-file into this issue (lesson learned!), instead of just writing down the "seemingly relevant" parts.

So, the story is as follows:
I had the StrictData extension enabled somewhere in the .cabal-file and instead of creating a new .cabal-file for the minimal example, I just copied this flag all along and did not notice it is still there, only sharing the "relevant" (as I thought) excerpt of the file with you.
The problem became apparent to me as soon as I rewrote data HappyStk a = HappyStk a (HappyStk a) to data HappyStk a = HappyStk a ~(HappyStk a). Without this lazy annotation the notHappyAtAll error is (of course 😮‍💨) evaluated when the code hits the pattern HappyStk _ _ in my example. Disabling the StrictData extension fixes this issue as well and the parser runs just fine.

Since Happy is a parser generator whose output must be compiled by the user, I think it would be great to mention this somewhere in the documentation (I could at least not find anything related to StrictData in the HTML documentation with a full-text search). Another approach would be to make Happy strict-safe. At the moment, it seems like it's only the notHappyAtAll error which should not be evaluated strictly. Since this error is not very informative anyway, it may be an option to handle the empty stack situations in a different way, not relying on Haskell's default laziness.

Anyway, thank you for your help and time!

@andreasabel
Copy link
Member

Oh, sorry to hear you had a long wild goose chase...

It would be nice if we could rewrite happy to make streams explicitly lazy:

data HappyStk a = HappyStk a ~(HappyStk a)

This is currently not possible without turning on StrictData, see this recent GHC issue:

Currently, we would have to turn on StrictData in the generated parser, which could have unforeseen consequences for the user code woven into the parser.

Of course, we could use ML-style explicit laziness:

data HappyStk a = HappyStk a (() -> HappyStk a)

which is a slap in the face of any Haskeller, haha.
One should investigate whether this would have some performance penalties, because we are replacing call-by-need with call-by-name.

@andreasabel
Copy link
Member

What we could do now is to add {-# LANGUAGE NoStrictData #-} to happy-generated code.

@andreasabel andreasabel added enhancement documentation and removed info needed More information needed from the reporter labels Mar 18, 2024
@andreasabel andreasabel changed the title Internal Happy error with a minimal example Happy-generated code fails with Internal Happy error with -XStrictData Mar 18, 2024
@andreasabel
Copy link
Member

The happy template defines two stream types that never make sense with StrictData:

data Happy_IntList = HappyCons FAST_INT Happy_IntList

data HappyStk a = HappyStk a (HappyStk a)

So I propose to add {-# LANGUAGE NoStrictData #-} to the template.
@int-index @Ericson2314 What do you think?

@donovancrichton
Copy link

donovancrichton commented Jun 13, 2024

Just confirming that I had this same issue and solved it by removing the StrictData extension from my cabal file. Thanks for writing this up!

@phadej
Copy link
Contributor

phadej commented Sep 13, 2024

It would be nice if we could rewrite happy to make streams explicitly lazy:

https://gitlab.haskell.org/ghc/ghc/-/issues/24455

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants