Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub CI: Add GHC 9.12 to the matrix #754

Merged
merged 2 commits into from
Jan 1, 2025
Merged

Conversation

quark17
Copy link
Collaborator

@quark17 quark17 commented Dec 20, 2024

No description provided.

@quark17
Copy link
Collaborator Author

quark17 commented Dec 20, 2024

The testsuite has this failure with GHC 9.12.1, due to dictionary definitions being reordered in a test that compares expected ISyntax:

--- DummyInDeflQual.bs.bsc-out.expected 2024-12-20 20:13:30.190107623 +0000
+++ DummyInDeflQual.bs.bsc-out  2024-12-20 21:20:56.945147002 +0000
@@ -21,12 +21,12 @@
            /\ (_tctyvar1063 :: *) .
            /\ (_tctyvar1064 :: *) .
            \ (_tcdict1050 :: Prelude.Literal _tctyvar1063) .
-           let _tcdict1042 :: Prelude.Literal (Prelude.Bit 2)
-               _tcdict1042 = Prelude.Prelude.Literal~Prelude.Bit~n= ·2
-               -- IdProp: _tcdict1042[IdP_bad_name,IdPDict]
-           in  let _tcdict1032 :: Prelude.Eq (Prelude.Bit 2)
+           let _tcdict1032 :: Prelude.Eq (Prelude.Bit 2)
                    _tcdict1032 = Prelude.Prelude.Eq~Prelude.Bit~n= ·2
                    -- IdProp: _tcdict1032[IdP_bad_name,IdPDict]
+           in  let _tcdict1042 :: Prelude.Literal (Prelude.Bit 2)
+                   _tcdict1042 = Prelude.Prelude.Literal~Prelude.Bit~n= ·2
+                   -- IdProp: _tcdict1042[IdP_bad_name,IdPDict]
                in  \ (x :: _tctyvar1064) .
                    PrimIf
                      ·_tctyvar1063

Both Ubuntu and macOS VMs have this same behavior. It would be worth investigating how this order is generated, and whether BSC could be doing something to enforce a canonical order (or if it is, and there's a bug in GHC 9.12.1). Alternatively, the test could be adjusted to not care about this order, but it'd still be worth understanding why the order changed, first.

@quark17
Copy link
Collaborator Author

quark17 commented Dec 21, 2024

The testcase dumps the "internal" stage (with -dinternal). Using -dall (and -KILLinternal to stop after the "internal" stage), we can get the outputs from all previous stages, to try to identify where the difference begins. BSC compiled with GHC 9.6.6 and BSC compiled with GHC 9.12.1 have identical output on this example up until the "internal" stage; so the difference is introduced there.

In fact, the definitions are in the correct order prior to "internal". For example, here are the defs dumped by the immediately prior stage ("simplified"):

let { _tcdict1042 :: Prelude.Literal (Prelude.Bit 2);
      _tcdict1042 =  Prelude.Prelude.Literal~Prelude.Bit~n ·2;
      _tcdict1032 :: Prelude.Eq (Prelude.Bit 2);
      _tcdict1032 =  Prelude.Prelude.Eq~Prelude.Bit~n ·2;

I would guess that the re-ordering happens from the use of SCC.tsort in IConv.iConvLet. A comment on that function says that it is not a stable sort (although it has always seemed to not reorder the list unnecessarily).

@quark17
Copy link
Collaborator Author

quark17 commented Dec 22, 2024

It's the graph argument to tsort that has changed, so tsort is off the hook. The graph that gets constructed is:

[ (_tcdict1042, []),
  (_tcdict1032, []),
  (_tctemp1065, [_tcdict1042, _tcdict1032]) ]

showing that _tctemp1065 depends on the other two. With GHC 9.12.1, that dependency list is swapped:

[ (_tcdict1042, []),
  (_tcdict1032, []),
  (_tctemp1065, [_tcdict1032, _tcdict1042]) ]

The dependency list comes from calling CFreeVars.getFVDl to get the free variables from the definition. And that uses Data.Set to collect up the names, and then Set.toList to return a List -- which uses the Ord instance, so the order of the list depends on that.

The Ord instance for Id is ultimately derived from the instance for SpeedyString. These are represented with a unique number in the order that they were created, and the Ord comparison is based on that unique number. So something must have changed in the execution order of the typecheck stage. I don't know if it's worth tracking down why it changed -- I guess it could be a bug in GHC, but probably not? it's probably just taking a different legal path?

If we want, we could sort the graph dependencies in alphabetical order, which would give the defs a canonical order in the linearized output. This isn't a concern for users -- we sort defs in the final Verilog output, for example -- this would only make things easier for developers or the testsuite, to be able to expect a canonical order of internal dumps. So maybe we just do the alternative, which is to update the test to not care about the order of the defs.

@quark17
Copy link
Collaborator Author

quark17 commented Dec 24, 2024

The discrepancy is created during compilation of the Prelude -- different numbers end up being used for variable names, which means that the Prelude.bo file used to contain a reference to _tcdict1042 but no longer does with GHC 9.12.1. So when BSC reads in the imports, it creates an ID with the name _tcdict1042 before it creates a reference for _tcdict1032, causing that to be the sorting order. With newer BSC, no _tcdict1042 name is in Prelude.bo, so no ID is created for name prior to _tcdict1032 being created, so the opposite order is created.

I have not identified why this discrepancy shows up. But name reordering seems to first show up in the syminitial stage, where (at least with dumping turned on) the order typeclass instances are listed in slightly different orders. The typecheck stage uses one list of fresh integers, but if it processed a list in a different order, then different _tcdict and _tctyvar numbers would end up in the final result (in the .bo file). The instance list is created in MakeSymTab.getQInsts, which uses tsort on a graph of instances. So I'm investigating that next.

@quark17
Copy link
Collaborator Author

quark17 commented Dec 25, 2024

When making the symbol table (MakeSymTab), getCls is called for every class, and it is passed a list of instances (qts :: QInst). The instances include names for the types, like "Prelude.Maybe" -- these strings haven't been created before, because the Id for the type would have Prelude and Maybe in separate fields (for qualifier and base name) and thus as separate strings. So the names in qts are first created when those elements of qts are evaluated.

Inside getCls, it calls getQInsts to create a list of instances, sorted by concreteness. This is done by creating a graph and calling tsort. The result of tsort depends on the Ord order of the instance names, which is the creation order.

Somehow, with GHC 9.12.1, the items in qts are being evaluated in a different order, or a different order relative to the evaluation of the order that the different getQInsts calls are being made. For example, the first getQInsts to be evaluated has this graph order:

[((QInst Prelude Prelude.Monad Prelude.ActionValue), []),
 ((QInst Prelude Prelude.Monad Prelude.Maybe), []),
 ((QInst Prelude Prelude.Monad Prelude.Module), []),
 ((QInst Prelude Prelude.Monad Prelude.List), [])]

but the result of tsort is this order:

[(QInst Prelude Prelude.Monad Prelude.Maybe),
 (QInst Prelude Prelude.Monad Prelude.ActionValue),
 (QInst Prelude Prelude.Monad Prelude.List),
 (QInst Prelude Prelude.Monad Prelude.Module)]

and that's presumably because the strings were created in that swapped order.

If I force evaluation of the qts before any calls to getQInsts, then it goes back to being the right order.

The making of the symbol table is a recursive process, so I'm unable to insert traces to get a better idea of why this order is happening. I don't know if it's an issue that should be reported to the GHC team, or maybe it's an intended optimization -- although it's hard to see why.

Also, this alone doesn't seem to be responsible for all of the _tcdict differences. So there must be something else going on.

I'm unsure what's the best way to proceed for the testsuite. I'm not sure how to make the test look for only a portion of the ISyntax dump; I think that comparing the full ISyntax dump is safest, but then we need to enforce a canonical order for the defs in the ISyntax. We could do that by wrapping the tsort nodes in typecheck to give them an Ord order as they appear. I don't know if that is extra overhead -- if it is, we could add a hidden flag for turning it on, to be used by the testsuite.

The 'compare_file' procedure could be updated to check whether the
'expected' argument is a list, but in the meantime, this adds a new
'compare_file_list' procedure that takes a list argument and passes
if any one in the list matches.

This is used in one test where BSC built with GHC 9.12.1 has _tcdict
defs in a different order.  The test now compares against both
possible outputs.
@quark17
Copy link
Collaborator Author

quark17 commented Dec 31, 2024

Offline, we discussed how to update the test to allow for this difference. While we could add a flag to force a canonical order to the defs, we didn't want to add overhead for users and we wanted to avoid a flag that is only used during testing (because then we wouldn't be testing what users see); so we opted to make the test accept multiple legal outputs. We chose to have two expected files, and update the compare function to pass if either one matches. It's possible to have one expected file and just mask out the lines that could be in either order (since they're not crucial to the test), but having two expected files seemed more robust. The test could have selected one expected files based on the GHC version, but trying both files in all cases is again more robust, to future versions of GHC and to changes in flags GHC -- for example, it remains to be seen whether changing the optimization flags to GHC can cause a change in this order.

I have pushed a commit that adds a second expected file and a new compare_file_list procedure.

@quark17 quark17 merged commit a67cac8 into B-Lang-org:main Jan 1, 2025
79 checks passed
@quark17 quark17 deleted the ci-ghc-9.12 branch January 1, 2025 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant