Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ramsay t/translation efficiency 2 #6746

Merged
merged 21 commits into from
Mar 4, 2025
Merged

Conversation

ramsay-t
Copy link
Contributor

I broke the other PR :( sorry. This one should be properly signed and mergable?

@ramsay-t ramsay-t added Metatheory No Changelog Required Add this to skip the Changelog Check labels Dec 12, 2024
@ramsay-t ramsay-t self-assigned this Dec 12, 2024
Copy link
Contributor

@ana-pantilie ana-pantilie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ramsay-t let me know when this is ready for review again.

@ramsay-t ramsay-t removed the No Changelog Required Add this to skip the Changelog Check label Feb 22, 2025
@ramsay-t
Copy link
Contributor Author

@ana-pantilie @Unisay I've done some fairly horrible things with our golden testing functions but it seems to work :) There are two examples - one currently certifying and one not - with a simple process to add more.

Yes, technically a test of the certifier ought to not invoke the optimiser and should use the AST stacks as input, rather than UPLC and calling the optimiser, but that would be really annoying to serialise and store. I would argue that if we change the optimiser enough to change the code it produces then we probably should re-read the certifier tests at least quickly to make we haven't changed anything the certifier should care about.

Comment on lines 2 to 10
Added
-----

- CI Tests for the certifier, using some simple unoptimised UPLC

Changed
-------

- Complete reordering of decisions in UntypedTranslation.lagda.md
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we add user-facing changes to the changelog, so I'd remove these two.

data Translation (R : Relation) { X : Set } {{_ : DecEq X}} : (X ⊢) → (X ⊢) → Set₁ where
istranslation : {ast ast' : X ⊢} → R ast ast' → Translation R ast ast'
var : {x : X} → Translation R (` x) (` x) -- We assume we won't want to translate variables individually?
data Translation (R : Relation) { X : Set } {{_ : DecEq X}} : (X ⊢) → (X ⊢) → Set₁
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this have no constructors now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, they've just moved to line 68 because it is co-inductive with the TransMatch definition, so the type signature has to be up here.

Comment on lines 60 to 61
[ testGroup "simple certification" $ makeTestTree simpleTests
, testGroup "failing certification" $ makeTestTree failingTests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good that we can run tests on CI now, but an important aspect of testing is communicating intent: why do we expect the len test to fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the comment above the definition on line 48 says: they "fail" in the sense that the certifier says "no" (currently). We will need to look into why that is, but you explicitly asked me to include this so we know the semantics aren't changed (until we do change them).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually you want to add tests which actually behave the way you expect them to. So this failing test should be transformed into a succeeding test, and commented out until it's fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps "failing" is the wrong name - we want some tests that don't certify. Otherwise we are just demonstrating that the certifier says "yes", and not that it actually checks things. I'm not sure what English word you would prefer to "failing" in that case? Rejected, or Negative?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know really we should have tests that we intend to be rejected, but I was hoping to avoid turning this into a solid week of test engineering ;) I shall do that though...

@ramsay-t ramsay-t requested a review from Unisay February 24, 2025 09:29
@Unisay
Copy link
Contributor

Unisay commented Feb 24, 2025

Please re-format the plutus-executables/test/certifier/Spec.hs with fourmolu

@@ -0,0 +1,62 @@
{-# LANGUAGE OverloadedStrings #-}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pragma is likely not needed as the language extension is already switched on by default in the cabal file.

@@ -0,0 +1,62 @@
{-# LANGUAGE OverloadedStrings #-}

{- | The tests in this file run tests of the uplc certifier. Various unoptimised UPLC is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please lets keep lines within the 100 character width.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...this is 89?

I'm all for the classic 80 character limit if you'd prefer, but then lets set the github commit hook to enforce that.

@ana-pantilie
Copy link
Contributor

Can you add the rest of the tests we were using to check the certifier manually? Like the ones generated by the uplc executable (factorial, fibonacci etc.).

Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine by me, but to repeat my previous point I currently do not believe this is the right solution long-term despite it making things faster short-term. More on that here.

@ramsay-t
Copy link
Contributor Author

Can you add the rest of the tests we were using to check the certifier manually? Like the ones generated by the uplc executable (factorial, fibonacci etc.).

The problem is they are randomly generated, so we can't do them as "golden" tests. Then we would have to decide what we are testing? Do we assume they all certify? Are we defining that as the last line of the response being "The compilation was successfully certified." - or do we actually set the exit code properly in the executable when certification fails?

@ana-pantilie
Copy link
Contributor

ana-pantilie commented Feb 25, 2025

@ramsay-t

The problem is they are randomly generated, so we can't do them as "golden" tests.

How are they randomly generated? They're just UPLC programs, we can add the UPLC code as you have with the other tests you added here.

Then we would have to decide what we are testing? Do we assume they all certify?

It doesn't make sense to have certifier integration tests which are expected to fail. That would mean that either the compiler or the certifier is broken in some way.

Are we defining that as the last line of the response being "The compilation was successfully certified." - or do we actually set the exit code properly in the executable when certification fails?

I think just checking the text is fine for now, we can improve that in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@effectfully this appears to have changed nothing in the CI :( perhaps you meant something different?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to import it, like it's done in this file with os-support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that fails fast at least :)

Comment on lines 84 to 89
-- These run but the certifier says "no"
rejectedTests :: [String]
rejectedTests =
[

]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make sense to keep this. It's dead code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it isn't its called on line 104 😸

srcTests :: [ String ]
srcTests =
[ "inc"
-- , len
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least add a comment describing why this test is not enabled.


srcTests :: [ String ]
srcTests =
[ "inc"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add the other tests as well as I asked? That should be very quick. I would normally be ok with you adding these in a separate PR, but since you decided to combine the testing framework with changes to the decision procedures as well, I will have to insist that you show that the changes to the decision procedures do not break any of the tests we have been running manually until now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are in the makeExample list....? Since they are generated from the executable it made more sense to let it generate fresh versions each time and certify those. We could include hardcoded instances or each of them, but the fresh ones are maybe more interesting?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that, sorry. Thanks for making the changes!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except: why do these never trigger the bug that the short “len” example triggers? They are probably already optimised and so are just returning an ID change all the time…. If we can make them unoptimised then they will be more useful but will currently fail (possibly randomly?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that the uplc examples are unoptimised. If I printed the factorial or whatever script, I remember seeing force ... delays that could definitely be reduced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the generated certificates, there does seem to be some changes happening, so these are meaningful tests - although they might be more meaningful if we used the plc examples and then fed those in so there was more work happening...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we were typing at the same time :) These are fine for now, although I could spend ages improving them!

@ramsay-t ramsay-t requested a review from Unisay February 27, 2025 12:16
@ramsay-t ramsay-t merged commit 87c89d6 into master Mar 4, 2025
8 checks passed
@ramsay-t ramsay-t deleted the ramsay-t/translation-efficiency-2 branch March 4, 2025 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants