♻️ refactor(gentest,base_types): Improved architecture, implement EthereumTestBaseModel, EthereumTestRootModel #901
Conversation
I like it a lot, really clean 🌟 Thanks so much for all the thought you're putting into this.
One comment about the name of the jinja2 filter (cool feature, btw). Feel free to either change it and merge, or simply merge as-is. Thanks!
Also like the idea of applying black formatting on the output.
I think the output format of the generated test could be made more human readable, which is why the PR is still in draft. For example, we should ensure addresses are represented as hex, and large numbers like v, r, s should perhaps be literal hex numbers, so that the output is consistent with manually written tests. How about:
I think the best way to achieve this is to override `__repr__`:

```diff
 class HexNumber(Number):
     """
     Class that helps represent an hexadecimal numbers in tests.
     """

+    def __repr__(self):
+        return f"0x{123}"

     def __str__(self) -> str:
         """
         Returns the string representation of the number.
         """
         return self.hex()
```
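To make that concrete, here is a minimal, self-contained sketch of the idea; the `Number`/`hex()` stand-ins below are simplified assumptions, not the actual EEST types:

```python
# Simplified stand-ins for illustration only; the real EEST Number/HexNumber
# classes carry more behaviour (validation, pydantic integration, etc.).
class Number(int):
    def hex(self) -> str:
        return hex(self)


class HexNumber(Number):
    def __repr__(self) -> str:
        # Emit the literal hex form so generated tests read like hand-written ones.
        return self.hex()

    def __str__(self) -> str:
        return self.hex()


v = HexNumber(27)
print(repr(v))  # 0x1b
print([v])      # [0x1b] -- containers pick up the nicer form via __repr__ too
```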
I would like you to weigh in on the effect of this across the codebase.
I'll set off with a broader topic, so we have the full picture. Bear with me!

All right, so getting nice representations would also really help us with the Test Case Reference docs, see the bytes objects in the test case parameters tables, for example here: https://ethereum.github.io/execution-spec-tests/main/tests/prague/eip6110_deposits/test_deposits/test_deposit.html

For the objects defined and only used in the test cases (which aren't part of EEST's types libraries), adding a […] For the objects that we do define in our test libraries, we get nice representations (of most objects, at least) when we write them as strings to the fixture JSON files. And, instead of just adding […]

Get some fixtures:

```console
uv run fill tests/istanbul/
```

Then play around in `uvx --with-editable . ipython`, for example:

```python
import rich
from ethereum_test_fixtures.file import Fixtures
from ethereum_test_base_types import to_json

fixtures = Fixtures.from_file("fixtures/state_tests/istanbul/eip1344_chainid/chainid/chainid.json", fixture_format=None)
fixtures.keys()
fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Istanbul-state_test]']
fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Istanbul-state_test]'].env
rich.print(fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Shanghai-state_test]'].env)
to_json(fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Shanghai-state_test]'].env.fee_recipient)
```
`execution-spec-tests/src/ethereum_test_base_types/json.py` (lines 10 to 21 in `7758f37`)
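For reference, the permalinked helper boils down to the single `model_dump` call quoted further down in this thread; a simplified sketch, with the exact signature assumed:

```python
from pydantic import BaseModel


def to_json(input: BaseModel) -> dict:
    """Serialize a pydantic model into a JSON-ready dict (simplified sketch)."""
    return input.model_dump(mode="json", by_alias=True, exclude_none=True)
```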
Just be aware that fill serializes from the test spec format (e.g. […]). I can take a deeper look myself and I'm happy to discuss and brainstorm, but if you have time and you're interested, have a play around!
I get your point. The […] However, […] Let me know if my understanding is correct. Meanwhile, let me think this through.
Yup, that's right. Yes, unfortunately pydantic doesn't deserialize to Python. Calling […] Perhaps @marioevz can chime in, if he has a better idea? 🙏 Sorry for the detour and that I couldn't give you a better answer earlier @raxhvl.
I have been marinating on this. I believe your intuition is correct: we should implement a unified serialization pipeline. This approach not only eliminates duplicate (tedious) effort but also ensures that the framework produces consistent data outputs, whether for fixtures or gentests. Fundamentally, gentest output should be as close to manual tests as possible.

Our goal is to serialize a Pydantic model to Python code. After trying out a few things, I have a solution: we can override `__repr_args__`. Here is a full example:

```python
from typing import List

from pydantic import BaseModel


class RecipeBaseModel(BaseModel):
    def __repr_args__(self):
        # Skip tags and None-valued fields
        attrs_names = self.model_dump(mode="json", exclude={"tags"}).keys()
        attrs = ((s, getattr(self, s)) for s in attrs_names)
        return [(a, v) for a, v in attrs if v is not None]


# Example child model
class Ingredient(RecipeBaseModel):
    name: str
    quantity: str
    tags: List[str]


# Example nested child model
class Recipe(RecipeBaseModel):
    title: str
    ingredients: List[Ingredient]


# Example usage
ingredient1 = Ingredient(name="Tomato", quantity="2 cups", tags=["vegetable", "fresh"])
ingredient2 = Ingredient(name="Basil", quantity="1 bunch", tags=["herb", "fresh"])
recipe = Recipe(title="Tomato Basil Salad", ingredients=[ingredient1, ingredient2])

print(repr(ingredient1))  # Ingredient(name='Tomato', quantity='2 cups')
print(repr(recipe))
# Recipe(title='Tomato Basil Salad', ingredients=[Ingredient(name='Tomato', quantity='2 cups'), Ingredient(name='Basil', quantity='1 bunch')])
```

I know this will take some effort to implement correctly, but I think the incentive to unify serialization will only shrink as the code diverges, given the refactoring involved. Let me know if we should pursue this or override `__repr__` per type instead.
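Continuing the example above, and assuming the generated source is run through black as this PR does, the repr output is already valid Python and formats cleanly (illustrative usage, not the actual gentest code):

```python
import black

# The repr can be dropped straight into a generated test file and formatted like
# any other source; black splits the long constructor call and normalizes quotes.
source = f"recipe = {recipe!r}\n"
print(black.format_str(source, mode=black.Mode()))
```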
Thanks @raxhvl for taking the time to look so deeply into this! This looks like a great approach to unify our serialization, and I think we can start adding […] If I understand correctly, the code paths for serializing the JSON fixture files would remain unchanged? But now we use […]
As we discussed, whenever you have time, feel free to just go ahead and add either: […]

As we discussed, doing 1. would be cleaner and probably easier, but 2. is fine if you'd like to just get a proof of concept. :) Thanks again for looking into this. Another advantage of unifying pydantic with this serialization to Python is that all the fields that are […]
The new serialization works well. I have added a new base class; here is how it fits in:

> **Warning**
> This is outdated. See updated implementation.

```mermaid
classDiagram
    BaseModel <|-- EthereumTestBaseModel
    EthereumTestBaseModel <|-- CopyValidateModel
    CopyValidateModel <|-- CamelModel
    class EthereumTestBaseModel{
        * serialize()
        * __repr_args__()
    }
```
The […] I tried to fill the newly generated test, and it succeeds:

```console
uv run gentest 0xa41f343be7a150b740e5c939fa4d89f3a2850dbe21715df96b612fc20d1906be tests/paris/test_0xa41f.py
uv run fill tests/paris/test_0xa41f.py
```

```text
=== 6 passed, 6 warnings in 5.21s ===
```

Should we update `to_json`?

```python
return input.model_dump(mode="json", by_alias=True, exclude_none=True)
```

We could replace this with `input.serialize()` so that serialization lives in one place, but I'm not sure if all classes do in fact inherit `EthereumTestBaseModel`. So the code is now duplicated in two places.
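One possible stopgap for the concern above, so `to_json` keeps working for models that haven't migrated yet (a hypothetical sketch, not necessarily what this PR does):

```python
def to_json(input):
    # Prefer the unified path when the model provides it...
    if hasattr(input, "serialize"):
        return input.serialize()
    # ...and fall back to the old behaviour for models that don't (yet) inherit
    # EthereumTestBaseModel.
    return input.model_dump(mode="json", by_alias=True, exclude_none=True)
```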
We should NOT modify pre-state in the tests
This seems quite unusual to me. Ideally, the pre-state tracer should provide the state for all accounts prior to the transaction's execution.
`execution-spec-tests/src/cli/gentest/test_providers.py` (lines 50 to 52 in `cfbfea9`)

```python
if address == self.transaction.sender:
    state_str += f"{pad}nonce={self.transaction.nonce},\n"
else:
```

While the sender's nonce does get updated, that's just one of several side effects resulting from the transaction. Changing the state within the test interferes with the EVM - a big red flag 🟥. We shouldn't be updating the state ourselves; that's the EVM's responsibility. The test should focus solely on comparing the state transition against the specification.
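For contrast, a hedged sketch of what this comment argues for: emit whatever the pre-state tracer reported and leave any nonce bump to the EVM (the `account` variable is assumed for illustration):

```python
# No special case for the sender: the tracer's pre-state is taken at face value,
# and the nonce update is left for the EVM to apply during the state transition.
state_str += f"{pad}nonce={account.nonce},\n"
```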
Thanks @raxhvl, this is looking very clean!
Can you rebase on main? This will pull in some important CI fixes that will help get this merged more quickly :)
A few comments:
- Yes, I think we should replace `model_dump` in `to_json`. You can verify the fixtures locally to ensure that this doesn't break anything by using the `hasher` cli. Fill the tests twice, once with `main` and once with `raxhvl:feat/gentest`, to separate, clean directories using `--output=fixtures-raxhvl`, for example. Use `-n auto` to enable parallelism when filling. Then:

  ```console
  uv run hasher fixtures-main/ > hasher-main.txt
  uv run hasher fixtures-raxhvl/ > hasher-raxhvl.txt
  diff hasher-main.txt hasher-raxhvl.txt
  ```
- Completely agree that we shouldn't modify any pre-state. I'm not sure why this was originally added.
- Side note: Formatting the generated code using black is a great idea, just be aware that `black` will get switched out to `ruff` soonish, see feat(all): add `ruff` as default linter and additional clean up #922. So don't invest too much more time with `black`; switching to `ruff` as it stands in this PR should be very easy.
I'd love to get @marioevz's feedback on this. If you'd like to do point 1. up front and check the fixtures, go for it and request a review from him please. If you'd like to get feedback first, we can request one now 🙂
The code uses the […] cli to format code, as long as […] It would be nice if Mario reviews this once we are done.
Amazing work so far, thanks!
Just a single comment (the other comment about the fork name is not really something that we need to do right now).
All models are now using the same serialization logic. We have two types of custom models: one for `BaseModel` and one for `RootModel`:

```mermaid
classDiagram
    BaseModel <|-- EthereumTestBaseModel
    RootModel <|-- EthereumTestRootModel
    ModelCustomizationsMixin <|-- EthereumTestBaseModel
    ModelCustomizationsMixin <|-- EthereumTestRootModel
    class ModelCustomizationsMixin{
        serialize()
        __repr_args__()
    }
```
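A hedged sketch of how the pieces in this diagram could fit together; the class names follow the diagram, but the field names and demo below are illustrative rather than the actual EEST implementation:

```python
from typing import Optional

from pydantic import BaseModel


class ModelCustomizationsMixin:
    """Serialization/repr customizations shared by both Ethereum test base models."""

    def serialize(self) -> dict:
        # The single place that defines how a model becomes a JSON-ready dict.
        return self.model_dump(mode="json", by_alias=True, exclude_none=True)

    def __repr_args__(self):
        # Same trick as the RecipeBaseModel example earlier: skip None fields so
        # the repr reads like hand-written test code.
        field_names = self.model_dump(mode="json", exclude_none=True).keys()
        return [(name, getattr(self, name)) for name in field_names]


# The mixin is listed first so its __repr_args__ takes precedence over pydantic's
# BaseModel implementation in the MRO; EthereumTestRootModel would pair the same
# mixin with pydantic's RootModel.
class EthereumTestBaseModel(ModelCustomizationsMixin, BaseModel):
    pass


class Env(EthereumTestBaseModel):
    fee_recipient: str
    excess_blob_gas: Optional[int] = None


env = Env(fee_recipient="0x0000000000000000000000000000000000000100")
print(repr(env))        # Env(fee_recipient='0x0000000000000000000000000000000000000100')
print(env.serialize())  # {'fee_recipient': '0x0000000000000000000000000000000000000100'}
```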
I have tested this thoroughly, but please review it with care, as this change impacts EVERY Pydantic model in the codebase. I'm especially unsure of the `EthereumTestRootModel` override.
Really impressive, great addition! Thanks again @raxhvl for taking the time to find such an elegant solution for this!
Hey @raxhvl, regarding this, it looks good to me; I've asked @marioevz if he'd also like to take a look. I diff'd the fixtures locally and got an exact match.
Many thanks for putting so much thought and care into this!
🗒️ Description
This PR ensures that data flows from gentest to the generated tests in a type-safe manner.
The Gentest module generates a test script in three steps: […]

The process of generating context and rendering templates shares the same logic across different kinds of tests. These are created as modules (`cli`, `test_context_provider`, and `source_code_generator`). I am using the same data type when passing data from the provider to the generated source code. This keeps the behavior predictable: Pydantic will raise errors if the RPC call returns unexpected values, even before we run the test. I'm leveraging `__repr__` values to print the data into the source code.

Finally, the code is formatted using the same formatter (Black) used elsewhere. The existing formatting configuration is applied, so any change in configuration should automatically be reflected in the generated files as well. Formatting will become essential when reading contract bytecode, especially with the plans to convert bytecode to Python objects (see #862).
This effort makes writing new kinds of gentests easier, requiring only the creation of new test providers and templates.
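To illustrate the provider-to-template flow described above, here is a hypothetical sketch; the filter name, template text, and `Tx` model are assumptions for illustration, not the actual gentest modules or templates:

```python
from jinja2 import Environment
from pydantic import BaseModel


class Tx(BaseModel):
    nonce: int
    value: int


env = Environment()
# Render models through their __repr__ so the template emits valid Python source.
env.filters["source"] = repr

template = env.from_string("tx = {{ tx | source }}")
print(template.render(tx=Tx(nonce=0, value=1)))
# tx = Tx(nonce=0, value=1)
```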
🔗 Related Issues
closes #859
✅ Checklist
- Ran `mkdocs serve` locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.