-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
abi analyzer #540
abi analyzer #540
Conversation
👍 |
ac4641a
to
d113244
Compare
d113244
to
99117df
Compare
b4350c5
to
62a0a5c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good, only one substantial comment around error handling.
Sorry for the late reply; I did do an early pass of the PR, but I see now that it remained unsubmitted. It doesn't change anything though.
-- For evm.Call transactions, we store both the name of the function and | ||
-- the function paramters. | ||
ADD COLUMN evm_fn_name TEXT, | ||
ADD COLUMN evm_fn_params JSONB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With events, there was some ugliness around anonymous fields. Fns can have anonymous params, and errors can presumably have anonymous fields (I'm not 100% sure, not checking right now). Can JSONB handle those or should this and error_params
be a list of (k, v) tuples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah gotcha, looks like this JSONB will always be a list, and the items of the list will be just the values v of the (k, v) tuples I was talking about. So to fully render/understand the params, you need to JOIN the abi. I think that's fair. Can you add a little comment to document the structure of this JSON please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit*: I misread which line this comment was on - yes, for transactions it's a json list of values only. Will add the comment
-- original comment (about how event args are processed)
Hm actually the JSONB will have a list of (k,v) tuples, where k is the param name and v is the param value. It shouldn't need to be JOINed with the abi since we do that during event parsing already.
From my understanding the mapping of event param names to values should be 1:1. Warren's ParseEvent extracts the input values whether they're stored in the topics or in the event data, which should be exhaustive. We fetch param names from the Event Inputs, which should also be exhaustive since go eth will autogenerate names for unnamed args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, after seeing the evm_fn_params
for some nftrout calls, it's much easier to interpret as a list of structured args similar to what we do for events.
[{"name": "_taskIds", "value": [55, 56, 57, 58, 59, 60, 61]
, "evm_type": "uint256[]"}, {"name": "_proof", "value": "", "evm_type": "bytes"}, {"name": "_report", "value": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAABoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACwAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAMgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADtiYWZ5cmVpZ3gyZXA1Mm43NHV5dXhxeTR1ZjVvNHl1NGI3ZmRzcHc1d2wzdmx4ZG1mdm1nZWRjeTdqaQAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADtiYWZ5cmVpY2l5cHVocDdneTJoeXB3bGJmbTd3Z2VsNGdoZmpyb2ZxeGFxem9lYWF4cGd3aW4zNG03ZQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAADtiYWZ5cmVpY29jdnUzaTVkMnVpcjJta3pjaXRvZmlvdjYzd3Bja3hnZ2pzdXUyZWRjdmZsdnJub2J2cQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADtiYWZ5cmVpZGZ3Z3FzNHhnNDRr
cjJqamV0a2w2aDQ1ZHJtcm41M3ZpYzRzZDZtdWxlcGtkenpvdmJ4bQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADtiYWZ5cmVpYWk3cHNqdTZ5NWxhZnJtb3Fzb25jZDQ2dzZ5a2ZqenZ1cG
dsNG9penBhNG5uc241aTI2dQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADtiYWZ5cmVpaGJkazM2cmQ0emx6b3I0NmJ1Y3JxZXpnbGZsY3piaTIyaXBheWd5NXZuaTZtcXVxZncyNAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADtiYWZ5cmVpYmM1dGxzM2R6cWprM2RiN3lsbmZ6bzNldHY3d241c2NsMmM3NXlpc2xtYnZxNGZ3dDI0aQAAAAAA", "evm_type": "bytes"}]
62a0a5c
to
45fad28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I realized there's another gotcha around events that might in require a little more performance testing/tweaking. Sorry we didn't think of it before. But in general, getting there :)
analyzer/queries/queries.go
Outdated
FROM abi_contracts | ||
JOIN chain.runtime_transactions as txs ON | ||
txs.runtime = abi_contracts.runtime AND | ||
txs.to = abi_contracts.addr AND | ||
txs.method = 'evm.Call' | ||
JOIN chain.runtime_events as evs ON | ||
evs.runtime = abi_contracts.runtime AND | ||
evs.tx_hash = txs.tx_hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on today's slack conversation with Warren, we actually want to avoid joining on the txs because the contract that was the target of tx is not necessarily also the contract that emitted the event. The contract that emitted the event is in the event's body.address
. So something like
FROM abi_contracts | |
JOIN chain.runtime_transactions as txs ON | |
txs.runtime = abi_contracts.runtime AND | |
txs.to = abi_contracts.addr AND | |
txs.method = 'evm.Call' | |
JOIN chain.runtime_events as evs ON | |
evs.runtime = abi_contracts.runtime AND | |
evs.tx_hash = txs.tx_hash | |
FROM abi_contracts | |
JOIN chain.runtime_events as evs ON | |
evs.type = 'evm.log' AND | |
evs.runtime = abi_contracts.runtime AND | |
decode(evs.body->>'address', 'base64') = eth_preimage(abi_contracts.contract_address) |
That might require an index on body.address (IDK if it's needed, and/or if it would help), or maybe even a generated column for decode(evs.body->>'address', 'base64')
and then an index on top of that. See how it goes with the simplest approach first :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm it's slow, even with a generated column. Seems like the eth_preimage
fn is the bottleneck - will try to just join over address_preimages instead
oasisindexer=*> explain analyze
WITH abi_contracts AS (
SELECT
runtime,
contract_address AS addr,
abi
FROM chain.evm_contracts
WHERE
runtime = 'emerald' AND
abi IS NOT NULL
)
SELECT
abi_contracts.addr,
abi_contracts.abi,
evs.round,
evs.tx_index,
evs.body
FROM abi_contracts
JOIN chain.runtime_events as evs ON
evs.type = 'evm.log' AND
evs.runtime = abi_contracts.runtime AND
evm_contract_address_eth = eth_preimage(abi_contracts.addr)
LIMIT 1000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..181689.03 rows=1000 width=982) (actual time=42717.383..61984.504 rows=197 loops=1)
-> Nested Loop (cost=0.00..2232049.74 rows=12285 width=982) (actual time=42717.381..61984.447 rows=197 loops=1)
Join Filter: (evs.evm_contract_address_eth = eth_preimage(evm_contracts.contract_address))
Rows Removed by Join Filter: 326507
-> Seq Scan on runtime_events evs (cost=0.00..1580270.46 rows=614246 width=215) (actual time=15.804..55033.975 rows=46672 loops=1)
Filter: ((runtime = 'emerald'::runtime) AND (type = 'evm.log'::text))
Rows Removed by Filter: 20543893
-> Materialize (cost=0.00..678.53 rows=4 width=807) (actual time=0.000..0.001 rows=7 loops=46672)
-> Seq Scan on evm_contracts (cost=0.00..678.51 rows=4 width=807) (actual time=1.088..2.911 rows=7 loops=1)
Filter: ((abi IS NOT NULL) AND (runtime = 'emerald'::runtime))
Rows Removed by Filter: 7988
Planning Time: 3.994 ms
Execution Time: 61984.703 ms
(13 rows)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 is simple to use but not very efficient. That said, I thought it wouldn't matter here because we have so few verified contracts. But the query plan doesn't choose to go contract by contract, trying to find matching events for each contract :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to join on chain.address_preimages
. Also added an index on chain.runtime_events (runtime, type)
, which makes the query performant enough. (note that the query below is for 1000; in practice we'll use a limit of <100))
Edit*: With a limit of 100 it takes ~80ms, though some of the benefit may be due to caching
oasisindexer=*> explain analyze
WITH abi_contracts AS (
SELECT
runtime,
contract_address AS addr,
abi
FROM chain.evm_contracts
WHERE
runtime = 'emerald' AND
abi IS NOT NULL
)
SELECT
abi_contracts.addr,
abi_contracts.abi,
evs.round,
evs.tx_index,
evs.body
FROM abi_contracts
join chain.address_preimages as preimages on abi_contracts.addr = preimages.address JOIN chain.runtime_events as evs ON
evs.type = 'evm.log' AND
evs.runtime = abi_contracts.runtime AND
decode(body->>'address', 'base64') = preimages.address_data
LIMIT 1000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10154.37..1170949.61 rows=114 width=982) (actual time=257.164..2266.097 rows=197 loops=1)
-> Gather (cost=10154.37..1170949.61 rows=114 width=982) (actual time=257.162..2266.052 rows=197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=9154.37..1169938.21 rows=48 width=982) (actual time=1015.169..2243.789 rows=66 loops=3)
Hash Cond: (decode((evs.body ->> 'address'::text), 'base64'::text) = preimages.address_data)
-> Parallel Bitmap Heap Scan on runtime_events evs (cost=8442.09..1167623.43 rows=256322 width=183) (actual time=5.136..2204.427 rows=15564 loops=3)
Recheck Cond: ((runtime = 'emerald'::runtime) AND (type = 'evm.log'::text))
Heap Blocks: exact=7413
-> Bitmap Index Scan on ix_runtime_events_type (cost=0.00..8288.29 rows=615173 width=0) (actual time=9.769..9.769 rows=46692 loops=1)
Index Cond: ((runtime = 'emerald'::runtime) AND (type = 'evm.log'::text))
-> Hash (cost=712.23..712.23 rows=4 width=828) (actual time=5.272..5.274 rows=7 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 14kB
-> Nested Loop (cost=0.41..712.23 rows=4 width=828) (actual time=1.554..5.236 rows=7 loops=3)
-> Seq Scan on evm_contracts (cost=0.00..678.51 rows=4 width=807) (actual time=1.501..5.043 rows=7 loops=3)
Filter: ((abi IS NOT NULL) AND (runtime = 'emerald'::runtime))
Rows Removed by Filter: 7988
-> Index Scan using address_preimages_pkey on address_preimages preimages (cost=0.41..8.43 rows=1 width=68) (actual time=0.024..0.024 rows=1 loops=21)
Index Cond: ((address)::text = (evm_contracts.contract_address)::text)
Planning Time: 1.216 ms
Execution Time: 2266.211 ms
(21 rows)
Side note: The generated column for the event address ended up being not needed, but will keep in mind for future performance testing.
1739132
to
d28ba58
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
It's unfortunate that our e2e (or other) tests don't currently cover sourcify, and so cannot test this. I assume you tested locally. Luck favors the brave 🙄
d28ba58
to
a116f77
Compare
update db schema initial version of abi analyzer don't parse result fix import cycle proper argument unwrapping rename migration nit update db schema address comments address more comments nit address comments address comments add startup logic testing fixes rename migration nits nits enable emerald abi in regression tests (no-op)
c6824be
to
c8ce02e
Compare
Part of #872
Adds an analyzer to re-parse event body, tx data, and tx revert reason using known abis.
Note: does not require a reindex