Bubblegum Update Metadata Version 2 #135

Closed

Conversation

Contributor

@danenbm commented Oct 23, 2023

Due to some issues with this approach, we are now most likely going with Version 1.

Summary

This PR uses a global update_metadata_seq number for when metadata is updated by either mint_v1 or update_metadata, and uses a download_metadata_seq for the background task that prevents older-downloaded metadata from overwriting newer-downloaded metadata. I pulled out common code from mint_v1 and update_metadata and moved it to db.rs.

The slot_updated checks were removed from most of the Bubblegum program transformers. However, each Bubblegum program transformer makes a call to asset_should_be_updated, which checks whether the asset is being updated by Token Metadata account-based program transformers, or if the global update_metadata_seq is higher than the current sequence number, indicating that an update_metadata occurred later than the current instruction.
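For illustration, a minimal sketch of that gating check, assuming the Sea-ORM `asset` entity from this repo and the new `update_metadata_seq` column; the names and exact conditions are assumptions, not the PR's code:

```rust
use digital_asset_types::dao::asset; // Sea-ORM entity (path assumed)
use sea_orm::{ConnectionTrait, DbErr, EntityTrait};

// Hypothetical sketch of the gating check described above.
async fn asset_should_be_updated<T>(
    txn: &T,
    id: Vec<u8>,
    incoming_seq: i64,
) -> Result<bool, DbErr>
where
    T: ConnectionTrait,
{
    if let Some(asset) = asset::Entity::find_by_id(id).one(txn).await? {
        // A missing seq is read here as "asset was decompressed and is now
        // indexed by the account-based Token Metadata transformers" (assumption).
        if asset.seq.is_none() {
            return Ok(false);
        }
        // A higher global update_metadata_seq means a later update_metadata
        // already wrote these fields, so this instruction is stale.
        if asset.update_metadata_seq.unwrap_or(0) > incoming_seq {
            return Ok(false);
        }
    }
    Ok(true)
}
```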

Comparison of Version 1 and Version 2

With this Version 2 PR, both mint_v1 and update_metadata update everything, so it uses fewer sequence numbers than Version 1 to protect the fields that can be changed by either mint_v1 or update_metadata.

It is more future-proof than Version 1, because if we added more fields that could be changed by update_metadata, they would already be protected by the Version 2 strategy and thus would not require any indexing changes.

Numerical comparison of Version 1 and Version 2

| | Version 1 | Version 2 |
| --- | --- | --- |
| New seq numbers to check/update | 4 | 2 |
| New queries to asset to check before updating | 1-2 depending on ix | 1 |
| Total upserts in mintV1 (today main has 10) | 12 | 11 |
| Total upserts in update_metadata | 6 | 11 |

Note that #142 uses a transaction to combine upserts for the asset table.
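For reference, a minimal sketch of that pattern, assuming Sea-ORM's `TransactionTrait` (the individual upserts are elided):

```rust
use sea_orm::{DatabaseConnection, DbErr, TransactionTrait};

// Hypothetical shape of the #142 approach: run the related asset-table
// upserts inside a single database transaction so they land atomically.
async fn upsert_asset_atomically(db: &DatabaseConnection) -> Result<(), DbErr> {
    let txn = db.begin().await?;
    // ... the individual asset upserts, each executed against &txn ...
    txn.commit().await
}
```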

Sea-ORM autogeneration

As mentioned offline, there were some slight discrepancies between the existing generated Sea-ORM code and the migrations, so I updated the generated code to match the migrations.

Testing

New instruction

  • Ran a createTree, mintV1, verifyCreator, unverifyCreator, and updateMetadata, and observed after the updateMetadata that the name, symbol, and URI had been changed.
  • Deleted the database, then ran the exact same 5 transactions in reverse and observed the exact same final result in the database.
  • Again, ran a createTree, mintV1, verifyCreator, unverifyCreator, and updateMetadata, but this time one by one, and checked detailed database state after each instruction.
    • Verified the second creator was set to verified after verifyCreator, which matched the creator passed to the instruction.
    • Verified the correct creator was set to unverified after unverifyCreator, which matched the creator passed to the instruction.
    • Verified that the name, symbol, URI, and creator array changed after updateMetadata, to the values specified in the updateArgs sent into the instruction.

Regression testing

Note: this testing was done with the code in #142, as it stacks on top of this PR.

  • Tested following instruction sequences from Bubblegum and verified database was in correct state afterwards:
    • create_tree, mint_v1, transfer, burn
    • create_tree, mint_v1, redeem, decompress_v1
    • create_tree, mint_v1, redeem, cancel_redeem, redeem, decompress_v1
    • create_tree, mint_v1, transfer, transfer
    • create_tree, mint_v1, delegate, transfer
    • create_tree, mint_v1, verify_creator
    • create_tree, mint_v1, verify_collection
    • create_tree, mint_v1, verify_collection, unverify_collection
    • create_tree, mint_v1, set_and_verify_collection
    • create_tree, mint_to_collection_v1, unverify_collection
  • Deleted and rebuilt the database, ran the instruction sequences from Bubblegum in reverse order (i.e. burn, transfer, mint_v1, create_tree), and verified the database ended up in the same state as when they were run forwards.
  • Also ran one of the sequences in a random order with duplicated transactions, and still got the expected results.

@danenbm changed the title from Danenbm/update metadata parsing 2 to Bubblegum Update Metadata Version 2 on Oct 23, 2023
@danenbm marked this pull request as ready for review on October 23, 2023 11:09
@blockiosaurus left a comment

This looks good based on my very little understanding of DAS

@danenbm force-pushed the danenbm/update-metadata-parsing-2 branch from bc7c374 to d817795 on November 27, 2023 21:17
DatabaseBackend::Postgres,
"
ALTER TABLE asset_creators
RENAME COLUMN seq to verified_seq;
Collaborator

Have you considered what happens when you rename a column on a table with over 275M rows? That's the size of the mainnet creators table. I've never renamed a SQL column, but even adding indexes needs to be done carefully at this point. Renaming a column might not be feasible in production without spinning up a second table and copying the data – which is no fun.

Collaborator

I really don't think renaming a column has anything to do with the size of the table. In Postgres the names of columns are stored in the catalog, and this change would just be a single-row update in the catalog; it shouldn't require a whole table rewrite.

Collaborator

If you want to make sure to do this update without having to update all the readers at the same time, you could create a view that has the old name. But since this change only impacts ingesters, I think you can just shut down ingesters, run migrations, and start up again.
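For illustration, a hypothetical sketch of the view option in the same Sea-ORM migration style used above; the view name is invented, and readers would have to query the view rather than the table:

```rust
use sea_orm::{ConnectionTrait, DatabaseBackend, DbErr, Statement};
use sea_orm_migration::prelude::*;

// Hypothetical fragment for a migration's `up()`: expose the old column
// name through a view so readers could migrate at their own pace.
async fn create_compat_view(manager: &SchemaManager<'_>) -> Result<(), DbErr> {
    manager
        .get_connection()
        .execute(Statement::from_string(
            DatabaseBackend::Postgres,
            "
            CREATE VIEW asset_creators_compat AS
            SELECT ac.*, ac.verified_seq AS seq
            FROM asset_creators ac;
            "
            .to_string(),
        ))
        .await?;
    Ok(())
}
```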

Contributor Author

Is changing the name OK then, given the way you all run migrations?

Collaborator

@linuskendall is right, it should be fine

// Upsert into `asset` table.

// Set base mint info.
let tree_id = bundle.keys.get(5).unwrap().0.to_vec();
Collaborator

This value should be parsed from blockbuster instead of via an array index

Contributor Author

@danenbm commented Dec 4, 2023

Will incorporate in #134 after the above is merged.
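For illustration, a sketch of the suggested change, assuming blockbuster's `BubblegumInstruction` exposes the parsed `ChangeLogEventV1` via `tree_update` (treat the exact field names as assumptions):

```rust
use blockbuster::programs::bubblegum::BubblegumInstruction;

// Hypothetical: derive the tree id from the change log event that
// blockbuster already parsed, instead of hard-coding an account index
// like `bundle.keys.get(5)`.
fn tree_id_from_parsed(parsing_result: &BubblegumInstruction) -> Option<Vec<u8>> {
    parsing_result
        .tree_update
        .as_ref()
        .map(|cl| cl.id.to_bytes().to_vec())
}
```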

.await
.map_err(|db_err| IngesterError::AssetIndexError(db_err.to_string()))?;
// Set base mint info.
let tree_id = bundle.keys.get(3).unwrap().0.to_vec();
Collaborator

Can this come from blockbuster instead?

Contributor Author

Yes, I can do that for this and the other similar case you pointed out.

Contributor Author

Will incorporate in #134 after the above is merged.

.to_owned(),
)
.build(DbBackend::Postgres);
query.sql = format!(
Collaborator

Why is the method called "unprotected" if this case exists?

Contributor Author

Misnomer. I guess it should have been called "partially protected" or something. The protection on the verified flag was needed for verify/unverify, but the positions, share amounts, etc. would be blanket-protected by update_metadata_seq.

where
T: ConnectionTrait + TransactionTrait,
{
if let Some(asset) = asset::Entity::find_by_id(id).one(txn).await? {
Collaborator

I don't think this approach works. It can lead to race conditions.

Say we go to seq 1 -> 2 with our update

  1. Fetch asset, seq says 1
  2. Determine it can be updated because 2 > 1
  3. Concurrent process upserts new data with seq 3
  4. We then revert from seq 3 -> 2

This function can act as a filter, but not as a true safety mechanism, and it should be clearly labelled as such. Therefore no method it is called from should be named "unprotected".

To give an example as a filter:
If the current seq = 5 and we want to update to 4, then we know we can eagerly discard the txn.

Collaborator

So this is possible to do since Postgres transactions can be ACID, but it needs some changes to the handling. My initial thoughts would be that we have to:

  1. Use transactions (this is already there in a separate PR)
  2. Use SELECT FOR UPDATE
  3. Use the serializable transaction isolation level (I'd have to think; maybe we don't need one quite as strict?)

This would ensure that step 4 would fail, because a concurrent process has updated a row locked by the SELECT FOR UPDATE clause.
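A minimal sketch of those three points, assuming a Sea-ORM version that provides `begin_with_config` and `lock_exclusive` (the latter renders as `SELECT ... FOR UPDATE`); the entity path and seq fields are the same assumptions as in the earlier sketch:

```rust
use digital_asset_types::dao::asset; // Sea-ORM entity (path assumed)
use sea_orm::{
    DatabaseConnection, DbErr, EntityTrait, IsolationLevel, QuerySelect, TransactionTrait,
};

// Hypothetical guarded update: the row lock plus serializable isolation
// makes the check-then-write atomic, so step 4 of the race above fails
// instead of silently reverting newer data.
async fn guarded_update(
    db: &DatabaseConnection,
    id: Vec<u8>,
    incoming_seq: i64,
) -> Result<(), DbErr> {
    let txn = db
        .begin_with_config(Some(IsolationLevel::Serializable), None)
        .await?;
    // Renders as SELECT ... FOR UPDATE: a concurrent ingester blocks here
    // until this transaction commits or rolls back.
    let current = asset::Entity::find_by_id(id)
        .lock_exclusive()
        .one(&txn)
        .await?;
    if current.and_then(|a| a.seq).unwrap_or(0) < incoming_seq {
        // ... perform the upserts against &txn ...
    }
    txn.commit().await
}
```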

Contributor Author

I agree the current approach is unsound with concurrent ingester processes. I forgot about this, as my Docker testing only runs one ingester. I felt there was something too easy about Version 2, but couldn't quite see it as I tested and thought about it, which is why I had kept Version 1 around.

The choices are:

  1. Go back to something closer to the Version 1 approach, or
  2. Look into further use of Postgres transactions to resolve it.

After offline discussion with @linuskendall and @NicolasPennie we are planning on doing something more like Version 1, but trying to incorporate transactions to combine some of the upserts in the style of #142. I just have to make sure I understand/implement the error handling correctly based on the comments from @linuskendall.

Contributor Author

Went back to the Version 1 approach in #134 and incorporated all learnings from this Version 2 PR.

Collaborator

@linuskendall left a comment

In principle it looks OK, but Nick's concern about the serialisability of the changes requires addressing. This can be addressed via a DB transaction and setting the appropriate isolation levels.

pub struct Model {
pub id: i64,
pub asset_id: Vec<u8>,
pub group_key: String,
pub group_value: Option<String>,
pub seq: Option<i64>,
pub slot_updated: Option<i64>,
pub verified: Option<bool>,
pub verified: bool,
Collaborator

Do we need a migration to set the default value for this? Since this used to be optional, we might have null values in the db.

Contributor Author

@danenbm commented Dec 2, 2023

If you look at m20230720_120101_add_asset_grouping_verified.rs, it was not nullable from that point. All I did for this was regenerate the Sea-ORM objects based on the existing migrations. Do we need a special migration to check whether people have it as Optional in their databases? My guess is that it wasn't ever nullable and the Sea-ORM code was somehow not updated?

@@ -22,7 +22,7 @@ pub struct Model {
pub seq: i64,
pub level: i64,
pub hash: Vec<u8>,
pub created_at: Option<DateTime>,
pub created_at: DateTime,
Collaborator

Same as above: does this need a migration, since it has changed from an Option?

Contributor Author

Yeah, I have the same response. If you look at m20230919_072154_cl_audits.rs, it's not nullable in the original migration that added the table.


@@ -110,24 +111,29 @@ impl BgTask for DownloadMetadataTask {
id: Unchanged(download_metadata.asset_data_id.clone()),
metadata: Set(body),
reindex: Set(Some(false)),
download_metadata_seq: Set(Some(download_metadata.seq)),
Collaborator

I am not sure about this addition. I can see that the download_metadata URI should be protected by a seq since that is on chain. But the offchain metadata can change at any time, so to me the system should always download the latest offchain metadata irrespective of the seq?

So as long as we know that the URI in the asset_data table is only ever updated to the latest, we shouldn't need to protect the downloaded metadata itself?

If the URI is updated, then indeed we must redownload the metadata, but I also don't see harm in scheduling additional metadata downloads in between. Not having an additional sequence number to check might be quite good also.

Adding this sequence number has limited value and causes extra DB queries/complexity.

Collaborator

If we want to limit the number of downloads, we could either:

a) keep track of URI changes and only trigger a redownload if the URI hasn't been downloaded yet, or
b) make the download metadata task smarter so it understands whether the offchain data has changed (i.e. do some kind of HEAD check on the HTTP headers to see if the data has changed).
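Option (b) could look like this minimal sketch, assuming the reqwest crate and a host that sends validators such as ETag; many metadata hosts won't, in which case the safe fallback is to re-download:

```rust
use reqwest::{header::ETAG, Client};

// Hypothetical HEAD-based change check: compare the current ETag with the
// one stored from the last successful download.
async fn offchain_changed(
    client: &Client,
    uri: &str,
    last_etag: Option<&str>,
) -> reqwest::Result<bool> {
    let resp = client.head(uri).send().await?;
    let etag = resp.headers().get(ETAG).and_then(|v| v.to_str().ok());
    Ok(match (etag, last_etag) {
        // Same validator on both sides: content very likely unchanged.
        (Some(new), Some(old)) => new != old,
        // No validator available: assume changed and re-download.
        _ => true,
    })
}
```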

Contributor Author

@danenbm commented Dec 2, 2023

> I can see that the download_metadata URI should be protected by a seq since that is on chain. But the offchain metadata can change at any time, so to me the system should always download the latest offchain metadata irrespective of the seq?

Even though the URI in the asset_data table is protected there and thus only ever updated to the latest URI, there is not an easy way to cancel all existing tasks for an asset, so I think an older task could overwrite newer downloaded data.

Example:

  1. mint_v1, has uri1, seq 1, kicks off task(uri1, seq=1).
  2. update_metadata, has uri2, seq 2, kicks off task(uri2, seq=2).
  3. Task2 finishes and sets asset_data.metadata to the data from uri2.
  4. Task1 finishes and incorrectly sets asset_data.metadata to the data from uri1.

However, using a download_metadata_seq, step 4 would be different: Task1 would finish and NOT update asset_data.metadata to the data from uri1.
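For illustration, the guard that makes step 4 a no-op can ride on the upsert itself, in the same raw-SQL style this codebase already uses for seq checks; the column and helper names here are assumptions:

```rust
use digital_asset_types::dao::asset_data; // Sea-ORM entity (path assumed)
use sea_orm::{
    sea_query::OnConflict, ConnectionTrait, DbBackend, DbErr, EntityTrait, QueryTrait,
};

// Hypothetical: the background task's upsert only lands if no newer
// download (higher download_metadata_seq) has already been recorded.
async fn finish_download_task<T: ConnectionTrait>(
    txn: &T,
    model: asset_data::ActiveModel,
) -> Result<(), DbErr> {
    let mut query = asset_data::Entity::insert(model)
        .on_conflict(
            OnConflict::column(asset_data::Column::Id)
                .update_columns([
                    asset_data::Column::Metadata,
                    asset_data::Column::Reindex,
                    asset_data::Column::DownloadMetadataSeq,
                ])
                .to_owned(),
        )
        .build(DbBackend::Postgres);
    // Older tasks (lower seq) fall through without writing anything.
    query.sql = format!(
        "{} WHERE excluded.download_metadata_seq >= asset_data.download_metadata_seq",
        query.sql
    );
    txn.execute(query).await?;
    Ok(())
}
```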

Contributor Author

@danenbm commented Dec 4, 2023

Due to some issues with this approach, we are now most likely going with Version 1.

Contributor Author

@danenbm commented Dec 5, 2023

Version 2 asset table checks can cause a race condition with concurrent ingester processes. Closing this PR in favor of Version 1.

@danenbm closed this on Dec 5, 2023