Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLV JS: Introduce JS library for TLV #5184

Merged
merged 7 commits into from
Sep 13, 2023

Conversation

wjthieme
Copy link
Contributor

@wjthieme wjthieme commented Sep 1, 2023

resolves #5127

In this PR I have isolated parsing of TLVData into its own file. For ticket #5127, I think this is fine for now as this allows us to very quickly extract TLVData parsing into its own library if we ever need it.

The TLVData class adds two helper functions for reading TLVData.

  • entry(type) returns the first entry where the type matches the supplied type (preferred)
  • entries() returns a map of all entries in the TLVData.

Parsing still succeeds if the tlv data is not the right length. entry just returns null for a partial last entry and entries just excludes the last partial entry.

The TLVData class allows you to specify the length/span of both the type and length part of the tlv which should make parsing tlv data much easier in the future.

I added a couple simple tests for parsing and reading TLVData.

This new class is not actually being used yet anywhere in the code. I plan to create (at least) two follow up PRs that switch out the old implementation for this new one once this PR is merged:

  • getExtensionData in ./token/js/extension/extensionType.ts
  • getExtraAccountMetaAccount in ./token/js/exension/transferHook/state.ts
  • Anything else?

@mergify mergify bot added the community Community contribution label Sep 1, 2023
@wjthieme wjthieme changed the title Added a TLVData parsing helper class token-js: added a helper class for parsing tlv-data Sep 1, 2023
@buffalojoec buffalojoec self-requested a review September 7, 2023 00:59
Copy link
Contributor

@buffalojoec buffalojoec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely heading in the right direction. Thanks again for continuing to contribute here!

I will say I think it's worth matching a little bit more of the behavioral aspects of the Rust libraries, namely:

Parsing still succeeds if the tlv data is not the right length. entry just returns null for a partial last entry and entries just excludes the last partial entry.

In Rust, this would fail to parse, so we probably want this to happen in JS as well. We might even need to introduce some errors similar to the Rust ones.

if tlv_data.len() < value_end {
return Err(ProgramError::InvalidAccountData);
}

I also think we should try to use similar nomenclature for the class and method names, to avoid confusion, ie: TlvState, get_first_bytes, etc. Obviously we don't need the separated mut API in here, though.

Comment on lines 3 to 19
function readTLVNumberSize<T>(
buffer: Buffer,
size: TLVNumberSize,
offset: number,
constructor: (x: number | bigint) => T
): T {
switch (size) {
case 1:
return constructor(buffer.readUInt8(offset));
case 2:
return constructor(buffer.readUInt16LE(offset));
case 4:
return constructor(buffer.readUInt32LE(offset));
case 8:
return constructor(buffer.readBigUInt64LE(offset));
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a neat function here, but our TLV library is using u32 for length and Token2022 is using u16, so we can probably do away with handling u64 and eliminate the need for bigint. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the discriminator could still by 8 bytes, right? This function is currently used for both parsing the discriminator and the length

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but it's not a u64 it's just an array, so it doesn't need to be represented as bigint

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that makes sense. This was only added because you can then supply a BigInt or number in the entry function as well as a Buffer.

token/js/src/extensions/tlvData.ts Outdated Show resolved Hide resolved
token/js/test/unit/tlvData.test.ts Outdated Show resolved Hide resolved
@buffalojoec
Copy link
Contributor

I can see that you're going for a more generalized approach here, and I'm not sure this idea I'm going to suggest will be as slick as I imagine it, but what do you think about instead using an interface and/or super class that together mock the TlvState trait? Ideally then you'd just supply the sizes and get all of the TLV stuff out of the box?

That's what a lot of these tests are doing; just creating objects that implement SplDiscriminate and can then be used as TLV entries.

In JS, I understand we would have to somehow replicate Rust's size_of or the solana_program::borsh::get_instance_packed_len functions, but that could be roped in here.

@wjthieme
Copy link
Contributor Author

wjthieme commented Sep 7, 2023

In Rust, this would fail to parse, so we probably want this to happen in JS as well. We might even need to introduce some errors similar to the Rust ones.

Agreed. Let's keep it similar to the rust implementation. Also think it makes sense to keep the naming the same. i.e. TlvState instead of TLVData, get_first_bytes and get_bytes_with_repetition and maybe get_discriminators which would then replace the existing functions.

I can see that you're going for a more generalized approach here, and I'm not sure this idea I'm going to suggest will be as slick as I imagine it, but what do you think about instead using an interface and/or super class that together mock the TlvState trait? Ideally then you'd just supply the sizes and get all of the TLV stuff out of the box?

I think it would definitely make sense to add some form of spl_discriminate to the js library as well. Mainly because it makes it easier to read and verify the code. i.e. you can just match the discriminator string to the rust code instead of having the take bytes/bigint/number and turning that into raw bytes and then trying to find the matching discriminator.

I think probably the easiest (and fastest) implementation could be a single function that takes a discriminator string and turns it into bytes/bigint/number which you can then throw into the TlvState to grab the entry. Something like:

export function splDiscriminator(key: string, length = 8): Buffer {
    // TODO: make sure this also works in browser without needing to polyfill node crypto/buffer?
    const digest = createHash('sha256').update(key).digest();
    return digest.subarray(0, length);
}

@buffalojoec
Copy link
Contributor

Ok I think we managed to fix the CI!

Let me know when you're ready for another review

@buffalojoec
Copy link
Contributor

I think it would definitely make sense to add some form of spl_discriminate to the js library as well. Mainly because it makes it easier to read and verify the code. i.e. you can just match the discriminator string to the rust code instead of having the take bytes/bigint/number and turning that into raw bytes and then trying to find the matching discriminator.

I think probably the easiest (and fastest) implementation could be a single function that takes a discriminator string and turns it into bytes/bigint/number which you can then throw into the TlvState to grab the entry. Something like:

export function splDiscriminator(key: string, length = 8): Buffer {
    // TODO: make sure this also works in browser without needing to polyfill node crypto/buffer?
    const digest = createHash('sha256').update(key).digest();
    return digest.subarray(0, length);
}

Yes, I like this approach! I think that's perfect. The more I think about it, the more I think a more generalized approach as you've written makes sense, because as I mentioned above we want to actually not do what we did in Rust for Token2022 and TLV, and shared all TLV operations across this library.

If Token2022 wasn't being audited, I'd vote to do this in Rust as well using the Rust TLV lib, and modify the Rust spl-discriminator library to support whatever length discriminator you want. This may happen in the future, so what you've got for any length discriminator makes sense to me!

@buffalojoec
Copy link
Contributor

buffalojoec commented Sep 12, 2023

Can we roll a separate library for this stuff, rather than working it into Token JS?

I think after some of our discussion it makes sense to introduce new ones, and then they can be imported into the metadata library as well as token (#5228).

I think, for now, let's just keep SplDiscriminate and TLV stuff in one JS library - we'll call it @solana/spl-type-length-value, and then import it where we need it. What do you think?

We could prob add a folder to the type-length-value library for js

@wjthieme
Copy link
Contributor Author

Can we roll a separate library for this stuff, rather than working it into Token JS?

I think that makes sense. What location in the repository would you suggest for that?

There is the libraries folder but it currently only contains rust libraries so wouldn't necessarily say it makes sense to put it there.

I'm leaning towards either a type-length-value or js-type-length-value in the root or something along the lines of libraries-js/type-length-value. What do you think?

@buffalojoec
Copy link
Contributor

There is the libraries folder but it currently only contains rust libraries so wouldn't necessarily say it makes sense to put it there.

You're right but that's only because none of those libraries have any JS counterparts 😂

I'd say just stick a js folder into libraries/type-length-value. Cargo won't even look at it

Copy link
Contributor

@buffalojoec buffalojoec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is coming together nicely! I left comments mostly around simplifying the provided discriminator(s). I think if we can strip some of that complexity away, we'll be in good shape.

Thanks again!

.github/workflows/pull-request-libraries.yml Show resolved Hide resolved
libraries/type-length-value/js/package.json Outdated Show resolved Hide resolved
libraries/type-length-value/js/README.md Outdated Show resolved Hide resolved
libraries/type-length-value/js/src/splDiscriminate.ts Outdated Show resolved Hide resolved
libraries/type-length-value/js/src/tlvState.ts Outdated Show resolved Hide resolved
libraries/type-length-value/js/src/tlvState.ts Outdated Show resolved Hide resolved
libraries/type-length-value/js/src/tlvState.ts Outdated Show resolved Hide resolved
libraries/type-length-value/js/src/tlvState.ts Outdated Show resolved Hide resolved
libraries/type-length-value/js/src/tlvState.ts Outdated Show resolved Hide resolved
libraries/type-length-value/js/src/tlvState.ts Outdated Show resolved Hide resolved
@buffalojoec buffalojoec changed the title token-js: added a helper class for parsing tlv-data TLV JS: Introduce JS library for TLV Sep 12, 2023
Copy link
Contributor

@buffalojoec buffalojoec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm. Thanks for putting this together and addressing all feedback!

@buffalojoec buffalojoec merged commit a399681 into solana-labs:master Sep 13, 2023
8 checks passed
@wjthieme wjthieme deleted the wjthieme-patch-5 branch September 13, 2023 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Community contribution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Type-Length-Value JS Lib
2 participants