[spec] v2.2 - action chaining #18

Merged
merged 11 commits into from
Aug 5, 2024
107 changes: 105 additions & 2 deletions packages/actions-spec/README.md
@@ -160,7 +160,16 @@ A `GET` response with an HTTP `OK` JSON response should include a body payload
that follows the interface specification:

```ts filename="ActionGetResponse"
export interface ActionGetResponse {
export type ActionType = "action" | "completed";

export type ActionGetResponse = Action<"action">;

/**
* A single Solana Action
*/
export interface Action<T extends ActionType = "action"> {
/** @default `action` */
type?: T;
/** image url that represents the source of the action request */
icon: string;
/** describes the source of the action request */
@@ -171,6 +180,7 @@ export interface ActionGetResponse {
label: string;
/** UI state for the button being rendered to the user */
disabled?: boolean;
/** */
links?: {
/** list of related Actions a user could perform */
actions: LinkedAction[];
@@ -180,6 +190,14 @@ export interface ActionGetResponse {
}
```

- `type` - The type of action being given to the user. Defaults to `action`. The
initial `ActionGetResponse` is required to have a type of `action`.

- `action` - Standard action that will allow the user to interact with any of
the `LinkedActions`
- `completed` - Used to declare the "completed" state within action chaining.
After the

- `icon` - The value must be an absolute HTTP or HTTPS URL of an icon image. The
file must be an SVG, PNG, or WebP image, or the client/wallet must reject it
as **malformed**.
@@ -441,11 +459,21 @@ A `POST` response with an HTTP `OK` JSON response should include a body payload
of:

```ts filename="ActionPostResponse"
export interface ActionPostResponse {
/**
* Response body payload returned from the Action POST Request
*/
export interface ActionPostResponse<T extends ActionType = ActionType> {
/** base64 encoded serialized transaction */
transaction: string;
/** describes the nature of the transaction */
message?: string;
links?: {
/**
* the next action in a successive chain of actions to be obtained and/or rendered
* to the user after the previous was successful
*/
next: string | NextAction<T | "action">;
};
}
Contributor

The ActionPostResponse should be as follows, allowing for form-like use-cases where tx approval is not required for each action and maybe only the last action in the chain requires approval.

export interface ActionPostResponse<T extends ActionType = ActionType> {
    /**
     * If the current action requires a transaction to be signed then this should be a base64 encoded serialized transaction
     * If the current action does not require a transaction to be signed, then this should be null
     */
  transaction: string | null;
  /** describes the nature of the transaction */
  message?: string;
  links?: {
    /**
     * The next action in a successive chain of actions to be obtained after
     * the previous was successful.
     */
    next: NextActionLink;
  };
}

Form-like is needed. From a UX point of view, you don't want to sign more than once. So ideally, at every new screen, the developer gets new information and packages a final instruction for the user to sign. If you want to make users sign every screen, then you can too.

I would even make every chain/screen transaction optional. Maybe you sign on the first screen and the last screen is an optional survey. Maybe there is no transaction at all, simply collecting info, e.g. pubkeys

Contributor

yes, that's why the transaction can be either string or null, making the transaction optional.

Member Author

While this is a useful thing to support form-like experiences, actions are still currently required to return a transaction, since the user is required to sign a transaction.

Implementing something like this is planned in general, but not within this spec change / proposal. When sign message support is added, maybe in there?

@thearyanag maybe you could post a more complete sRFC with details about a proposal to support some sort of "non signing action" support? My gut is that something like this should not be allowed on the root action, so the user is still required to "initiate a session" within the blink by signing a transaction (or eventually a message), then they could complete form-like experiences

Why is the user required to sign transaction?

The Blink will become inconvenient if you have to sign more than once. Unless signing multiple times is a feature of that blink, I believe requiring more than one signature is a UX misstep.

Fill form, sign, fill new form, sign again. Devs are stuck with partial signs to handle. Find the Blink again, load the partial properly.

At which point you might as well redirect the user to an external site to provide a streamlined experience.

Member Author

I agree with you on form based things. I agree that the idea of the user not being always required to sign something in every single action is a good one. It can unlock some interesting new experiences within blinks.

I'm saying it is out of scope for this specific PR/spec change. This spec change has already gone through much discussion on the Solana forum for several weeks now and is effectively finalized.

Hence my original request:

post a more complete sRFC with details about a proposal to support some sort of "non signing action" support

Member Author

Within the upcoming spec proposal for sign message, we are planning to suggest a session token that can be passed back and forth between the action api server and the blink client. Think JWT passed between server and client, but within a blink.

Something like this would also make the idea of non-signing actions even more powerful and useful. The user initiates an authenticated session (by signing a message via their wallet in the first action in a chain), the api server verifies the signature in their backend, the api server provides their token for the client interactions.

Off the top of my head:
Going with a pure "this is a form and does not require the user to sign anything" approach you seem to be suggesting is easily abused since a bot can spam the action api server to submit the form data. But never truly connect or verify a wallet. Ripe for abuse.

Hence my suggestion of:

to be a different sRFC and discussion. It would need additional discussion and thought

Fair. I'm late to the discussion and this is a needed step anyhow.

Contributor

While this is a useful thing to support form-like experiences, actions are still currently required to return a transaction, since the user is required to sign a transaction.

Implementing something like this is planned in general, but not within this spec change / proposal. When sign message support is added, maybe in there?

@thearyanag maybe you could post a more complete sRFC with details about a proposal to support some sort of "non signing action" support? My gut is that something like this should not be allowed on the root action, so the user is still required to "initiate a session" within the blink by signing a transaction (or eventually a message), then they could complete form-like experiences

agreed, I'll post a more detailed sRFC

```

@@ -459,6 +487,12 @@ export interface ActionPostResponse {
the user. For example, this might be the name of an item being purchased, a
discount applied to a purchase, or a thank you note.

- `links.next` - An optional value used to "chain" multiple Actions together in
series. After the included `transaction` has been confirmed on-chain, this
`links.next` can be used to fetch the next action (via a callback url) or
display the provided `NextAction`. See [Action Chaining](#action-chaining) for
more details.

- The client and application should allow additional fields in the request body
and response body, which may be added by future specification updates.

@@ -498,6 +532,75 @@ must do so only if a signature for the `account` in the request is expected.
If any signature except a signature for the `account` in the request is
expected, the client must reject the transaction as **malicious**.

#### Action Chaining

Solana Actions can be "chained" together in a successive series. After an
Action's transaction is confirmed on-chain, the next action can be obtained and
presented to the user.

Action chaining allows developers to build more complex and dynamic experiences
within blinks, including:

- providing multiple transactions (and eventually sign message) to a user
- customized action metadata based on the user's wallet address
- refreshing the blink metadata after a successful transaction
- receiving an API callback with the transaction signature for additional
Collaborator

Also proposing to explicitly highlight the behaviour on tx failed during confirmation - my proposal is to always send the tx signature to API regardless of confirmation status to enable basic error handling in all cases

  • tx confirmed
  • tx failed
  • tx timed out

Thoughts?

Member Author

If the transaction failed to confirm, I would assume the action could be called again via the blink UI to attempt to repeat it. Is this not the case?

Having the ability to achieve a callback based on each of these solana transaction lifecycle events could be useful though. But it does feel like scope creep...

  • How do you propose the blink-client denotes each of these lifecycle statuses?
  • What should the UI do after getting a response from the callback?
  • Should only a single callback url for all these lifecycle events be supported? Or should the developer be able to define specific callback urls for each lifecycle event?

Collaborator

@tsmbl, Jul 31, 2024

I think we generally have 2 options

a) As you are proposing, assume the action should be called again via the blink UI to attempt to repeat it. I think it will work fine, but it can be a bit annoying if you have a long chain and want to repeat only the last action.
b) If next action with POST type is defined, always pass the tx signature to backend regardless of confirmation result - backend can check signature status and make a decision about the next step, e.g. retrying last action, rather than to repeat the entire chain from the beginning

So, blink client in (b) just needs to always send tx signature to callback, backend then will return the next state that should be shown to user

How do you propose the blink-client denotes each of these lifecycle statuses?

No need for any special behaviour, just always pass tx signature to callback after tx confirmation stage finished

What should the UI do after the getting a response from the callback

Just render it, as it would do it for the any other next action

Should only a single callback url for all these lifecycle events be supported? Or should the developer be able to define specific callback urls for each lifecycle event?

I think single would be already good for the start - developers can just check signature status on their backend to decide what should be done next

So (b) just gives the developer a few more options to handle potential tx errors; we just need to always send the tx sig to the callback. Does it still feel like scope creep to you?

Member Author

a) As you proposing, assume the action should be called again via the blink UI to attempt to repeat it. I think it will work fine, but can be a bit annoying if you have a long chain and want to repeat only last action.

I would not expect the blink-client to attempt to repeat the entire action chain from the start if the current action failed to confirm. That would be foolish lol. I would only expect the client to reattempt the current action.

No need for any special behaviour, just always pass tx signature to callback after tx confirmation stage finished

there would have to be some way to denote each of the lifecycle statuses you suggested though: tx confirmed, tx failed, and tx timed out.

A transaction can fail for many reasons on the client side. "tx failed" would be a generic failure, whereas "tx timeout" would be a specific one, like an expired blockhash. It could also fail preflight checks performed by the blink-client and/or wallet.

Many causes of a transaction failing would result in no tx id existing on chain, and therefore would not give the action api much useful info to actually handle the various error states. There would not always be a tx id to pass along for processing an on-chain errored transaction.

Does it still feel like scope creep to you?
It does, but if we opt for a simple design that we can flesh out quickly, I'm okay with it being in this spec update. It will for sure be useful for developers!

Collaborator

Long story short: I believe it's good enough to go with just reattempting the last action, we can improve error handling later as a separate sRFC.

I would not expect the blink-client to attempt to repeat the entire action chain from the start if the current action failed to confirm. That would be foolish lol. I would only expect the client to reattempt the current action.

Ahh, ok, then I've misunderstood. Reattempting the last action instead of the entire chain is good for the start

there would have to be some way to denote each of the lifecycle status you suggested ones though: tx confirmed, tx failed, and tx timed out.

In the future could still be a single callback, but with different payloads. Generally errors should have an error code + error message, that is useful info to pass to API together with the signature if it's available. E.g. see phantom deeplink callback https://docs.phantom.app/phantom-deeplinks/provider-methods/signandsendtransaction#reject. Not advocating to use the same structures, but this just gives an idea of data we could include.

Collaborator

To summarize, imo gtg with reattempting the last action

Let's skip error handling callback improvements and make them as a separate spec update, seems like we need broader discussion here

validation and logic on the Action API server

To chain multiple actions together, include a `links.next` value in the
`ActionPostResponse` payload of either:

- `string` - Same origin callback url to receive a `POST` request with the
`signature` and user's `account` in the body. This callback url should respond
with a `NextAction`.
- `NextAction` - The metadata for the next action to present to the user
immediately after the transaction has confirmed. No callback will be made.
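
As a rough illustration of the two forms, the payloads below are a sketch only. The `NextActionSketch` and `PostResponseSketch` types are trimmed local stand-ins for the spec's `NextAction` and `ActionPostResponse` (the `title` and `description` fields are taken from the fuller `Action` interface), and the callback path, icon URL, placeholder transaction string, and text values are all made up for the example.

```ts
// Trimmed, local stand-ins for the spec types (illustration only).
type ActionType = "action" | "completed";

interface NextActionSketch {
  type: ActionType;
  icon: string;
  title: string;
  description: string;
  label: string;
}

interface PostResponseSketch {
  /** base64 encoded serialized transaction (placeholder text in this sketch) */
  transaction: string;
  message?: string;
  links?: { next: string | NextActionSketch };
}

// Form 1: `links.next` is a same-origin callback url. After the transaction
// confirms, the blink client would POST the signature and account to it.
const withCallback: PostResponseSketch = {
  transaction: "<base64 encoded serialized transaction>",
  links: { next: "/api/donate/next" }, // hypothetical callback path
};

// Form 2: `links.next` is an inline NextAction. The blink client would render
// it right after confirmation and make no callback request.
const withInlineNext: PostResponseSketch = {
  transaction: "<base64 encoded serialized transaction>",
  message: "Thanks for donating!",
  links: {
    next: {
      type: "completed",
      icon: "https://example.com/icon.png", // hypothetical icon url
      title: "Donation complete",
      description: "Your donation was received.",
      label: "Done",
    },
  },
};
```

The inline form suits cases where the follow-up metadata is already known when the transaction is built; the callback form lets the Action API server inspect the signature before deciding what to show next.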

```ts filename="NextActionPostRequest"
export interface NextActionPostRequest extends ActionPostRequest {
/** signature produced from the previous action (either a transaction id or message signature) */
signature: string;
}
```

After the `transaction` included in the `ActionPostResponse` is signed by the
user and confirmed on-chain, the blink client should either:

- execute the callback request to fetch and display the `NextAction`, or
- if a `NextAction` is already provided via `links.next`, update the displayed
metadata and make no callback request

If the callback url is not the same origin as the initial POST request, no
callback request should be made. Blink clients should display an error notifying
the user.
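
One way a blink client might apply the same-origin rule before making the callback is sketched below. The function names `isSameOrigin` and `planCallback`, the `CallbackPlan` shape, and the error text are hypothetical and not part of the spec. The sketch only prepares the request body (the user's `account` plus the `signature`) and performs no network call.

```ts
/** Body of the callback POST: the user's account plus the previous signature. */
interface NextActionPostRequestSketch {
  account: string;
  signature: string;
}

/** True when `callbackUrl` (absolute or relative) resolves to the origin of `postUrl`. */
function isSameOrigin(callbackUrl: string, postUrl: string): boolean {
  try {
    return new URL(callbackUrl, postUrl).origin === new URL(postUrl).origin;
  } catch {
    return false; // unparsable urls are treated as not same-origin
  }
}

type CallbackPlan =
  | { ok: true; url: string; body: NextActionPostRequestSketch }
  | { ok: false; error: string };

/** Decide whether the callback may be made and, if so, what to send. */
function planCallback(
  callbackUrl: string,
  postUrl: string,
  account: string,
  signature: string,
): CallbackPlan {
  if (!isSameOrigin(callbackUrl, postUrl)) {
    // Per the rule above: make no request and surface an error to the user.
    return { ok: false, error: "Callback url is not same-origin; no request made." };
  }
  return {
    ok: true,
    url: new URL(callbackUrl, postUrl).toString(),
    body: { account, signature },
  };
}
```

Resolving the callback against the original POST url lets relative paths such as `/api/next` work, while an absolute url on another host is rejected.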

```ts filename="NextAction"
/**
* The next action to be presented to the user after the previous action was successful
* (i.e. after the transaction was confirmed on-chain)
* - `action` - a regular action for the user to interact with
* - `completed` - metadata to update the blink UI with. end of the action chain.
*/
export type NextAction<T extends ActionType> = T extends "completed"
? Omit<Action<T>, "links">
: Action<T>;
```

A `NextAction` should be presented to the user via blink clients in one of two
ways, based on the `type`:

- `action` - (default) A standard action that will allow the user to see the
included Action metadata, interact with the provided `LinkedActions`, and
continue to chain any following actions.

- `completed` - The terminal state of an action chain that can update the blink
UI with the included Action metadata to the user, but will not allow the user
to execute further actions.

If `links.next` is not provided, blink clients should assume the current
action is the final action in the chain, presenting their "completed" UI state after
the transaction is confirmed.
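
The branching described in this section could be sketched roughly as follows. This is a minimal illustration, not the reference client behavior: `nextStep`, the `NextStep` union, and the trimmed `ChainedAction` and `ChainedPostResponse` types are hypothetical names introduced for the example.

```ts
// Trimmed, local stand-ins for the spec types (illustration only).
interface ChainedAction {
  /** defaults to "action" when omitted */
  type?: "action" | "completed";
  label: string;
}

interface ChainedPostResponse {
  transaction: string;
  links?: { next: string | ChainedAction };
}

type NextStep =
  | { kind: "callback"; url: string } // POST signature + account to this url
  | { kind: "render"; action: ChainedAction; terminal: boolean }
  | { kind: "completed" }; // no `links.next`: show the client's own completed UI

/** Decide what to do once the transaction has been confirmed on-chain. */
function nextStep(response: ChainedPostResponse): NextStep {
  const next = response.links?.next;
  if (next === undefined) return { kind: "completed" };
  if (typeof next === "string") return { kind: "callback", url: next };
  // An inline NextAction: `completed` ends the chain, anything else continues it.
  return { kind: "render", action: next, terminal: next.type === "completed" };
}
```

A client would presumably run the same check again on whatever the callback returns, since that response is itself a `NextAction`.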

### actions.json

The purpose of the [`actions.json` file](#actionsjson) allows an application to