SIMD-0057: Events #57

Open · wants to merge 18 commits into `main`
341 changes: 341 additions & 0 deletions proposals/0057-events.md
---
simd: '0057'
title: Events
authors:
- Maximilian Schneider ([email protected])
- Nicholas Clarke ([email protected])
category: Standard
type: Core
status: Review
created: 2023-06-13
feature: (fill in with feature tracking issues once accepted)
---

## Summary

Propose a new event logging mechanism that is reliable and efficient.
Browser clients should be able to receive emitted events with minimal latency.
Third-party indexers should be able to horizontally scale access to historic
events.

## Motivation

Application developers need an easy way to reliably track the on-chain state
of their programs. In theory all account updates can be streamed at line rate
out of a validator, but clients often lack the bandwidth to consume them.

Common use cases would be: executing a market order and reporting the filled
price to the user; collecting all orders on a market and aggregating them into
hourly candles; allowing UIs to listen to L3 order updates instead of streaming
full account updates for a large orderbook account.

A new syscall exposed to the runtime should allow dapps to reliably record
semantic and materialized (not raw) historical data, which clients subscribe
to via RPC. Clients can filter the events they subscribe to with various rules,
so that they only receive the updates needed for a consistent client
experience. For CLOBs, UI bandwidth requirements would drop from 4 MB/s to
below 300 kB/s.


## Alternatives Considered

Anchor events are a very popular solution: they pass base64-encoded binary Rust
structs as strings to the `sol_log_` syscall. The issue with this approach is
that the `sol_log_` destination buffer is truncated on many RPC nodes for
complex transactions with a lot of logs. Since the buffer size is an operator
setting, some operators choose very large log buffers, but not every RPC
operator provides this service to their customers. It is also important that
the RPC nodes feeding historical transactions into BigTable had a large buffer;
otherwise you will experience truncation when fetching logs older than the
RPC's local history, even if you configured your own RPC with a large log
buffer. Another issue is that these logs also record runtime information
(compute units consumed) and plaintext debug messages, which makes them
difficult to parse; this is why many developers avoid structured event
logging and just use formatted strings.
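
For illustration, a minimal sketch (not part of this proposal) of how such
log-based events are typically recovered today. It assumes the Anchor
convention that `emit!` writes the payload through `sol_log_data`, which
surfaces as a `Program data:` log line; the helper name is hypothetical. If the
node truncated the log buffer, events are silently missing:

```
import { Connection, TransactionSignature } from "@solana/web3.js";

// Sketch only: decodes Anchor-style "Program data: <base64>" log lines from a
// confirmed transaction. Assumes one base64 blob per line, as Anchor's emit!
// produces. Truncated logs simply yield fewer (or zero) events.
async function readLoggedEvents(
  conn: Connection,
  signature: TransactionSignature,
): Promise<Buffer[]> {
  const tx = await conn.getTransaction(signature, {
    maxSupportedTransactionVersion: 0,
  });
  const prefix = "Program data: ";
  return (tx?.meta?.logMessages ?? [])
    .filter((line) => line.startsWith(prefix))
    .map((line) => Buffer.from(line.slice(prefix.length), "base64"));
}
```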

Many developers would prefer a reliable and universally compatible solution,
so Anchor recently added a new feature called cpi_events. This is a convention
for calling your own program via CPI and encoding a binary payload as
instruction data. It can be indexed via the RPC transaction response, as
innerInstruction data is always persisted. It is more reliable than `sol_log_`,
but it effectively reduces runtime limits such as the maximum CPI call depth
for the actual application. It also causes unnecessary overhead in replay, as a
new execution context needs to be created for something as trivial as a log
message. For the average program (1-2 MB) that is around 5000-9000 CU per
1 kB log message.

In comparison to Ethereum and Cosmos, Solana's preferred access path for
transaction metadata is address filtering, e.g. the `getSignaturesForAddress`
call.

1. Topics: the Ethereum yellow paper instead specifies that every event can
carry up to 4 topics. Each topic is a 32-byte subscription key that users can
use to filter events via RPC.

This seems restrictive compared to the expressiveness already guaranteed by
the context of each instruction (program id and account metas); this data is
available at the RPC node anyway.

2. Schema: the Cosmos SDK requires every event to have a protobuf schema,
which makes it very easy for third parties to build generalized indexers for
the Cosmos ecosystem. It also allows specifying complex filter queries that
can be evaluated on the RPC node to stream or retrieve events.

Parsing the event payload is not a task for the RPC layer and should instead
be handled by third-party providers. As schemas on Solana are flexible and can
change with program updates, this would require access to historic schema
definitions / account state.

## New Terminology

Discriminator - a unique n-byte sequence, often an 8-byte hash of a method
signature or data type, used to tag the type of a data structure. It is always
encoded in the very first bytes of an event, account or instruction for the
sake of efficient parsing.
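
As a concrete illustration, a minimal sketch assuming an Anchor-style
convention where the discriminator is the first 8 bytes of a SHA-256 hash of a
namespaced type name; other programs may derive theirs differently, and the
`FillEvent` name is purely hypothetical:

```
import { createHash } from "crypto";

// Hypothetical helper: derive an 8-byte event discriminator from a type name.
// The "event:" namespace follows the Anchor convention; it is not mandated here.
function eventDiscriminator(name: string): Buffer {
  return createHash("sha256").update(`event:${name}`).digest().subarray(0, 8);
}

// The discriminator occupies the first 8 bytes of the event payload, so
// indexers can cheaply dispatch on the type before decoding the body.
const fillEventDisc = eventDiscriminator("FillEvent");
```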

## Detailed Design

### Runtime

In order to log an event, the SVM runtime must implement a new syscall
`sol_emit_`. The runtime must charge `100 + 0.25 * len` CU per invocation to
account for the potential cost of including a BLAKE3 hash of all log messages
in consensus.
> **Contributor:** I don't see a section specifying how this would be included
> in consensus, what's the plan here?
>
> **Author:** There is no concrete plan yet. I wanted to leave that open for a
> future SIMD, as the roadmap towards light clients is not clear yet.


```
#include <sol/emit.h>
void sol_emit_(/* r1 */ const char *message,
               /* r2 */ uint64_t len);
```

> **Reviewer:** `const uint8_t *message`, unless it's really only characters.
>
> **Reviewer:** Actually, why not an array of slices? That saves the BPF
> program from having to concatenate.
>
> **Reviewer:** This looks like `sol_log_data()` with some RPC magic; why not
> reuse `sol_log_data()`? In fact, `sol_log_data()` was introduced for events
> (with Solang).
>
> **Author (@mschneider, Nov 14, 2024):** The change of CU cost is because the
> size restriction is no longer a client-side decision. `sol_log_` truncates
> for programs that log a lot of data, depending on the validator
> configuration.

The validator should preserve these events similarly to transaction history
and deliver them to its operators via the RPC & Geyser interfaces. Although
those interfaces are technically not part of the SIMD specification process,
it is important to publish a recommended spec for third-party providers so
that Solana dapps don't run into vendor lock-in issues.

The cost for this operation is derived from a possible future inclusion of a
hash of all events in consensus, so that future light clients can verify the
authenticity of events received from third-party RPC providers.

Bank deltas are currently hashed via SHA-256, and the sha256 syscall charges
85 + 0.5 CU per byte. BLAKE3, a roughly 2x faster hash, would be sufficient as
well, hence `100 + 0.25 * len` CU would be a good approximation in this case.
CPI for event logging would only be "cheaper" for >4 kB event payloads, but
most events are <512 bytes in length. This cost enforces theoretical limits on
the amount of logs per transaction & block:

- Limit per 200k CU: 800 kB
- Limit per 48M CU: 200 MB

> **Contributor:** Current best implementations of BLAKE3 are significantly
> slower than SHA-256 below 4 kB
> (https://forum.solana.com/t/blake3-slower-than-sha-256-for-small-inputs/829).
> I have an experimental rewrite that makes BLAKE3 fast again, but that won't
> get backported to Rust anytime soon. Therefore, I think we should use the
> SHA-256 pricing instead.
>
> **Author:** Ack.
> **Contributor:** It seems odd to me that this limit, which is per block, is
> higher than the amount of state growth we allow per block. That to me implies
> that the CU cost is too low? Why would we let events emit faster than our
> state growth if infrastructure has to store both?
>
> **Author:** Happy to go with double the cost per byte, if that simplifies the
> process moving forward.
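
To make the pricing concrete, a small worked example under the
`100 + 0.25 * len` assumption; as the threads above note, the per-byte rate may
still change:

```
// Worked example of the proposed pricing; the numbers shift if the rate is doubled.
const emitCost = (len: number): number => 100 + 0.25 * len;

emitCost(512);   // 228 CU for a typical sub-512-byte event
emitCost(1024);  // 356 CU, versus roughly 5000-9000 CU for a 1 kB CPI-based event

// Upper bounds on emitted bytes per CU budget (ignoring the 100 CU base per call):
const perTx = 200_000 / 0.25;       // 800,000 bytes ~= 800 kB per 200k CU transaction
const perBlock = 48_000_000 / 0.25; // 192,000,000 bytes ~= 200 MB per 48M CU block
```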


### RPC-Client

Clients should be able to subscribe to emitted events with various filters,
but the restriction to a single program id is always required to prevent
spoofing of events:

```
rpc.onEvents(
  programId: PublicKey,
  callback: EventCallback,
  commitment?: Commitment,
  filters?: EventsFilter[]);
```

Each emitted event invokes the callback. The context of the instruction that
emitted the event is referenced through:

- `signature`: the transaction's first signature
- `suid`: a sequence number that follows the natural order of events as
emitted during the execution of a single transaction, starting from 0
- `outerInstructionIndex` & `innerInstructionIndex`: identify the instruction
which emitted the event

This ensures that clients can easily perform the following workflows:

1. deduplicate events received from multiple validators by the compound unique
index formed through (slot, signature, suid)
2. fetch the remaining details like account keys and instruction data
efficiently using `getTransaction`

```
EventCallback: ((
  data: Buffer,
  instructionContext: {
    signature: TransactionSignature,
    suid: number,
    outerInstructionIndex: number,
    innerInstructionIndex?: number,
  },
  context: {
    slot: number
  }
) => void)
```

DataSizeFilter and MemcmpFilter can be used to filter for specific event types
using their discriminator. SignatureFilter allows restricting to events
emitted by an individual transaction, e.g. while waiting for its confirmation.

```
EventsFilter: (DataSizeFilter|MemcmpFilter|SignatureFilter)
DataSizeFilter: {
  dataSize: number;
}
MemcmpFilter: {
  memcmp: {
    bytes: string;
    offset: number;
  };
}
SignatureFilter: {
  signature: TransactionSignature
}
```
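
A minimal usage sketch against the interface above. The `rpc` handle, the
`FillEvent` name and the discriminator helper are illustrative assumptions,
and the byte encoding expected by `memcmp.bytes` is left to the final spec:

```
import { PublicKey, TransactionSignature } from "@solana/web3.js";

declare const rpc: any;                // a client implementing the proposed onEvents()
declare const programId: PublicKey;    // the program being watched
declare function eventDiscriminator(name: string): Buffer; // see New Terminology

const seen = new Set<string>();

rpc.onEvents(
  programId,
  (data: Buffer,
   ictx: { signature: TransactionSignature; suid: number },
   ctx: { slot: number }) => {
    // Deduplicate events received from multiple validators via (slot, signature, suid).
    const key = `${ctx.slot}:${ictx.signature}:${ictx.suid}`;
    if (seen.has(key)) return;
    seen.add(key);
    // data starts with the 8-byte discriminator, followed by the program-defined body.
  },
  "confirmed",
  [{ memcmp: { offset: 0, bytes: eventDiscriminator("FillEvent").toString("base64") } }],
);
```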

Clients should be able to query signatures that emitted events with various
filters. Any query must be constrained to a single programId to allow for
efficient indexing. Options must allow paginating the result so that response
size and latency can be controlled for programs with a lot of confirmed
transactions. Either `before` or `after` can be used; their signature and suid
define, respectively, the beginning or the end of the returned sequence, which
does not include the corresponding event itself. When both `before` and
`after` are used, the `before` option has priority, meaning the sequence
starts one event before the specified one and ends either when reaching the
limit or with the event after the one specified by `after`. The returned event
sequence is always in inverse chronological order with respect to how
transactions are placed in entries and then in a block, i.e. sorted descending
by the compound key: (slot, block_entry, transaction_index_in_entry, suid).

```
EventLookupFilter: (DataSizeFilter|MemcmpFilter)

rpc.getSignaturesForEvent(
  programId: PublicKey,
  filters?: EventLookupFilter[],
  options?: {
    commitment?: Commitment,
    minContextSlot?: number,
    limit?: number,
    before?: {
      signature: TransactionSignature,
      suid: number,
    },
    after?: {
      signature: TransactionSignature,
      suid: number,
    },
  }
): Promise<TransactionSignature[]>
```
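
For example, fetching the most recent signatures that emitted a particular
event type. As before, `rpc` stands for a client implementing the proposed
method and the discriminator helper is hypothetical:

```
import { PublicKey, TransactionSignature } from "@solana/web3.js";

declare const rpc: any;
declare const programId: PublicKey;
declare function eventDiscriminator(name: string): Buffer;

// The latest 100 transactions that emitted a FillEvent, newest first.
async function recentFillSignatures(): Promise<TransactionSignature[]> {
  return rpc.getSignaturesForEvent(
    programId,
    [{ memcmp: { offset: 0, bytes: eventDiscriminator("FillEvent").toString("base64") } }],
    { commitment: "confirmed", limit: 100 },
  );
}
```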

In addition, clients should be able to query events directly and index them
reliably, without having to resolve signatures independently. Due to block
re-orgs it is recommended to populate the `after` option with the signature
and suid of the last known finalized event. Alternatively, the commitment
level of the whole response can be restricted to finalized.

```
rpc.getEvents(
  programId: PublicKey,
  filters?: EventLookupFilter[],
  options?: {
    commitment?: Commitment,
    encoding: 'base64',
    minContextSlot?: number,
    limit?: number,
    before?: {
      signature: TransactionSignature,
      suid: number,
    },
    after?: {
      signature: TransactionSignature,
      suid: number,
    },
  }
): Promise<EventResponse[]>

EventResponse: {
  data: Buffer,
  instructionContext: {
    signature: TransactionSignature,
    suid: number,
    outerInstructionIndex: number,
    innerInstructionIndex?: number,
  },
  context: {
    slot: number,
    confirmationStatus: "processed" | "confirmed" | "finalized"
  }
}
```
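
A sketch of the re-org-safe polling pattern described above, anchoring each
query with `after` set to the last finalized event already indexed. Cursor
persistence and payload decoding are left out, and `rpc` again stands in for a
client implementing the proposed method:

```
import { PublicKey, TransactionSignature } from "@solana/web3.js";

declare const rpc: any;
declare const programId: PublicKey;

type EventCursor = { signature: TransactionSignature; suid: number };

// Fetch everything newer than the last finalized event we indexed and return
// the next cursor to resume from.
async function pollFinalizedEvents(lastIndexed: EventCursor): Promise<EventCursor> {
  const events = await rpc.getEvents(programId, undefined, {
    commitment: "finalized",
    encoding: "base64",
    after: lastIndexed,
  });
  // Responses arrive in inverse chronological order; reverse before applying.
  for (const ev of [...events].reverse()) {
    // persist ev.data keyed by (ev.context.slot,
    //   ev.instructionContext.signature, ev.instructionContext.suid)
  }
  // The newest event (first element) becomes the next cursor, if any arrived.
  return events.length > 0
    ? {
        signature: events[0].instructionContext.signature,
        suid: events[0].instructionContext.suid,
      }
    : lastIndexed;
}
```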

For previewing transactions before signing, clients should be able to receive
events when calling `simulateTransaction`. The response should be extended
to include an array of EventResult objects under the key "events".

```
EventResult: {
  data: [string, "base64" | "base64+zstd"],
  programId: string,
  outerInstructionIndex: number,
  innerInstructionIndex?: number,
}
```
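
A sketch of a pre-signing preview, assuming `simulateTransaction` is extended
with the `events` array as proposed; decoding of the `base64+zstd` variant is
omitted:

```
// `sim` stands for the (extended) simulateTransaction result value.
type SimEvent = {
  data: [string, "base64" | "base64+zstd"];
  programId: string;
  outerInstructionIndex: number;
  innerInstructionIndex?: number;
};

// Returns the raw event payloads so the UI can decode and display e.g. the
// fill price before the user signs.
function decodeSimulatedEvents(sim: { events?: SimEvent[] }): Buffer[] {
  return (sim.events ?? [])
    .filter((ev) => ev.data[1] === "base64")
    .map((ev) => Buffer.from(ev.data[0], "base64"));
}
```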


### Geyser client

> **Contributor:** We really should not be adding new interfaces in Geyser that
> are also broken for non-Rust languages. We eventually need to fix the old
> ones, but imo we shouldn't add new ones that are also broken - please make
> the new Event structs ffi-safe.
>
> **Author:** Had a look at this PR anza-xyz/agave#2125 and updated the events
> struct, lmk if that works for you.

Clients should be able to receive all emitted events as part of the
transaction notification:

```
#[derive(Clone, Debug, PartialEq)]
pub struct Event {
    pub message: Vec<u8>,
    pub suid: u32,
    pub outer_instruction_index: u16,
    pub inner_instruction_index: Option<u16>,
}

#[derive(Clone, Debug, PartialEq)]
pub struct TransactionStatusMeta {
    pub status: TransactionResult<()>,
    pub fee: u64,
    pub pre_balances: Vec<u64>,
    pub post_balances: Vec<u64>,
    pub inner_instructions: Option<Vec<InnerInstructions>>,
    pub log_messages: Option<Vec<String>>,
    pub pre_token_balances: Option<Vec<TransactionTokenBalance>>,
    pub post_token_balances: Option<Vec<TransactionTokenBalance>>,
    pub rewards: Option<Rewards>,
    pub loaded_addresses: LoadedAddresses,
    pub return_data: Option<TransactionReturnData>,
    pub compute_units_consumed: Option<u64>,
    pub events: Option<Vec<Event>>,
}
```

Once the current token programs have been migrated to emit events (as ERC-20
tokens do), the `pre_token_balances` / `post_token_balances` fields in
TransactionStatusMeta could be deprecated.

## Impact

This will greatly simplify indexing and tooling for most of the complex
programs on Solana. It will help new developers with EVM experience onboard to
Solana more quickly, and it will let third-party indexing providers offer
better data services thanks to a standard way to parse application events. An
example would be wallets that can easily parse and display incoming token
transfers. Once sufficiently adopted, events should supersede all custom
spl-token-related RPC features, like token balance changes etc.

## Security Considerations

The CU limit for sol_emit_ needs to be carefully adjusted given the heightened
indexing requirements for the RPC node. Events should never be truncated and
must be fully persisted.

## Backwards Compatibility *(Optional)*

This feature will mostly supersede the existing logging functionality for the
purpose of indexing, but developers will still need to debug a program with a
quick text message, so it would be best to have the two co-exist for as long
as possible.