Aura node architecture

The Aura node can be divided into four primary components: Ingester, Synchronizer, Backfiller, and API. These core workers operate continuously to ensure data consistency and availability.

  • Ingester and Backfiller: These components are responsible for populating the main database—RocksDB. The ingester handles real-time data ingestion, while the backfiller ensures that any missed data is retroactively inserted into RocksDB.

  • Synchronizer: This worker moves data from RocksDB to PostgreSQL, enabling more complex API queries that require relational data handling and advanced querying capabilities.

  • API: The API layer is responsible for serving data to external consumers, providing easy access to indexed information.

Additional Workers

In addition to the primary workers, there are six auxiliary workers available for specialized tasks:

  • Migrator

  • RawBackfiller

  • RawBackup

  • ColumnCopier

  • ColumnRemover

  • ForkDetector

Ingester

This module receives the latest updates from the Solana blockchain, indexes them, and stores them in the primary database—RocksDB.

The core function of this module is to retrieve updates from the data source (either Redis or TCP stream), parse the data, and save the relevant transaction and account information into RocksDB.

It can source the latest transactions and account updates via Redis or a direct TCP connection, depending on the configuration of the message_source parameter. While both options are supported, using Redis has a significant advantage: during Aura node updates, no data will be lost. Redis stores all updates during downtime, eliminating the need to re-parse the entire accounts snapshot to restore the current network state after an update.
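
A minimal sketch of what the message_source switch could look like in Rust; the document does not specify the actual config types, so the names below are assumptions:

```rust
// Hypothetical representation of the `message_source` parameter; the real
// Aura config types may differ.
enum MessageSource {
    Redis, // survives node downtime: updates are buffered in Redis
    Tcp,   // direct stream: updates during downtime are lost
}

fn parse_message_source(value: &str) -> Result<MessageSource, String> {
    match value.to_ascii_lowercase().as_str() {
        "redis" => Ok(MessageSource::Redis),
        "tcp" => Ok(MessageSource::Tcp),
        other => Err(format!("unknown message_source: {other}")),
    }
}
```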

Data processing

The Aura node processes both account updates and transactions. Account updates are essential for handling NFTs created with the MPL Metadata or MPL Core Solana programs. On the other hand, transactions are needed for processing compressed NFTs (cNFTs), as all cNFT data is stored within instruction arguments and events.

The system uses two key components for parsing and storing data into RocksDB:

  • AccountsProcessor: Responsible for parsing and saving account-related data.

  • BubblegumTxProcessor: Handles the processing of compressed NFT transactions, saving relevant data to RocksDB. Both of these workers exclusively save data to RocksDB.

Additionally, there is a BatchMintPersister, which processes batch minting operations through a separate queue from accounts and transactions. Its task is to download a JSON file containing the assets to be minted, reconstruct the local Merkle tree, and verify that the provided data in the JSON file is accurate. If everything checks out, it stores the assets from the file into RocksDB.
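
To make the verification step concrete, here is a hedged sketch of the root check. It uses a plain SHA-256 pairwise Merkle root for illustration; the real implementation works with Bubblegum's concurrent Merkle tree and its own hashing scheme, so treat the names and padding rule below as assumptions:

```rust
use sha2::{Digest, Sha256}; // requires the `sha2` crate

fn hash_pair(a: &[u8; 32], b: &[u8; 32]) -> [u8; 32] {
    let mut h = Sha256::new();
    h.update(a);
    h.update(b);
    h.finalize().into()
}

/// Fold leaf hashes into a Merkle root (illustrative padding rule: duplicate
/// the last node on odd-sized levels).
fn merkle_root(mut level: Vec<[u8; 32]>) -> [u8; 32] {
    assert!(!level.is_empty());
    while level.len() > 1 {
        if level.len() % 2 == 1 {
            level.push(*level.last().unwrap());
        }
        level = level.chunks(2).map(|p| hash_pair(&p[0], &p[1])).collect();
    }
    level[0]
}

/// The persister-style check: store the assets only if the tree rebuilt from
/// the downloaded JSON matches the expected root.
fn batch_mint_is_valid(leaves: Vec<[u8; 32]>, expected_root: [u8; 32]) -> bool {
    merkle_root(leaves) == expected_root
}
```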

Consistency checks

In addition to the main ingester task, which collects the latest NFT updates, there are several auxiliary workers responsible for identifying and filling gaps in the data. These workers include SignatureFetcher, SequenceConsistentGapFiller, and ForkCleaner.

  • SignatureFetcher: This worker fetches all transaction signatures for the Bubblegum program using the Solana RPC. As new transactions are processed (from sources like TCP or Redis), their signatures are stored in RocksDB. SignatureFetcher compares the signatures in RocksDB with those returned by the RPC. If discrepancies are found, it indicates a gap in the data. When a gap is detected, SignatureFetcher retrieves the missing transactions from the RPC and processes them to maintain data consistency.

  • SequenceConsistentGapFiller: This worker detects gaps in Merkle tree sequences. Each cNFT update action (e.g., transfer or update) increments the tree sequence, and these sequences are tracked in RocksDB for each tree. If a sequence is missing (e.g., seq n+1 is skipped), the worker signals the need to reprocess blocks in the range where the gap occurred (i.e., from seq n to seq n+2). It pushes the affected block numbers to the force_reingestable_slots column family. Another worker then iterates over this column family, downloading and parsing the required blocks to fill the sequence gap. A sketch of the gap check is shown after this list.

  • ForkCleaner: This worker periodically checks the LeafSignature column family in RocksDB, which stores signatures for all processed transactions. ForkCleaner looks for signatures that exist in forks, typically indicated by identical signatures with different slots and Merkle tree sequences. Upon detecting a fork, ForkCleaner removes entries from CLItems that correspond to forked sequences and deletes those sequences from the TreeSeqIdx column family. This allows SequenceConsistentGapFiller to identify the missing sequences and trigger the reprocessing of affected transactions.
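
The gap check itself reduces to scanning consecutive sequence numbers. A minimal sketch, assuming the sequences for one tree are read from TreeSeqIdx in ascending order (function names are illustrative, not the Aura API):

```rust
/// Returns the (from, to) ranges that bracket each gap, mirroring the
/// "reprocess from seq n to seq n+2" logic described above.
fn find_seq_gaps(sorted_seqs: &[u64]) -> Vec<(u64, u64)> {
    sorted_seqs
        .windows(2)
        .filter(|w| w[1] > w[0] + 1)
        .map(|w| (w[0], w[1]))
        .collect()
}

fn main() {
    // seq 4 is missing: blocks between seq 3 and seq 5 must be reprocessed
    assert_eq!(find_seq_gaps(&[1, 2, 3, 5]), vec![(3, 5)]);
}
```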

Backfiller

The Backfiller is a background job that performs different tasks depending on its mode. Some of these are one-time jobs, while others run continuously.

It has four modes of operation. Below is a brief description of each of them.
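
The four modes, expressed as an illustrative Rust enum (the actual config representation in Aura may differ):

```rust
enum BackfillerMode {
    IngestDirectly,   // one-time: collect slots, fetch blocks, ingest directly
    Persist,          // one-time: collect slots and persist raw blocks only
    IngestPersisted,  // one-time: parse raw blocks already stored in RocksDB
    PersistAndIngest, // continuous: collect, persist, and ingest new blocks
}
```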

Ingest Directly

This is a one-time job.

The consumer is the DirectBlockParser, which is a struct with a Bubblegum transactions parser.

The producer is the BackfillSource; the inner object can be either a BigTable client or an RPC client.

It launches the SlotsCollector with start-from and parse-until parameters to collect slots (u64 numbers) for a pubkey.

The SlotsCollector saves slots to the BubblegumSlots Rocks CF.

Then the TransactionsParser is launched. It uses the BubblegumSlotGetter to get slots to process from the BubblegumSlots CF.

The block producer here is the BackfillSource.

It processes blocks, saves transaction results, and then drops the processed slot numbers from the Rocks BubblegumSlots CF, adding them to the IngestableSlots CF.

It doesn’t save any parameters and doesn’t save raw blocks.

Persist

This is a one-time job.

The consumer is RocksDB.

The producer is the BackfillSource (either BigTable or RPC).

The slots to start from and parse until are taken from the config.

The SlotsCollector collects slots and saves them to the BubblegumSlots Rocks CF.

The TransactionsParser is launched to fetch the block by slot number and persist it to the Rocks RawBlock CF.

Once a block is persisted, its number is dropped from BubblegumSlots and added to the IngestableSlots Rocks CF.

It doesn’t save any parameters.

Ingest Persisted

This is a one-time job.

The consumer is the DirectBlockParser, which is a struct with a Bubblegum transactions parser.

The producer is RocksDB.

At startup, the TransactionsParser determines the slot to start from: it either takes this value from the config or starts iterating from the beginning of the raw_blocks_cbor Rocks CF.

For the DirectBlockParser, the already_processed_slot function always returns false, so it will parse everything.

The block is extracted from the producer, RocksDB.

The consumer receives the block and processes it. More specifically, it parses transactions, calls get_ingest_transaction_results() to get TransactionResult, and saves it to the Rocks.

Once it has parsed all the blocks, it saves the maximum slot number to the LastFetchedSlot RocksDB parameter. This allows the backfiller to be restarted in PersistAndIngest mode, where it will start collecting new slots and blocks that are not yet in the DB.

Once it finishes, it does not perform any post-backfill jobs.

Persist And Ingest

This is a continuous job.

Three workers run in this mode: perpetual slot collection, perpetual slot processing (block fetching and saving), and perpetual block ingestion.

Perpetual Slot Collection

The consumer is RocksDB.

The producer for slots is the BackfillSource (BigTable or RPC).

It takes the parse_until slot from the RocksDB LastFetchedSlot parameter. If there is no value, it takes it from the config.

Slot numbers are saved to the BubblegumSlots Rocks CF.
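
A minimal sketch of this resume rule, assuming a helper that has already read the persisted parameter (names are illustrative):

```rust
/// Prefer the LastFetchedSlot value persisted in RocksDB; fall back to the
/// configured slot if the parameter has never been written.
fn resolve_parse_until(last_fetched_slot: Option<u64>, config_parse_until: u64) -> u64 {
    last_fetched_slot.unwrap_or(config_parse_until)
}
```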

Perpetual Slot Processing

The consumer is RocksDB.

The producer is the BackfillSource (BigTable or RPC).

From the BubblegumSlots Rocks CF, slots are extracted, and then the block is downloaded with the help of the BackfillSource.

Once the block is downloaded and saved, the slot is dropped from the BubblegumSlots CF and added to the IngestableSlots CF so the next worker can parse it.

Perpetual Block Ingestion

The consumer is the DirectBlockParser, which is a struct with a Bubblegum transactions parser.

The producer is RocksDB.

The IngestableSlotGetter returns slots from the IngestableSlots CF, then blocks are extracted from the Rocks.

Once a block is received, it’s parsed, and the slot is dropped from the IngestableSlots CF.
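
Put together, the ingestion worker is a queue-drain loop. A hedged sketch with stand-in traits (the real Aura interfaces differ):

```rust
trait SlotQueue {
    fn next_slot(&self) -> Option<u64>;
    fn remove(&self, slot: u64);
}

trait BlockStore {
    fn raw_block(&self, slot: u64) -> Option<Vec<u8>>;
}

/// Pull a slot from the IngestableSlots queue, load the persisted raw block,
/// parse it, then drop the slot from the queue.
fn ingest_loop(queue: &impl SlotQueue, store: &impl BlockStore, parse: impl Fn(&[u8])) {
    while let Some(slot) = queue.next_slot() {
        if let Some(block) = store.raw_block(slot) {
            parse(&block);
        }
        queue.remove(slot);
    }
}
```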

Json processor

This worker is responsible for downloading JSON files. During database synchronization, the Synchronizer assigns tasks to download any missing JSONs. The JSON Processor handles these tasks by retrieving them from PostgreSQL and then storing the downloaded JSON files into RocksDB.
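
A hedged sketch of that loop; the task schema, locking, and storage calls below are assumptions based on the description, not the actual Aura code (uses the reqwest crate with the blocking feature):

```rust
struct Task {
    metadata_url: String,
}

trait TaskSource {
    fn lock_pending(&self, limit: usize) -> Vec<Task>; // tasks from PostgreSQL
    fn mark_done(&self, task: &Task, success: bool);
}

trait OffchainStore {
    fn put_metadata(&self, url: &str, json: &str); // OffChainData CF in RocksDB
}

fn process_json_tasks(tasks: &impl TaskSource, store: &impl OffchainStore) {
    for task in tasks.lock_pending(100) {
        // blocking download for brevity; the real worker is likely async
        match reqwest::blocking::get(task.metadata_url.as_str()).and_then(|r| r.text()) {
            Ok(body) => {
                store.put_metadata(&task.metadata_url, &body);
                tasks.mark_done(&task, true);
            }
            Err(_) => tasks.mark_done(&task, false),
        }
    }
}
```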

Backup worker

Coming soon...

GRPC client

Coming soon...

Synchronizer

Since the Aura node uses two databases—RocksDB and PostgreSQL—a tool is needed to ensure data consistency between them. For this, the AssetsUpdateIdx column family in RocksDB stores all asset update indexes. These indexes consist of the sequence number, slot, and account public key. The sequence is an internal counter tracking updates and is unrelated to the Merkle tree sequence. Every update that the ingester processes is saved to this column family.

On the PostgreSQL side, the same index is stored to track the latest synchronized update.
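
A sketch of the index key layout, assuming big-endian encoding so keys sort chronologically (the exact field widths and order in Aura are not specified here):

```rust
/// Build an AssetsUpdateIdx-style key: sequence, then slot, then pubkey.
/// Big-endian integers make lexicographic key order match numeric order.
fn asset_update_idx_key(seq: u64, slot: u64, pubkey: &[u8; 32]) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 8 + 32);
    key.extend_from_slice(&seq.to_be_bytes());
    key.extend_from_slice(&slot.to_be_bytes());
    key.extend_from_slice(pubkey);
    key
}
```

With a layout like this, the Synchronizer can resume by iterating RocksDB keys strictly greater than the last key recorded in PostgreSQL.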

API

Coming soon...

Migrator

A legacy utility for migrating JSONs and tasks from one database to another.

Raw backfiller

This is a separate tool that performs all the functions of the Backfiller from the Ingester, except for direct ingestion. It shares the same codebase as the Ingester.

The tool is primarily used for downloading large amounts of raw blocks or parsing blocks that have already been downloaded. A separate binary was created to allow these processes to run independently. It’s typically necessary to use this tool when setting up a new node.

In the scripts/ directory, you’ll find two bash scripts to execute these processes:

  • run-ingest-persisted (for parsing already downloaded blocks)

  • run-slots-persisting (for downloading and saving raw blocks)

Raw backup

This tool creates backups of raw blocks and JSONs. It works by iterating through all blocks and JSONs in the source RocksDB and copying them to the target RocksDB. This is particularly useful when you want to create a backup of raw, non-indexed data. The backup can later be used by the Aura node itself or by other indexers that utilize different data structures.
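
A minimal copy loop using the rust-rocksdb crate (CF names and error handling are simplified; treat this as a sketch, not the tool's actual code):

```rust
use rocksdb::{IteratorMode, DB};

/// Copy every key/value pair of one column family from a source DB to a
/// target DB.
fn backup_cf(source: &DB, target: &DB, cf_name: &str) -> Result<(), rocksdb::Error> {
    let src_cf = source.cf_handle(cf_name).expect("cf missing in source");
    let dst_cf = target.cf_handle(cf_name).expect("cf missing in target");
    for item in source.iterator_cf(src_cf, IteratorMode::Start) {
        let (key, value) = item?;
        target.put_cf(dst_cf, key, value)?;
    }
    Ok(())
}
```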

Column copier

This tool copies column data from one RocksDB to another. The source database is opened in secondary mode. It's primarily a development tool, useful for debugging purposes.
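
Opening the source in secondary mode with rust-rocksdb could look like this (paths and CF names are placeholders):

```rust
use rocksdb::{Options, DB};

/// Open the source DB as a read-only secondary instance so the copier can
/// read while the primary process keeps writing.
fn open_source_secondary() -> Result<DB, rocksdb::Error> {
    let opts = Options::default();
    let db = DB::open_cf_as_secondary(
        &opts,
        "/path/to/primary-db",
        "/tmp/secondary-copy",
        ["raw_blocks_cbor"],
    )?;
    db.try_catch_up_with_primary()?; // replay the primary's latest WAL entries
    Ok(db)
}
```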

Column remover

As the name suggests, this tool is used to drop specific columns from RocksDB. Like the Column Copier, it's mainly a development tool, helpful for bug fixing and debugging various cases.

Fork detector

This binary is designed to detect transactions that were part of a fork, particularly identifying cNFTs that were updated in these forked transactions.

The script was necessary because the previous fork cleaner could incorrectly handle data removal when a fork occurred. Specifically, if the same asset is updated in multiple blocks (one of which is forked) and those blocks have different sequences, the cleaner doesn't properly resolve the discrepancy. It may remove one sequence but leave the other, which can lead to problems. If the sequence from the forked block (which may be higher) is dropped, the tool won’t backfill the lower sequence that was accepted by the majority of validators.

It's important to run this binary with the indexer turned off.

Once a fork is detected, the binary removes the corresponding sequences. Afterward, when the indexer is relaunched, the SequenceConsistentGapFiller identifies any gaps in the sequences and fills them appropriately.

The current version of the fork cleaner handles forks efficiently, so this tool doesn't need to be run continuously.
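
The detection criterion reduces to the same transaction signature being recorded under more than one slot. An illustrative check over LeafSignature-style entries (simplified types, not the actual schema):

```rust
use std::collections::HashMap;

/// Return signatures that appear with more than one distinct slot, i.e.
/// signatures touched by a fork.
fn forked_signatures<'a>(entries: &'a [(String, u64)]) -> Vec<&'a str> {
    let mut slots_by_sig: HashMap<&str, Vec<u64>> = HashMap::new();
    for (sig, slot) in entries {
        slots_by_sig.entry(sig.as_str()).or_default().push(*slot);
    }
    slots_by_sig
        .into_iter()
        .filter(|(_, slots)| slots.iter().any(|&s| s != slots[0]))
        .map(|(sig, _)| sig)
        .collect()
}
```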

Aura storage

As mentioned earlier, the Aura node utilizes two types of storage: RocksDB and PostgreSQL. RocksDB serves as the primary storage for all processed data, while PostgreSQL functions as an index storage solution for complex API queries, such as searchAsset.

Below is a description of the data stored in each database.

RocksDB

AssetStaticDetails

Stores static information about assets, such as immutable properties.

Key

  • asset pubkey

Fields

  • pubkey
  • specification_asset_class
  • royalty_target_type
  • created_at
  • edition_address

AssetDynamicDetails

Holds dynamic details of assets.

Key

  • asset pubkey

Fields

  • pubkey
  • is_compressible
  • is_compressed
  • is_frozen
  • supply
  • seq
  • is_burnt
  • was_decompressed
  • onchain_data
  • creators
  • royalty_amount
  • url
  • chain_mutability
  • lamports
  • executable
  • metadata_owner
  • raw_name
  • mpl_core_plugins
  • mpl_core_unknown_plugins
  • rent_epoch
  • num_minted
  • current_size
  • plugins_json_version
  • mpl_core_external_plugins
  • mpl_core_unknown_external_plugins

MetadataMintMap

Stores a mapping between metadata accounts and mint accounts so the metadata key does not have to be derived each time (see the sketch after this entry's fields).

Key

  • metadata pubkey

Fields

  • pubkey
  • mint_key
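
Why this map helps: the metadata account is a PDA derived via find_program_address, a bump-seed loop that is cheap but wasteful to repeat on every update. A sketch of the derivation, assuming the solana_sdk crate (the program id below is the well-known Token Metadata program):

```rust
use solana_sdk::pubkey::Pubkey;
use std::str::FromStr;

/// Derive the Token Metadata PDA for a mint.
fn metadata_pda(mint: &Pubkey) -> Pubkey {
    let program_id =
        Pubkey::from_str("metaqbxxUerdq28cj1RbAWkYQm3ybzjb6a8bt518x1s").unwrap();
    let (pda, _bump) = Pubkey::find_program_address(
        &[b"metadata", program_id.as_ref(), mint.as_ref()],
        &program_id,
    );
    pda
}
```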

AssetAuthority

Stores data related to the authority or control over the asset.

Key

  • asset pubkey

Fields

  • pubkey
  • authority
  • slot_updated
  • write_version

AssetOwner

Contains data about the current owner of each asset.

Key

  • asset pubkey

Fields

  • pubkey
  • owner
  • delegate
  • owner_type
  • owner_delegate_seq

AssetLeaf

Stores leaf data related to assets as part of a Merkle tree structure.

Key

  • asset pubkey

Fields

  • pubkey
  • tree_id
  • leaf
  • nonce
  • data_hash
  • creator_hash
  • leaf_seq
  • slot_updated

AssetCollection

Contains collection-level data for assets that belong to specific collections.

Key

  • asset pubkey

Fields

  • pubkey
  • collection
  • is_collection_verified
  • authority

OffChainData

Stores off-chain data associated with the assets, such as metadata (JSON files).

Key

  • url

Fields

  • url
  • metadata

ClItem

Holds CLItem data emitted by the Account Compression program during instruction execution.

Key

  • Merkle tree node id + tree pubkey

Fields

  • cli_node_idx
  • cli_tree_key
  • cli_leaf_idx
  • cli_seq
  • cli_level
  • cli_hash
  • slot_updated

ClLeaf

Stores leaf nodes from the Merkle tree.

Key

  • Merkle tree node id + tree pubkey

Fields

  • cli_leaf_idx
  • cli_tree_key
  • cli_node_idx

BubblegumSlots

Stores slot numbers with transactions related to the Bubblegum program.

Key

  • slot number

IngestableSlots

Stores slot numbers that need to be processed. This column family is populated by the backfiller when it runs in either IngestPersisted or PersistAndIngest mode.

Key

  • slot number

ForceReingestableSlots

Stores slot numbers that must be re-parsed because of a gap in the data. This column family is used only if the sequence_consistent_checker is active. If it finds a gap in a tree sequence, it writes the slots that have to be re-parsed to this column family. The slot_force_persister then iterates over these slots, downloads the blocks, and parses them.

Key

  • slot number

RawBlock

Stores raw block data in CBOR format.

Key

  • block number

Fields

  • data

AssetsUpdateIdx

Stores the index of updated assets. This column family is used by the synchronizer to keep RocksDB and PostgreSQL in sync.

Key

  • sequence + slot + pubkey

SlotAssetIdx

Maps slots to asset updates.

Key

  • slot + pubkey

TreeSeqIdx

Stores the sequence index for Merkle trees. Every sequence update is saved here.

Key

  • tree pubkey + sequence

Fields

  • slot

TreesGaps

Stores pubkeys of trees that have gaps in their sequences.

Key

  • tree pubkey

TokenMetadataEdition

Stores either Edition or MasterEdition information.

Key

  • asset pubkey

Fields

Edition

  • key
  • parent
  • edition
  • write_version

MasterEdition

  • key
  • supply
  • max_supply
  • write_version

TokenAccount

Contains data about token accounts.

Key

  • asset pubkey

Fields

  • pubkey
  • mint
  • delegate
  • owner
  • frozen
  • delegated_amount
  • slot_updated
  • amount
  • write_version

TokenAccountOwnerIdx

Stores a boolean flag indicating whether a wallet's token balance is zero.

Key

  • owner wallet + token account pubkey

Fields

  • is_zero_balance
  • write_version

TokenAccountMintOwnerIdx

Stores a boolean flag indicating whether a wallet's token balance is zero. Unlike TokenAccountOwnerIdx, the data is also sorted by mint.

Key

  • mint + owner + token account

Fields

  • is_zero_balance
  • write_version

AssetSignature

Stores transaction signatures for compressed assets.

Key

  • key + leaf id + sequence

Fields

  • transaction signature
  • instruction name
  • slot

BatchMintToVerify

A queue for batch mint operations to process.

Key

  • file hash

Fields

  • file_hash
  • url
  • created_at_slot
  • signature
  • download_attempts
  • persisting_state
  • staker
  • collection_mint

FailedBatchMint

Stores batch mints that did not pass verification.

Key

  • status + file hash

Fields

  • status
  • file_hash
  • url
  • created_at_slot
  • signature
  • download_attempts
  • staker

BatchMintWithStaker

Stores downloaded batch mint information.

Key

  • file hash

Fields

  • batch_mint
    • tree_id
    • batch_mints
    • raw_metadata_map
    • max_depth
    • max_buffer_size
  • staker

MigrationVersions

Stores RocksDB migration version.

Key

  • version number

TokenPrice

Stores token prices in USD, keyed by token symbol.

Key

  • token symbol

Fields

  • price

AssetPreviews

Represents information about asset previews stored on the Storage service.

Key

  • asset's url hash

Fields

  • size
  • failed

UrlToDownload

RocksDB column family used as a queue of asset URLs to be sent to the Storage service, where they are downloaded and saved as previews.

Key

  • url

Fields

  • timestamp
  • download_attempts

ScheduledJob

Represents information about a background job, which can be either a one-time job or a scheduled job launched recurrently at a given interval.

Key

  • job id

Fields

  • job_id
  • run_interval_sec
  • last_run_epoch_time
  • last_run_status
  • state

Inscription

Stores information about token inscriptions.

Key

  • asset pubkey

Fields

  • authority
  • root
  • content_type
  • encoding
  • inscription_data_account
  • order
  • size
  • validation_hash
  • write_version

InscriptionData

Stores raw inscription data.

Key

  • asset pubkey

Fields

  • pubkey
  • data
  • write_version

LeafSignature

This column family contains sequence updates for each leaf in the tree.

Key

  • set of Signature+TreeId+leafId

Fields

  • data: hash map with slots and sequences

PostgreSQL

Below is a short description of the tables the Aura node has in PostgreSQL.

Asset creators

Stores asset creators.

  • pubkey
  • creator
  • verified
  • slot_updated

Asset authorities

Stores asset authorities.

  • pubkey
  • authority
  • slot_updated

Asset

Stores all the asset info.

  • pubkey
  • specification_version
  • specification_asset_class
  • royalty_target_type
  • royalty_amount
  • slot_created
  • owner
  • owner_type
  • delegate
  • collection
  • is_collection_verified
  • is_burnt
  • is_compressible
  • is_compressed
  • is_frozen
  • supply
  • metadata_url_id
  • slot_updated
  • authority_fk

Batch mints

Batch mints queue.

  • file_name
  • state
  • error
  • url
  • tx_reward
  • created_at

Last synced key

Stores the last synced asset update key. Used by the synchronizer for database synchronization.

  • id
  • last_synced_asset_update_key

Tasks

Stores tasks for the JSON downloader to process NFT metadata.

  • metadata_url
  • status
  • locked_until
  • attempts
  • max_attempts
  • error
  • id