Commit

Merge pull request #117 from shruggr/main
fix: check for nil BlockHash in GetTransaction
boecklim authored Oct 28, 2023
0 parents commit 9e74056
Showing 30 changed files with 3,378 additions and 0 deletions.
Empty file added .nojekyll
Empty file.
89 changes: 89 additions & 0 deletions BIP-239.md
@@ -0,0 +1,89 @@
<pre>
BIP: 239
Layer: Applications
Title: Transaction Extended Format (TEF)
Author:
Simon Ordish (@ordishs)
Siggi Oskarsson (@icellan)
Comments-Summary: No comments yet.
Comments-URI: -
Status: Proposal
Type: Standards Track
Created: 2022-11-09
</pre>

## Abstract

Regular Bitcoin transactions do not contain all the data needed to verify that their signatures are valid. To sign an input of a Bitcoin transaction, the signer needs to know the transaction ID, output index, output satoshis and locking script of the output being spent. When a Bitcoin transaction is sent to a node, only the previous transaction ID and the output index are part of the serialized transaction; the node looks up the locking script and output amount itself.

We propose an Extended Format (EF) for Bitcoin transactions that includes the locking script and the amount in satoshis for every input of the transaction. This allows a broadcast service to validate all aspects of a transaction without having to contact a node or an indexer for the utxos of its inputs, which speeds up validation.

## Copyright

This BIP is licensed under the Open BSV license.

## Motivation

At the moment, verifying that a transaction is valid, including all signatures, is not possible without fetching the unspent transaction outputs (utxos) being spent from a Bitcoin node (or a Bitcoin indexer). This utxo lookup always happens inside a Bitcoin node when it validates a transaction. For a broadcast service to fully validate a transaction (including the fee being paid), it also needs to look up the utxos being spent. That complicates scalability, because the lookup has to go through a node (via RPC) that might be too busy to respond within an acceptable time frame.

A broadcast service would be able to validate a transaction almost in full if the sender also sent the missing data (previous locking scripts and satoshi outputs) from the utxos being used in the transaction. The previous locking scripts and satoshi outputs are needed to sign a new transaction properly, so the missing data is already available when the transaction is created. Serializing the transaction to Extended Format instead of the standard format is therefore no extra work at creation time, but it makes it much easier for a broadcast service to validate the transaction on receipt, before sending it to a node.

The main motivation for this proposal is therefore scalability. When incoming transactions contain all the data that is needed to validate them, without having to contact an external service for missing data, the broadcast service becomes much more scalable.

## Specification

Current Transaction format:

| Field | Description | Size |
|-----------------|------------------------------------------------------|--------------------------------------------------|
| Version no | currently 2 | 4 bytes |
| In-counter | positive integer VI = VarInt | 1 - 9 bytes |
| list of inputs | Transaction Input Structure | \<in-counter> qty with variable length per input |
| Out-counter | positive integer VI = VarInt | 1 - 9 bytes |
| list of outputs | Transaction Output Structure | \<out-counter> qty with variable length per output |
| nLocktime | if non-zero and sequence numbers are < 0xFFFFFFFF: block height or timestamp when transaction is final | 4 bytes |

The Extended Format adds a marker to the transaction format:

| Field | Description | Size |
|-----------------|--------------------------------------------------------------------------------------------------------|---------------------------------------------------|
| Version no | currently 2 | 4 bytes |
| **EF marker** | **marker for extended format** | **6 bytes: 0000000000EF** |
| In-counter | positive integer VI = VarInt | 1 - 9 bytes |
| list of inputs | **Extended Format** transaction Input Structure | \<in-counter> qty with variable length per input |
| Out-counter | positive integer VI = VarInt | 1 - 9 bytes |
| list of outputs | Transaction Output Structure | \<out-counter> qty with variable length per output |
| nLocktime | if non-zero and sequence numbers are < 0xFFFFFFFF: block height or timestamp when transaction is final | 4 bytes |

The Extended Format marker allows a library that supports the format to recognize that it is dealing with a transaction in Extended Format, while a library that does not support it will read the transaction as having 0 inputs, 0 outputs and a future nLocktime. This minimizes the problems a legacy library can run into when reading the Extended Format: the data cannot be mistaken for a valid transaction.
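
As a minimal sketch of how a library might recognize the marker, the snippet below checks the six bytes that follow the 4-byte version field. The helper name `isExtendedFormat` and the sample buffers are illustrative assumptions, not part of this proposal or of any existing library.

```javascript
// The 6-byte marker that follows the 4-byte version field (see the table above).
const EF_MARKER = Buffer.from('0000000000ef', 'hex');

// Hypothetical helper: true when the serialized transaction carries the
// EF marker directly after the version field.
function isExtendedFormat(rawTx) {
  if (rawTx.length < 4 + EF_MARKER.length) return false;
  return rawTx.subarray(4, 4 + EF_MARKER.length).equals(EF_MARKER);
}

// Version 2 (little-endian) followed by the marker: Extended Format.
const efTx = Buffer.concat([Buffer.from('02000000', 'hex'), EF_MARKER]);

// Version 2 followed by an in-counter of 1 and zeroed filler bytes:
// the start of a standard-format transaction.
const stdTx = Buffer.concat([Buffer.from('0200000001', 'hex'), Buffer.alloc(36)]);

console.log(isExtendedFormat(efTx));  // true
console.log(isExtendedFormat(stdTx)); // false
```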

Apart from the marker, the input structure is the only thing that changes in the Extended Format. The current input structure looks like this:

| Field | Description | Size |
|---------------------------|---------------------------------------------------------------------------------------------|-------------------------------|
| Previous Transaction hash | TXID of the transaction the output was created in | 32 bytes |
| Previous Txout-index | Index of the output (Non negative integer) | 4 bytes |
| Txin-script length | Non negative integer VI = VarInt | 1 - 9 bytes |
| Txin-script / scriptSig | Script | \<in-script length>-many bytes |
| Sequence_no | Used to iterate inputs inside a payment channel. Input is final when nSequence = 0xFFFFFFFF | 4 bytes |

In the Extended Format, we extend the input structure to include the previous locking script and satoshi outputs:

| Field | Description | Size |
|--------------------------------|---------------------------------------------------------------------------------------------|---------------------------------|
| Previous Transaction hash | TXID of the transaction the output was created in | 32 bytes |
| Previous Txout-index | Index of the output (Non negative integer) | 4 bytes |
| Txin-script length | Non negative integer VI = VarInt | 1 - 9 bytes |
| Txin-script / scriptSig | Script | \<in-script length>-many bytes |
| Sequence_no | Used to iterate inputs inside a payment channel. Input is final when nSequence = 0xFFFFFFFF | 4 bytes |
| **Previous TX satoshi output** | **Value in satoshis of the previous output being spent** | **8 bytes** |
| **Previous TX script length** | **Non negative integer VI = VarInt** | **1 - 9 bytes** |
| **Previous TX locking script** | **Script** | **\<script length>-many bytes** |
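
The field order in the table above can be sketched as a serializer. This is an illustration only: the function and property names are hypothetical, the previous transaction hash is assumed to be supplied already in serialized byte order, and the integer fields are assumed to be little-endian as in the standard transaction format. The scripts in the example are placeholders, not real scripts.

```javascript
// Encode a non-negative integer as a VarInt (1, 3, 5 or 9 bytes).
function encodeVarInt(n) {
  if (n < 0xfd) return Buffer.from([n]);
  if (n <= 0xffff) {
    const b = Buffer.alloc(3);
    b.writeUInt8(0xfd, 0);
    b.writeUInt16LE(n, 1);
    return b;
  }
  if (n <= 0xffffffff) {
    const b = Buffer.alloc(5);
    b.writeUInt8(0xfe, 0);
    b.writeUInt32LE(n, 1);
    return b;
  }
  const b = Buffer.alloc(9);
  b.writeUInt8(0xff, 0);
  b.writeBigUInt64LE(BigInt(n), 1);
  return b;
}

// Serialize one input in Extended Format, field by field as in the table.
function serializeExtendedInput(input) {
  const index = Buffer.alloc(4);
  index.writeUInt32LE(input.prevTxOutIndex, 0);
  const sequence = Buffer.alloc(4);
  sequence.writeUInt32LE(input.sequence, 0);
  const satoshis = Buffer.alloc(8);
  satoshis.writeBigUInt64LE(BigInt(input.prevSatoshis), 0);
  return Buffer.concat([
    input.prevTxHash,                              // 32 bytes
    index,                                         // 4 bytes
    encodeVarInt(input.unlockingScript.length),    // 1 - 9 bytes
    input.unlockingScript,
    sequence,                                      // 4 bytes
    satoshis,                                      // 8 bytes (extended)
    encodeVarInt(input.prevLockingScript.length),  // 1 - 9 bytes (extended)
    input.prevLockingScript,                       // (extended)
  ]);
}

const exampleInput = serializeExtendedInput({
  prevTxHash: Buffer.alloc(32, 0xaa),             // placeholder hash
  prevTxOutIndex: 1,
  unlockingScript: Buffer.from('51', 'hex'),      // placeholder script
  sequence: 0xffffffff,
  prevSatoshis: 1000,
  prevLockingScript: Buffer.from('76a9', 'hex'),  // placeholder, not a full script
});
console.log(exampleInput.length); // 53
```

Compared with a standard input, the only difference is the three trailing fields, so a serializer that already exists for the standard format needs little more than the final three entries.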

## Backward compatibility

The Extended Format is not backwards compatible, but it has been designed so that existing software will not read a transaction in Extended Format as a valid (partial) transaction. A library that does not support the Extended Format will read the Extended Format header (0000000000EF) as an empty transaction with a future nLocktime.
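
To make this concrete, the sketch below reads the first ten bytes of an Extended Format transaction the way a reader that only knows the standard layout would. The function name is hypothetical, and the counters are read as single bytes, which is only valid here because both are below 0xFD.

```javascript
// Hypothetical legacy reader: interprets the bytes using the standard layout
// (version, in-counter, out-counter, nLocktime) with no inputs or outputs.
function readHeaderAsLegacy(rawTx) {
  const version = rawTx.readUInt32LE(0);
  const inCount = rawTx.readUInt8(4);      // first marker byte, 0x00
  const outCount = rawTx.readUInt8(5);     // second marker byte, 0x00
  const nLocktime = rawTx.readUInt32LE(6); // remaining marker bytes 00 00 00 EF
  return { version, inCount, outCount, nLocktime };
}

const efHeader = Buffer.from('02000000' + '0000000000ef', 'hex');
const legacyView = readHeaderAsLegacy(efHeader);
console.log(legacyView);
// { version: 2, inCount: 0, outCount: 0, nLocktime: 4009754624 }
```

The last four marker bytes, read as a little-endian integer, give 0xEF000000 (4009754624). Under the usual locktime convention, values of 500,000,000 and above are treated as Unix timestamps, which places this value far in the future.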

## Implementation

The Extended Format has been implemented in [go-bt](https://github.com/libsv/go-bt) and a standalone JavaScript library [bitcoin-ef](https://github.com/TAAL-GmbH/bitcoin-ef).
Binary file added Bitcoin Arc Architecture - January 2023.pdf
Binary file not shown.
Binary file added Bitcoin_Arc_Architecture_-_January_2023.png
308 changes: 308 additions & 0 deletions README.md
@@ -0,0 +1,308 @@
# ARC
> Transaction processor for Bitcoin
## Overview

ARC is a transaction processor for Bitcoin that keeps track of the life cycle of a transaction as it is processed by
the Bitcoin network. Next to the mining status of a transaction, ARC also keeps track of the various states that a
transaction can be in, such as `ANNOUNCED_TO_NETWORK`, `SENT_TO_NETWORK`, `SEEN_ON_NETWORK`, `MINED`, `REJECTED`, etc.

If a transaction is not `SEEN_ON_NETWORK` within a certain time period (60 seconds by default), ARC will re-send the
transaction to the Bitcoin network. ARC also monitors the Bitcoin network for transaction and block messages, and
will notify the client when a transaction has been mined, or rejected.
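
The re-send rule can be sketched as a small predicate. This is not ARC's implementation: the status names and the 60-second default come from the text above, while the function name, the `lastSentAtMs` field and the choice of which statuses count as settled are assumptions made for illustration.

```javascript
// Default re-send interval from the text above (60 seconds).
const RESEND_AFTER_MS = 60 * 1000;

// Statuses after which no re-send is needed. Treating exactly these
// three as settled is an assumption of this sketch.
const SETTLED = new Set(['SEEN_ON_NETWORK', 'MINED', 'REJECTED']);

// Hypothetical check: should this transaction be sent to the network again?
function needsResend(tx, nowMs, resendAfterMs = RESEND_AFTER_MS) {
  if (SETTLED.has(tx.status)) return false;
  return nowMs - tx.lastSentAtMs >= resendAfterMs;
}

const pendingTx = { status: 'SENT_TO_NETWORK', lastSentAtMs: 0 };
console.log(needsResend(pendingTx, 30000)); // false
console.log(needsResend(pendingTx, 60000)); // true
```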

Unlike other transaction processors, ARC broadcasts all transactions on the P2P network and does not rely on the RPC
interface of a Bitcoin node. This makes it possible for ARC to connect and broadcast to as many nodes as desired.
In the future, ARC will also be able to send transactions using IPv6 multicast, which will make it
possible to connect to a large number of nodes without incurring large bandwidth costs.

ARC consists of four microservices: [API](#API), [Metamorph](#Metamorph), [BlockTx](#BlockTx) and [Callbacker](#Callbacker), which are all described below.

All the microservices are designed to be horizontally scalable, and can be deployed on a single machine or on multiple machines. Each one has been programmed with a store interface and various databases can be used to store data. The default store is sqlite3, but any database that implements the store interface can be used.

![Architecture](./Bitcoin_Arc_Architecture_-_January_2023.png)

### API

API is the REST API microservice for interacting with ARC. See the [API documentation](/arc/api.html) for more information.

The API takes care of authentication, validation, and sending transactions to Metamorph. The API talks to one or more Metamorph instances using client-based, round robin load balancing.
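
Client-based round robin means the caller itself cycles through the known Metamorph addresses instead of relying on an external load balancer. A minimal sketch of the idea follows; the helper name and the addresses are hypothetical, and ARC's own implementation may differ.

```javascript
// Returns a function that hands out the targets in turn, wrapping around
// after the last one.
function createRoundRobin(targets) {
  if (targets.length === 0) throw new Error('at least one target is required');
  let next = 0;
  return function pick() {
    const target = targets[next];
    next = (next + 1) % targets.length;
    return target;
  };
}

// Hypothetical Metamorph addresses.
const pickMetamorph = createRoundRobin(['metamorph-1:8001', 'metamorph-2:8001']);
const picks = [pickMetamorph(), pickMetamorph(), pickMetamorph()];
console.log(picks); // [ 'metamorph-1:8001', 'metamorph-2:8001', 'metamorph-1:8001' ]
```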

### Metamorph

Metamorph is a microservice that is responsible for processing transactions sent by the API to the Bitcoin network. It
takes care of re-sending transactions if they are not acknowledged by the network within a certain time period (60
seconds by default).

Metamorph is designed to be horizontally scalable, with each instance operating independently and having its own
transaction store. As a result, the metamorphs do not communicate with each other and remain unaware of each other's existence.

### BlockTx

BlockTx is a microservice that is responsible for processing blocks mined on the Bitcoin network, and for propagating
the status of transactions to each Metamorph that has subscribed to this service.

The main purpose of BlockTx is to de-duplicate processing of (large) blocks. As an incoming block is processed by BlockTx, each Metamorph is notified of transactions that they have registered an interest in. BlockTx does not store the transaction data, but instead stores only the transaction IDs and the block height in which
they were mined. Metamorph is responsible for storing the transaction data.
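
The division of labor described above can be sketched with two maps: one from transaction ID to block height, and one from transaction ID to the Metamorph instances that registered an interest. The class and method names are hypothetical and the sketch keeps everything in memory, unlike a real store.

```javascript
class BlockTxIndex {
  constructor() {
    this.minedAt = new Map();    // txid -> block height (no transaction data)
    this.interested = new Map(); // txid -> Set of metamorph ids
  }

  // A Metamorph registers its interest in a transaction ID.
  register(metamorphId, txid) {
    if (!this.interested.has(txid)) this.interested.set(txid, new Set());
    this.interested.get(txid).add(metamorphId);
  }

  // Record every txid of the block once, and return, per Metamorph,
  // the registered txids that were mined in it.
  processBlock(height, txids) {
    const notifications = new Map();
    for (const txid of txids) {
      this.minedAt.set(txid, height);
      for (const id of this.interested.get(txid) ?? []) {
        if (!notifications.has(id)) notifications.set(id, []);
        notifications.get(id).push(txid);
      }
    }
    return notifications;
  }
}

const blockTxIndex = new BlockTxIndex();
blockTxIndex.register('metamorph-1', 'tx-a');
blockTxIndex.register('metamorph-2', 'tx-b');
const blockNotifications = blockTxIndex.processBlock(800000, ['tx-a', 'tx-b', 'tx-c']);
console.log(blockNotifications.get('metamorph-1')); // [ 'tx-a' ]
```

The block is walked once regardless of how many Metamorph instances are subscribed, which is the de-duplication the paragraph refers to.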

### Callbacker

Callbacker is a very simple microservice that is responsible for sending callbacks to clients when a transaction has
been accepted by the Bitcoin network. To register a callback, the client must add the `X-CallbackUrl` header to the
request. The callbacker will then send a POST request to the URL specified in the header, with the transaction ID in
the body. See the [API documentation](/arc/api.html) for more information.
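
A request that registers a callback might be assembled as follows. Only the `X-CallbackUrl` header comes from the text above; the `/v1/tx` path, the `text/plain` content type, the host and the callback URL are assumptions for illustration and should be checked against the API documentation.

```javascript
// Hypothetical helper that builds the arguments for fetch() without sending anything.
function buildSubmitRequest(arcBaseUrl, rawTxHex, callbackUrl) {
  return {
    url: `${arcBaseUrl}/v1/tx`, // assumed path
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'text/plain', // assumed body encoding (hex string)
        'X-CallbackUrl': callbackUrl, // header named in the text above
      },
      body: rawTxHex,
    },
  };
}

const submitRequest = buildSubmitRequest(
  'http://localhost:8080',
  '<extended-tx-hex>',
  'https://example.com/arc-callback'
);
// To send it: await fetch(submitRequest.url, submitRequest.options);
```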

## Extended format

For optimal performance, ARC uses a custom format for transactions. This format is called the extended format, and is a
superset of the raw transaction format. The extended format includes the satoshis and scriptPubKey for each input,
which makes it possible for ARC to validate the transaction without having to download the parent transactions. In most
cases the sender already has all the information from the parent transaction, as this is needed to sign the transaction.
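
One check that becomes possible without any lookup is the fee calculation, since the input amounts travel with the transaction. The sketch below assumes a decoded transaction with hypothetical `prevSatoshis` and `satoshis` fields; it is not ARC's validator.

```javascript
// Fee = sum of the input amounts (carried in the extended format)
// minus the sum of the output amounts. BigInt avoids precision loss.
function computeFee(inputs, outputs) {
  const totalIn = inputs.reduce((sum, i) => sum + BigInt(i.prevSatoshis), 0n);
  const totalOut = outputs.reduce((sum, o) => sum + BigInt(o.satoshis), 0n);
  if (totalOut > totalIn) throw new Error('outputs exceed inputs');
  return totalIn - totalOut;
}

const exampleFee = computeFee(
  [{ prevSatoshis: 1000 }, { prevSatoshis: 500 }],
  [{ satoshis: 1400 }]
);
console.log(exampleFee); // 100n
```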

The only check that cannot be done on a transaction in the extended format is the check for double spends. This can
only be done by downloading the parent transactions, or by querying a utxo store. A robust utxo store is still in
development and will be added to ARC when it is ready. At this moment, the utxo check is performed in the Bitcoin
node when a transaction is sent to the network.

With the successful adoption of Bitcoin ARC, this format should establish itself as the new standard of interchange
between wallets and non-mining nodes on the network.

The extended format has been described in detail in [BIP-239](BIP-239).

The following diagrams show the difference between validating a transaction in the standard and extended format:

```plantuml
@startuml
hide footbox
skinparam ParticipantPadding 15
skinparam BoxPadding 100
actor "client" as tx
box ARC
participant api
participant validator
participant metamorph
database "bitcoin" as bsv
end box
title Submit transaction (standard format)
tx -> tx: create tx
tx -> tx: <font color=red><b>add utxos</b></font>
tx -> tx: add outputs
tx -> tx: sign tx
tx -> api ++: raw tx (standard)
loop for each input
api -> bsv ++: <font color=red><b>get utxos (RPC)</b></font>
return previous tx <i>or Missing Inputs</i>
end
api -> validator ++: validate tx
return ok
api -> metamorph ++: send tx
metamorph -> bsv
return status
return status
@enduml
```

```plantuml
@startuml
hide footbox
skinparam ParticipantPadding 15
skinparam BoxPadding 100
actor "client" as tx
box ARC
participant api
participant validator
participant metamorph
database "bitcoin" as bsv
end box
title Submit transaction (extended format)
tx -> tx: create tx
tx -> tx: add utxos
tx -> tx: add outputs
tx -> tx: sign tx
tx -> api ++: raw tx (extended)
api -> validator ++: validate tx
return ok
api -> metamorph ++: send tx
metamorph -> bsv
return status
return status
@enduml
```

As you can see, the extended format is much more efficient, as it does not require any RPC calls to the Bitcoin node.

This validation takes place in the ARC API microservice. The actual utxos are left to be checked by the Bitcoin node
itself, as it would do anyway, regardless of where the transaction is coming from. With this process flow we save
the node from having to look up and send the input utxos to the ARC API, which could be slow under heavy load.

## Settings

The settings available for running ARC are managed by [viper](https://github.com/spf13/viper). By default, the settings are defined in `config.yaml`.

## ARC stats

`gocore` keeps real-time stats about the metamorph servers, which can be viewed at `/stats` (e.g. `http://localhost:8011/stats`).
These stats show aggregated information about a metamorph server, such as the number of transactions processed, the number of
transactions sent to the Bitcoin network, etc. It also shows the average time it takes for each step in the process.

More detailed statistics are available at `/pstats` (e.g. `http://localhost:8011/pstats`). These stats show information
about the internal metamorph processor. The processor stats also allow you to see details for a single transaction. If
a transaction has already been mined and evicted from the processor memory, you can still see the stored stats
retrieved from the data store, and potentially the timing stats, if they are found in the log file.

ARC can also expose a Prometheus endpoint that can be used to monitor the metamorph servers. Set the `prometheusEndpoint`
setting in the settings file to activate Prometheus. Normally you would want to set this to `/metrics`.

## Client Libraries

### JavaScript

A TypeScript library is available in the [arc-client](https://github.com/bitcoin-sv/arc-client-js) repository.

Example usage:

```javascript
import { ArcClient } from '@bitcoin-a/arc-client';

const arcClient = new ArcClient({
  host: 'localhost',
  port: 8080,
  authorization: '<api-key>'
});

const txid = 'd4b0e1b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0c0b0';
const result = await arcClient.getTransactionStatus(txid);
```

See the repository for more information.

## Process flow diagrams

```plantuml
@startuml
hide footbox
skinparam ParticipantPadding 15
skinparam BoxPadding 10
actor "client" as tx
box api server
participant handler
participant auth
participant validator
end box
box metamorph
participant grpc
participant worker
database store
participant "peer\nhandler" as peer
end box
database "bitcoin\nnetwork" as bsv
title Submit transaction via P2P
tx -> handler ++: extended\nraw tx
handler -> auth ++: apikey
return
handler -> validator ++: tx
return success
handler -> grpc ++: tx
grpc -> worker ++: tx
worker -> store++: register txid
worker -> store: tx
return STORED
worker --> grpc: STORED
worker -> peer: txid
peer -> bsv: INV txid
worker -> store: ANNOUNCED
worker --> grpc: ANNOUNCED
bsv -> peer++: GETDATA txid
peer -> store ++ : get tx
store -> worker : SENT
return raw tx
worker -> store: SENT
worker --> grpc: SENT
return tx
bsv -> peer: INV txid
peer -> worker: SEEN
worker -> store: SEEN
return status
grpc -> grpc: wait for SENT\nor TIMEOUT
return last status
@enduml
```

```plantuml
@startuml
hide footbox
skinparam ParticipantPadding 15
skinparam BoxPadding 10
box metamorph
participant grpc
participant worker
database store
participant "peer\nhandler" as mpeer
end box
box blocktx
participant "worker" as blocktx
database blockstore
participant "peer\nhandler" as peer
end box
database "bitcoin\nnetwork" as bsv
title Process block via P2P
bsv -> peer++: BLOCK blockhash
peer -> blocktx++: blockhash
blocktx -> peer: get block
peer -> bsv: GETDATA blockhash
bsv -> peer: BLOCK block
peer -> blocktx--: block
blocktx -> blockstore: block
blocktx -> blockstore: txids
blocktx -> worker++: blockhash
worker -> blocktx: get txs in block
blockstore -> blocktx: txids
blocktx -> worker--: txids
worker -> store: mark txs mined
@enduml
```
Binary file added README.pdf
Binary file not shown.