Skip to content

Latest commit

 

History

History
417 lines (340 loc) · 24.5 KB

13_wire_protocol.asciidoc

File metadata and controls

417 lines (340 loc) · 24.5 KB

Wire Protocol: Framing & Extensibility

In this chapter, we dive into the wire protocol of the Lightning network, and also cover all the various extensibility levers that have been built into the protocol. By the end of this chapter, and aspiring reader should be able to write their very own wire protocol parser for the Lighting Network. In addition to being able to write a custom wire protocol parser, a reader of this chapter will gain a deep understanding with respect of the various upgrade mechanisms that have been built into the protocol.

Messaging Layer in the Lightning protocol suite

The messaging layer, which is detailed in this chapter, consists of Message Framing and Format, Type Length Value (TLV) encoding, and Feature Bits. These components are highlighted by a double outline in the protocol suite, shown in The Lightning Network Protocol Suite:

The Lightning Network Protocol Suite
Figure 1. The Lightning Network Protocol Suite

Wire Framing

We begin by describing the high level structure of the wire framing within the protocol. When we say framing, we mean the way that the bytes are packed on the wire to encode a particular protocol message. Without knowledge of the framing system used in the protocol, a string of bytes on the wire would resemble a series of random bytes as no structure has been imposed. By applying proper framing to decode these bytes on the wire, we’ll be able to extract structure and finally parse this structure into protocol messages within our higher-level language.

It’s important to note that as the Lightning Network is an end to end encrypted protocol, and the wire framing is itself encapsulated within an encrypted message transport layer. As we see in [encrypted_message_transport] the Lighting Network uses a custom variant of the Noise protocol to handle transport encryption. Within this chapter, whenever we give an example of wire framing, we assume the encryption layer has already been stripped away (when decoding), or that we haven’t yet encrypted the set of bytes before we send them on the wire (encoding).

High-Level Wire Framing

With that said, we’re ready to describe the high-level schema used to encode messages on the wire:

  • Messages on the wire begin with a 2 byte type field, followed by a message payload.

  • The message payload itself, can be up to 65 KB in size.

  • All integers are encoded in big-endian (network order).

  • Any bytes that follow after a defined message can be safely ignored.

Yep, that’s it. As the protocol relies on an encapsulating transport protocol encryption layer, we don’t need an explicit length for each message type. This is due to the fact that transport encryption works at the message level, so by the time we’re ready to decode the next message, we already know the total number of bytes of the message itself. Using 2 bytes for the message type (encoded in big-endian) means that the protocol can have up to 2^16 - 1 or 65535 distinct messages. Continuing, as we know all messages MUST be below 65KB, this simplifies our parsing as we can use a fixed sized buffer, and maintain strong bounds on the total amount of memory required to parse an incoming wire message.

The final bullet point allows for a degree of backwards compatibility, as new nodes are able to provide information in the wire messages that older nodes (which may not understand them can safely ignore). As we see below, this feature combined with a very flexible wire message extensibility format also allows the protocol to achieve forwards compatibility as well.

Type Encoding

With this high level background provided, we now start at the most primitive layer: parsing primitive types. In addition to encoding integers, the Lightning Protocol also allows for encoding of a vast array of types including: variable length byte slices, elliptic curve public keys, Bitcoin addresses, and signatures. When we describe the structure of wire messages later in this chapter, we refer to the high-level type (the abstract type) rather than the lower level representation of said type. In this section, we peel back this abstraction layer to ensure our future wire parser is able to properly encoding/decode any of the higher level types.

In High-level message types, we map the name of a given message type to the high-level routine used to encode/decode the type.

Table 1. High-level message types
High Level Type Framing Comment

node_alias

A 32-byte fixed-length byte slice.

When decoding, reject if contents are not a valid UTF-8 string.

channel_id

A 32-byte fixed-length byte slice that maps an outpoint to a 32 byte value.

Given an outpoint, one can convert it to a channel_id by taking the txid of the outpoint and XOR’ing it with the index (interpreted as the lower 2 bytes).

short_chan_id

An unsigned 64-bit integer (uint64)

Composed of the block height (24 bits), transaction index (24 bits), and output index (16 bits) packed into 8 bytes.

milli_satoshi

An unsigned 64-bit integer (uint64)

Represents 1000th of a satoshi.

satoshi

An unsigned 64-bit integer (uint64)

The based unit of bitcoin.

satoshi

An unsigned 64-bit integer (uint64)

The based unit of bitcoin.

pubkey

An secp256k1 public key encoded in compressed format, occupying 33 bytes.

Occupies a fixed 33-byte length on the wire.

sig

An ECDSA signature of the secp256k1 Elliptic Curve.

Encoded as a fixed 64-byte byte slice, packed as R || S

uint8

An 8-bit integer.

uint16

A 16-bit integer.

uint64

A 64-bit integer.

[]byte

A variable length byte slice.

Prefixed with a 16-bit integer denoting the length of the bytes.

color_rgb

RGB color encoding.

Encoded as a series if 8-bit integers.

net_addr

The encoding of a network address.

Encoded with a 1 byte prefix that denotes the type of address, followed by the address body.

In the next section, we describe the structure of each of the wire messages including the prefix type of the message along with the contents of its message body.

Type Length Value (TLV) Message Extensions

Earlier in this chapter we mentioned that messages can be up to 65 KB in size, and if while parsing a messages, extra bytes are left over, then those bytes are to be ignored. At an initial glance, this requirement may appear to be somewhat arbitrary, however this requirement allows for de-coupled de-synchronized evolution of the Lighting Protocol itself. We discuss this more towards the end of the chapter. But first, we turn our attention to exactly what those "extra bytes" at the end of a message can be used for.

The Protocol Buffer Message Format

The Protocol Buffer (Protobuf) message serialization format started out as an internal format used at Google, and has blossomed into one of the most popular message serialization formats used by developers globally. The Protobuf format describes how a message (usually some sort of data structure related to an API) is encoded on the wire and decoded on the other end. Several "Protobuf compilers" exists in dozens of languages which act as a bridge that allows any language to encode a Protobuf that will be able to decode by a compliant decode in another language. Such cross language data structure compatibility allows for a wide range of innovation, because it’s possible to transmit structure and even typed data structures across language and abstraction boundaries.

Protobufs are also known for their flexibility with respect to how they handle changes in the underlying messages structure. As long as the field numbering schema is adhered to, then it’s possible for a newer write of Protobufs to include information within a Protobuf that may be unknown to any older readers. When the old reader encounters the new serialized format, if there’re types/fields that it doesn’t understand, then it simply ignores them. This allows old clients and new clients to co-exist, as all clients can parse some portion of the newer message format.

Forwards & Backwards Compatibility

Protobufs are extremely popular amongst developers as they have built in support for both forwards and backwards compatibility. Most developers are likely familiar with the concept of backwards computability. In simple terms, the principles states that any changes to a message format or API should be done in a manner that doesn’t break support for older clients. Within our Protobuf extensibility examples above, backwards computability is achieved by ensuring that new additions to the proto format don’t break the known portions of older readers. Forwards computability on the other hand is just as important for de-synchronized updates however it’s less commonly known. For a change to be forwards compatible, then clients are to simply ignore any information they don’t understand. The soft for mechanism of upgrading the Bitcoin consensus system can be said to be both forwards and backwards compatible: any clients that don’t update can still use Bitcoin, and if they encounters any transactions they don’t understand, then they simply ignore them as their funds aren’t using those new features.

Type-Length-Value (TLV) Format

In order to be able to upgrade messages in both a forwards and backwards compatible manner, in addition to feature bits (more on that later), the LN utilizes a custom message serialization format plainly called: Type Length Value, or TLV for short. The format was inspired by the widely used Protobuf format and borrows many concepts by significantly simplifying the implementation as well as the software that interacts with message parsing. A curious reader might ask "why not just use Protobufs"? In response, the Lighting developers would respond that we’re able to have the best of the extensibility of Protobufs while also having the benefit of a smaller implementation and thus smaller attack. As of version v3.15.6, the Protobuf compiler weighs in at over 656,671 lines of code. In comparison LND’s implementation of the TLV message format weighs in at only 2.3k lines of code (including tests).

With the necessary background presented, we’re now ready to describe the TLV format in detail. A TLV message extension is said to be a stream of individual TLV records. A single TLV record has three components: the type of the record, the length of the record, and finally the opaque value of the record:

  • type: An integer representing the name of the record being encoded.

  • length: The length of the record.

  • value: The opaque value of the record.

Both the type and length are encoded using a variable sized integer that’s inspired by the variable sized integer (varint) used in Bitcoin’s p2p protocol, called BigSize for short.

BigSize Integer Encoding

In its fullest form, a BigSize integer can represent value up to 64-bits. In contrast to Bitcoin’s varint format, the BigSize format instead encodes integers using a big-endian byte ordering.

The BigSize varint has the components: the discriminant and the body. In the context of the BigSize integer, the discriminant communicates to the decoder the size of the variable size integer that follows. Remember that the unique thing about variable sized integers is that they allow a parser to use fewer bytes to encode smaller integers than larger ones, saving space. Encoding of a BigSize integer follows one of the four following options:

  1. If the value is less than 0xfd (253): Then the discriminant isn’t really used, and the encoding is simply the integer itself. This allows us to encode very small integers with no additional overhead.

  2. If the value is less than or equal to 0xffff (65535):The discriminant is encoded as 0xfd, which indicates that the value that follows is larger than 0xfd, but smaller than 0xffff). The number is then encoded as a 16-bit integer. Including the discriminant, then we can encode a value that is greater than 253, but less than 65535 using 3 bytes.

  3. If the value is less than 0xffffffff (4294967295): The discriminant is encoded as 0xfe. The body is encoded using 32-bit integer, Including the discriminant, then we can encode a value that’s less than 4,294,967,295 using 5 bytes.

  4. Otherwise, we just encode the value as a full-size 64-bit integer.

TLV encoding constraints

Within the context of a TLV message, record types below 2^16 are said to be reserved for future use. Types beyond this range are to be used for "custom" message extensions used by higher-level application protocols.

The value of a record depends on the type. In other words, it can take any form as parsers will attempt to interpret it depending on the context of the type itself.

TLV canonical encoding

One issue with the Protobuf format is that encodings of the same message may output an entirely different set of bytes when encoded by two different versions of the compiler. Such instances of a non-canonical encoding are not acceptable within the context of Lighting, as many messages contain a signature of the message digest. If it’s possible for a message to be encoded in two different ways, then it would be possible to break the authentication of a signature inadvertently by re-encoding a message using a slightly different set of bytes on the wire.

In order to ensure that all encoded messages are canonical, the following constraints are defined when encoding:

  • All records within a TLV stream MUST be encoded in order of strictly increasing type.

  • All records must minimally encode the type and length fields. In other words, the smallest BigSize representation for an integer MUST be used at all times.

  • Each type may only appear once within a given TLV stream.

In addition to these encoding constraints, a series of higher-level interpretation requirements are also defined based on the arity of a given type integer. we dive further into these details towards the end of the chapter once we describe how the Lighting Protocol is upgraded in practice and in theory.

Feature Bits & Protocol Extensibility

As the Lighting Network is a decentralized system, no single entity can enforce a protocol change or modification upon all the users of the system. This characteristic is also seen in other decentralized networks such as Bitcoin. However, unlike Bitcoin overwhelming consensus is not required to change a subset of the Lightning Network. Lighting is able to evolve at will without a strong requirement of coordination, as unlike Bitcoin, there is no global consensus required in the Lightning Network. Due to this fact and the several upgrade mechanisms embedded in the Lighting Network, only the participants that wish to use these new Lighting Network features need to upgrade, and then they are able to interact with each other.

In this section, we explore the various ways that developers and users are able to design and deploy new features to the Lightning Network. The designers of the original Lightning Network knew that there were many possible future directions for the network and the underlying protocol. As a result, they made sure to implement several extensibility mechanisms within the system, which can be used to upgrade it partially or fully in a decoupled, desynchronized, and decentralized manner.

Feature Bits as an Upgrade Discoverability Mechanism

An astute reader may have noticed the various locations that "feature bits" are included within the Lightning Protocol. A "feature bit" is a bitfield that can be used to advertise understanding or adherence to a possible network protocol update. Feature bits are commonly assigned in pairs, meaning that each potential new feature/upgrade always defines two bits within the bitfield. One bit signals that the advertised feature is optional, meaning that the node knows about the feature and can use it, but doesn’t consider it required for normal operation. The other bit signals that the feature is instead required, meaning that the node will not continue operation if a prospective peer doesn’t understand that feature.

Using these two bits (optional and required), we can construct a simple compatibility matrix that nodes/users can consult in order to determine if a peer is compatible with a desired feature:

Table 2. Feature Bit Compatibility Matrix
Bit Type Remote Optional Remote Required Remote Unknown

Local Optional

Local Required

Local Unknown

From this simplified compatibility matrix, we can see that as long as the other party knows about our feature bit, then we can interact with them using the protocol. If the party doesn’t even know about what bit we’re referring to and they require the feature, then we are incompatible with them. Within the network, optional features are signaled using an odd bit number while required feature are signaled using an even bit number. As an example, if a peer signals that they known of a feature that uses bit 15, then we know that this is an optional feature, and we can interact with them or respond to their messages even if we don’t know about the feature. If they instead signaled the feature using bit 16, then we know this is a required feature, and we can’t interact with them unless our node also understands that feature.

The Lighting developers have come up with an easy to remember phrase that encodes this matrix: "it’s OK to be odd". This simple rule allows for a rich set of interactions within the protocol, as a simple bitmask operation between two feature bit vectors allows peers to determine if certain interactions are compatible with each other or not. In other words, feature bits are used as an upgrade discoverability mechanism: they easily allow to peers to understand if they are compatible or not based on the concepts of optional, required, and unknown feature bits.

Feature bits are found in the: node_announcement, channel_announcement, and init messages within the protocol. As a result, these three messages can be used to signal the knowledge and/or understanding of in-flight protocol updates within the network. The feature bits found in the node_announcement message can allow a peer to determine if their connections are compatible or not. The feature bits within the channel_announcement messages allows a peer to determine if a given payment type or HTLC can transit through a given peer or not. The feature bits within the init message allow peers to understand if they can maintain a connection, and also which features are negotiated for the lifetime of a given connection.

TLV for forwards & backwards compatibility

As we learned earlier in the chapter, Type-Length-Value, or TLV records can be used to extend messages in a forwards and backwards compatible manner. Overtime, these records have been used to extend existing messages without breaking the protocol by utilizing the "undefined" area within a message beyond that set of known bytes.

As an example, the original Lighting Protocol didn’t have a concept of the "largest amount HTLC" that could traverse through a channel as dictated by a routing policy. Later on, the max_htlc field was added to the channel_update message to phase-in this concept over time. Peers that received a channel_update that set such a field but didn’t even know the upgrade existed where unaffected by the change, but have their HTLCs rejected if they are beyond the limit. Newer peers, on the other hand, are able to parse, verify, and utilize the new field.

Those familiar with the concept of soft-forks in Bitcoin may now see some similarities between the two mechanisms. Unlike Bitcoin consensus-level soft-forks, upgrades to the Lighting Network don’t require overwhelming consensus in order to adopt. Instead, at minimum, only two peers within the network need to understand a new upgrade in order to start using it. Commonly these two peers may be the recipient and sender of a payment, or may be the channel partners of a new payment channel.

A taxonomy of upgrade mechanisms

Rather than there being a single widely utilized upgrade mechanism within the network (such as soft-forks for Bitcoin), there exist several possible upgrade mechanisms within the Lighting Network. In this section, we enumerate these upgrade mechanisms, and provide a real-world example of their use in the past.

Internal Network Upgrades

We start with the upgrade type that requires the most protocol-level coordination: internal network upgrades. An internal network upgrade is characterized by one that requires every single node within a prospective payment path to understand the new feature. Such an upgrade is similar to any upgrade within the internet that requires hardware-level upgrades within the core-relay portion of the upgrade. In the context of LN however, we deal with pure software, so such upgrades are easier to deploy, yet they still require much more coordination than any other upgrade mechanism in the network.

One example of such an upgrade within the network was the introduction of a TLV encoding for the routing information encoded within the onion packets. The prior format used a hard-coded fixed-length message format to communicate information such as the next hop. As this format was fixed it meant that new protocol-level upgrades weren’t possible. The move to the more flexible TLV format meant that after this upgrade, any sort of feature that modified the type of information communicated at each hop could be rolled out at will.

It’s worth mentioning that the TLV onion upgrade was a sort of "soft" internal network upgrade, in that if a payment wasn’t using any new feature beyond that new routing information encoding, then a payment could be transmitted using a mixed set of nodes.

End-to-End Upgrades

To contrast the internal network upgrade, in this section we describe the end to end network upgrade. This upgrade mechanism differs from the internal network upgrade in that it only requires the "ends" of the payment, the sender and recipient to upgrade.

This type of upgrade allows for a wide array of unrestricted innovation within the network. Because of the onion encrypted nature of payments within the network, those forwarding HTLCs within the center of the network may not even know that new features are being utilized.

One example of an end-to-end upgrade within the network was the roll-out of multi-part payments (MPP). MPP is a protocol-level feature that enables a single payment to be split into multiple parts or paths, to be assembled at the recipient for settlement. The roll out our MPP was coupled with a new node_announcement level feature bit that indicates that the recipient knows how to handle partial payments. Assuming a sender and recipient know about each other (possibly via a BOLT 11 invoice), then they’re able to use the new feature without any further negotiation.

Another example of an end-to-end upgrade are the various types of "spontaneous" payments deployed within the network. One early type of spontaneous payments called keysend worked by simply placing the pre-image of a payment within the encrypted onion. Upon receipt, the destination would decrypt the pre-image, then use that to settle the payment. As the entire packet is end-to-end encrypted, this payment type was safe, since none of the intermediate nodes are able to fully unwrap the onion to uncover the payment pre-image.

Channel Construction Level Updates

The final broad category of updates are those that happen at the channel construction level, but which don’t modify the structure of the HTLC used widely within the network. When we say channel construction, we mean how the channel is funded or created. As an example, the eltoo channel type can be rolled out within the network using a new node_announcement level feature bit as well as a channel_announcement level feature bit. Only the two peers on the sides of the channels needs to understand and advertise these new features. This channel pair can then be used to forward any payment type granted the channel supports it.

Another is the "anchor outputs" channel format which allows the commitment fee to be bumped via Bitcoin’s Child-Pays-For-Parent (CPFP) fee management mechanism.

Conclusion

Lightning’s wire protocol is incredibly flexible and allows for rapid innovation and interoperability without strict consensus. It is one of the reasons that the Lightning Network is experiencing much faster development and is attractive to many developers, who might otherwise find Bitcoin’s development style too conservative and slow.