Authenticated Attributes is built on top of Hyperbee, a key-value store with BitTorrent-esque replication. Authenticated Attributes is also a key-value store, but every value is signed, timestamped, and optionally encrypted. Our schema is designed around CIDs and making "attestations about media", rather than just generic key-value storage.
The key consists of a type prefix, the CID of the asset, and the name of the attestation, separated by slashes. An example key for an attestation, which uses the att prefix, is:
att/bafkreif7gtpfl7dwi5nflge2rsfp6vq6q5kkwfm7uvxyyezxhsnde5ly3y/description
The value is described below.
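As a minimal sketch of the key layout described above, the following helpers build a key and split it back into its parts. The names makeKey and parseKey are hypothetical and are not part of the library; only the type/CID/name layout comes from the text. Keeping any further slashes inside the attestation name is an assumption of this sketch.

```javascript
// Hypothetical helpers illustrating the key layout: <type>/<CID>/<attribute>.
// These names are illustrative only and are not part of the library's API.
function makeKey(type, cid, attribute) {
  return `${type}/${cid}/${attribute}`;
}

function parseKey(key) {
  // Split on the first two slashes only (an assumption of this sketch),
  // so anything after the second slash is kept as the attribute name.
  const first = key.indexOf('/');
  const second = key.indexOf('/', first + 1);
  if (first === -1 || second === -1) {
    throw new Error(`Malformed key: ${key}`);
  }
  return {
    type: key.slice(0, first),
    cid: key.slice(first + 1, second),
    attribute: key.slice(second + 1),
  };
}

const key = makeKey(
  'att',
  'bafkreif7gtpfl7dwi5nflge2rsfp6vq6q5kkwfm7uvxyyezxhsnde5ly3y',
  'description'
);
console.log(key);
console.log(parseKey(key));
```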
Database entries are stored as binary data, encoded with DAG-CBOR. This is like CBOR, but has canonical encoding and native support for CIDs. If you don't know CBOR, it's like JSON but binary. This allows for easy storage of binary data alongside any other types.
{
  version: "1.0",
  signature: {
    pubKey: Uint8Array(32),
    sig: Uint8Array(64),
    // CID of "attestation" object
    msg: CID(bafyreietqpflteqz6kj7lmdqz76kzkwdo65o4bhivxrmqvha7pdgixxos4)
  },
  timestamp: {
    ots: { // OpenTimestamps
      proof: Uint8Array(503),
      upgraded: false,
      // CID of signature and attestation objects together in a map:
      // {signature, attestation}
      msg: CID(bafyreialprnoiwl25t37feen7wbkwwr4l5bpnokjydkog3mhiuodi2av6m)
    }
    // Possible other timestamp formats in the future
  },
  attestation: {
    // CID of asset file, same CID as in the database key
    CID: CID(bafkreif7gtpfl7dwi5nflge2rsfp6vq6q5kkwfm7uvxyyezxhsnde5ly3y),
    value: 'Web archive foo bar',
    attribute: 'description',
    encrypted: false,
    timestamp: '2023-05-29T19:03:28.601Z'
  }
}
The binary data of timestamp.ots.proof does not have a specified size. The size shown above is just an example and may vary.
When CID(...) is shown, it represents a CID stored natively, not as text. The DAG-CBOR encoding makes this possible. It also lets us compute the CID of things that are not files, such as particular DAG-CBOR objects, which is what allows CIDs to be used for signature.msg and timestamp.ots.msg.
Some information already in the database key is repeated in the attestation, such as CID and attribute. This allows the whole object to be exported for external verification and use elsewhere.
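Because the key fields are repeated inside the value, an exported entry can be checked against the key it was stored under. The following is a minimal sketch of such a check. The function name matchesKey is hypothetical and not part of the library, and a plain string stands in for the native CID object.

```javascript
// Hypothetical check, not part of the library: confirm that an entry's
// attestation agrees with the database key it was stored under.
// Assumption: the CID is compared in its string form.
function matchesKey(key, entry) {
  const [type, cid, ...rest] = key.split('/');
  const attribute = rest.join('/');
  return (
    type === 'att' &&
    String(entry.attestation.CID) === cid &&
    entry.attestation.attribute === attribute
  );
}

const exampleEntry = {
  attestation: {
    // A plain string stands in for the native CID object in this sketch.
    CID: 'bafkreif7gtpfl7dwi5nflge2rsfp6vq6q5kkwfm7uvxyyezxhsnde5ly3y',
    value: 'Web archive foo bar',
    attribute: 'description',
    encrypted: false,
    timestamp: '2023-05-29T19:03:28.601Z',
  },
};

const exampleKey =
  'att/bafkreif7gtpfl7dwi5nflge2rsfp6vq6q5kkwfm7uvxyyezxhsnde5ly3y/description';
console.log(matchesKey(exampleKey, exampleEntry));
```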
When the attestation is encrypted, the schema looks very similar to the above. The only changes are that attestation.encrypted is true and attestation.value is always binary data. That binary data, once decrypted, is a DAG-CBOR encoding of whatever the original value was: object, binary data, string, integer, etc.
Currently only a version of 1.0 is supported. In the past, debug databases had no version field, and that is considered equivalent to 1.0. Future non-breaking changes will only update the minor version (the number after the dot).
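A reader of the database might apply the versioning rules above roughly as follows. This is only a sketch: the helper name isSupportedVersion is hypothetical, and accepting any minor version under major version 1 is an assumption drawn from minor changes being described as non-breaking.

```javascript
const SUPPORTED_MAJOR = 1;

// Hypothetical helper, not part of the library.
function isSupportedVersion(entry) {
  // Older debug databases had no version field; treat that as "1.0".
  const version = entry.version ?? '1.0';
  const match = /^(\d+)\.(\d+)$/.exec(version);
  if (!match) return false;
  // Assumption: any minor version under the supported major is readable.
  return Number(match[1]) === SUPPORTED_MAJOR;
}

console.log(isSupportedVersion({ version: '1.0' }));
console.log(isSupportedVersion({}));
console.log(isSupportedVersion({ version: '2.0' }));
```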
For more information on specific kinds of attestation, or other types of key-value pairs stored in the database, please see schema.md.
Every attestation stored in the database is signed with an ed25519 keypair. The private key can be loaded from a PEM file, such as those generated by openssl, or directly from a 32-byte Buffer.
An ed25519 private key can be generated with the command openssl genpkey -algorithm ED25519.
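To illustrate the sizes in the schema (a 32-byte pubKey and a 64-byte sig), here is a minimal sketch of ed25519 signing using only Node's built-in crypto module. It does not use this library's own functions, and the message bytes are a placeholder: exactly which bytes the library signs is not shown here and is an assumption of the example.

```javascript
const crypto = require('node:crypto');

// Generate a fresh ed25519 keypair in-process.
const { privateKey, publicKey } = crypto.generateKeyPairSync('ed25519');

// Export as PKCS#8 PEM text, the container format openssl genpkey writes.
const pem = privateKey.export({ type: 'pkcs8', format: 'pem' });

// Load the private key back from PEM, as one would from a file on disk.
const loadedKey = crypto.createPrivateKey(pem);

// Placeholder message; the bytes the library actually signs are an assumption here.
const message = Buffer.from('example attestation bytes');

// For ed25519 the digest algorithm argument must be null.
const sig = crypto.sign(null, message, loadedKey);

// The raw public key is carried in the JWK "x" field, base64url encoded.
const rawPubKey = Buffer.from(publicKey.export({ format: 'jwk' }).x, 'base64url');

console.log('signature bytes:', sig.length);
console.log('public key bytes:', rawPubKey.length);
console.log('verified:', crypto.verify(null, message, publicKey, sig));
```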
Attestations can optionally be encrypted on a per-attestation basis. Symmetric encryption is used, so a single secret key needs to be generated for encryption. This can just be a Buffer of 32 random bytes.
The NaCl API is used, so the specific encryption algorithm is xsalsa20-poly1305. The nonce is prepended before storing.
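The nonce-prepending layout can be sketched with Node's built-in crypto module. Note the swap: Node's built-in module does not provide xsalsa20-poly1305, so this example uses chacha20-poly1305 with a 12-byte nonce purely as a stand-in, whereas NaCl's secretbox uses a 24-byte nonce. The function names and the exact nonce, tag, ciphertext layout are assumptions for illustration and do not describe the library's stored format.

```javascript
const crypto = require('node:crypto');

const NONCE_LEN = 12; // stand-in cipher; NaCl secretbox uses 24-byte nonces
const TAG_LEN = 16;   // Poly1305 authentication tag

// Hypothetical helper: encrypt and prepend the nonce to the stored bytes.
function encrypt(plaintext, key) {
  const nonce = crypto.randomBytes(NONCE_LEN);
  const cipher = crypto.createCipheriv('chacha20-poly1305', key, nonce, {
    authTagLength: TAG_LEN,
  });
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  // Layout used by this sketch: nonce || tag || ciphertext
  return Buffer.concat([nonce, cipher.getAuthTag(), ciphertext]);
}

// Hypothetical helper: split off the nonce and tag, then decrypt.
function decrypt(stored, key) {
  const nonce = stored.subarray(0, NONCE_LEN);
  const tag = stored.subarray(NONCE_LEN, NONCE_LEN + TAG_LEN);
  const ciphertext = stored.subarray(NONCE_LEN + TAG_LEN);
  const decipher = crypto.createDecipheriv('chacha20-poly1305', key, nonce, {
    authTagLength: TAG_LEN,
  });
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// The secret key is just 32 random bytes.
const secretKey = crypto.randomBytes(32);
const plaintext = Buffer.from('Web archive foo bar');
const stored = encrypt(plaintext, secretKey);
console.log(decrypt(stored, secretKey).toString());
```

Storing the nonce alongside the ciphertext means only the secret key has to be kept elsewhere in order to decrypt a value later.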
Attestations are timestamped with OpenTimestamps. This requires Internet access and takes about one second to finish. At first only the incomplete proof is stored (indicated by timestamp.ots.upgraded being false), but the proof can be upgraded at a later date.
The timestamp serves to prove that the attestation was not made after attestation.timestamp, within the error bars of several hours afforded by the system. In practice, this means attestation.timestamp is provably accurate to within about a day.
If you trust the signer, you can ignore the proof and rely on attestation.timestamp alone, making it accurate to about a second.
Modern trusted timestamping methods usually fall into two groups. Centralized timestamping requires trusting a central authority that will sign your data (or a hash) with a timestamp. Decentralized timestamping requires inserting your data (or a hash) into a timestamped widely-copied database, such as a blockchain or a printed newspaper.
This repo currently uses decentralized timestamping via OpenTimestamps, which uses the Bitcoin blockchain and could support other blockchains in the future.
There is an existing standard for centralized timestamping, RFC 3161, but it isn't used here due to the large proof size that would need to be stored for each attestation. See this issue for more details.
There are two ways of accessing the database: using our Node.js library on local files, or over HTTP. In an ideal world, prospective readers would take advantage of Hyperbee's sparse downloading and partially clone the database over the P2P network, then start making queries using our code. In practice, the HTTP API is likely to see more use, as it's simpler and works in browsers. It is also the only way to do remote writes.
Public functions are documented in lib.md. You can see some example usage in files like demo.js, demo-get.js, and siblings. The actual library source code is easy to read and is all in the src folder.
Please see the documentation for the HTTP API in http.md. The source code for the server is available in server.js.