chore(backup): serialize the backup header [WPB-10575] #3159

35 changes: 33 additions & 2 deletions backup/README.md
@@ -6,7 +6,7 @@ Its purpose is to create a common implementation to be used by iOS, Web, and And
## Capabilities

> [!TIP]
> The backup blob/file is referred to in this document as the **backup artifact**, or simply
> the **artifact**.
> The clients (iOS, Web, and Android) are referred to as **callers**.

Expand Down Expand Up @@ -54,4 +54,35 @@ artifacts (not backup artifacts) using the following Gradle tasks:
- iOS: `./gradlew :backup:assembleBackupDebugXCFramework`
- Web: `./gradlew :backup:jsBrowserDevelopmentLibraryDistribution`

**Output:** the results will be in the `backup/build` directory. iOS needs the whole
`backup.xcframework` directory; Web/JS needs the whole directory that contains `package.json`.

--------

# The artifact format

The table below describes the format of the backup file.
The first 1024 bytes are reserved for the backup header. Most of that space is left blank for
future-proofing, in case we want to add more optional fields.
The following 24 bytes are reserved for
the [XChaCha20-Poly1305 encryption header](https://libsodium.gitbook.io/doc/secret-key_cryptography/secretstream#usage).
If the archive is not encrypted, these bytes are filled with 0x00.
The remainder of the file stores the actual backed-up data, be it encrypted or not.
Multi-byte integers are stored in big-endian order.
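
As a rough illustration of the big-endian convention, the sketch below encodes and decodes the two integer widths used in the header (2-byte unsigned short, 4-byte unsigned integer) with plain Kotlin. The helper names are hypothetical and are not part of the library, which relies on okio for this.

```kotlin
// Hypothetical helpers (not part of the library): big-endian means the most
// significant byte is written first.

/** Encodes a 2-byte unsigned value, e.g. version 4 becomes 0x00 0x04. */
fun encodeUShortBigEndian(value: UShort): ByteArray {
    val v = value.toInt()
    return byteArrayOf((v shr 8).toByte(), v.toByte())
}

/** Decodes a 2-byte big-endian unsigned value starting at [offset]. */
fun decodeUShortBigEndian(bytes: ByteArray, offset: Int = 0): UShort {
    val high = bytes[offset].toInt() and 0xFF
    val low = bytes[offset + 1].toInt() and 0xFF
    return ((high shl 8) or low).toUShort()
}

/** Encodes a 4-byte unsigned value, most significant byte first. */
fun encodeUIntBigEndian(value: UInt): ByteArray =
    ByteArray(4) { i -> (value shr (8 * (3 - i))).toByte() }

/** Decodes a 4-byte big-endian unsigned value starting at [offset]. */
fun decodeUIntBigEndian(bytes: ByteArray, offset: Int = 0): UInt {
    var result = 0u
    for (i in 0 until 4) {
        // toUByte() avoids sign extension of bytes >= 0x80
        result = (result shl 8) or bytes[offset + i].toUByte().toUInt()
    }
    return result
}

fun main() {
    println(encodeUShortBigEndian(4.toUShort()).toList()) // prints [0, 4]
    println(decodeUIntBigEndian(encodeUIntBigEndian(UInt.MAX_VALUE)) == UInt.MAX_VALUE) // prints true
}
```

Going through `toUByte()` when decoding matters for values above `0x7F`, since Kotlin's `Byte` is signed.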

| Index | Name | Length | Value | Description |
|-------------------|------------------|------------------|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| (Start of Header) | | | | |
| 0 | fileMagicNumber | 4 | 0x57 0x42 0x55 0x58 | [Magic number](https://en.wikipedia.org/wiki/File_format#Magic_number) to distinguish our file format (ASCII `WBUX`). The trailing `X` stands for "cross": this backup format is supported across different platforms, as opposed to previous backup versions. |
| 4 | | 1 | 0x00 | Empty byte. Non-printable value so that the file is not identified as a text file by most software. |
| 5 | formatVersion | 2 | Unsigned Short | Version of the file format. For example: `0x00 0x04` for version 4. Should be bumped when there are breaking changes in the format. |
| 7 | hashSalt | 16 | Blob of bytes | Salt for Argon2 key derivation. Used for hashing the user ID (author of this file) and for salting the user-created password (if the user chooses to encrypt the archive). |
| 23 | hashedUserId | 32 | Blob of bytes | The hashed ID of the user that authored this file. |
| 55 | hashOpsLimit | 4 | Unsigned Integer | [opsLimit](https://libsodium.gitbook.io/doc/password_hashing/default_phf#key-derivation) for hashing |
| 59 | hashMemLimit | 4 | Unsigned Integer | [memLimit](https://libsodium.gitbook.io/doc/password_hashing/default_phf#key-derivation) for hashing |
| 63 | isEncrypted | 1 | Boolean | Is the file encrypted? `0x00` means false; anything else means true. If not encrypted, the XChaCha20 header can be ignored and the archive can be read straight away, without any decryption or asking the user for a password. |
| 64 | | 960 | Empty (0x00) bytes. Reserved space | For future-proofing. If we choose to add more metadata to the file without breaking backwards compatibility, we can add it here. Otherwise we need to bump the `formatVersion` field. |
| (End of header) | | | | |
| 1024 | encryptionHeader | 24 | Blob of bytes | [XChaCha20-Poly1305 encryption header](https://libsodium.gitbook.io/doc/secret-key_cryptography/secretstream#usage), used by libsodium to decrypt the rest of the file. Should be filled with zero bytes if `isEncrypted` is false. |
| 1048 | backedUpData | Rest of the file | The actual payload | The backed-up data, be it encrypted or not. If encrypted, it must be decrypted before attempting to read it. |
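
The offsets in the table follow directly from the field lengths. The sketch below is a minimal, self-contained check of that arithmetic, plus a cheap magic-number pre-check on raw bytes. `FieldSpec`, `fieldOffsets`, `reservedBytes`, `startsWithBackupMagic`, and the `"gap"` field name are hypothetical names introduced only for illustration; they are not the library's API.

```kotlin
// Hypothetical names; the lengths are transcribed from the table above.
data class FieldSpec(val name: String, val length: Int)

const val HEADER_SIZE = 1024
const val ENCRYPTION_HEADER_SIZE = 24

// ASCII "WBUX"
val MAGIC_NUMBER = byteArrayOf(0x57, 0x42, 0x55, 0x58)

val headerFields = listOf(
    FieldSpec("fileMagicNumber", 4),
    FieldSpec("gap", 1), // the unnamed 0x00 byte after the magic number
    FieldSpec("formatVersion", 2),
    FieldSpec("hashSalt", 16),
    FieldSpec("hashedUserId", 32),
    FieldSpec("hashOpsLimit", 4),
    FieldSpec("hashMemLimit", 4),
    FieldSpec("isEncrypted", 1),
)

/** Maps each field name to its byte offset from the start of the file. */
fun fieldOffsets(fields: List<FieldSpec> = headerFields): Map<String, Int> {
    var offset = 0
    return fields.associate { field ->
        val entry = field.name to offset
        offset += field.length
        entry
    }
}

/** Bytes left over in the header after the currently defined fields. */
fun reservedBytes(): Int = HEADER_SIZE - headerFields.sumOf { it.length }

/** Cheap pre-check: does the file start with the magic number? */
fun startsWithBackupMagic(file: ByteArray): Boolean =
    file.size >= MAGIC_NUMBER.size &&
        MAGIC_NUMBER.indices.all { file[it] == MAGIC_NUMBER[it] }

fun main() {
    println(fieldOffsets()["isEncrypted"])             // prints 63
    println(reservedBytes())                           // prints 960
    println(HEADER_SIZE + ENCRYPTION_HEADER_SIZE)      // prints 1048, where backedUpData starts
}
```

The defined fields add up to 64 bytes, which is why the reserved block starts at index 64 and spans 960 bytes.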

@@ -0,0 +1,100 @@
/*
* Wire
* Copyright (C) 2024 Wire Swiss GmbH
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see http://www.gnu.org/licenses/.
*/
@file:OptIn(ExperimentalUnsignedTypes::class)

package com.wire.backup.envelope.header

import com.ionspin.kotlin.crypto.pwhash.crypto_pwhash_MEMLIMIT_MIN

/**
* The unencrypted data we write at the beginning of backup files.
*/
internal data class BackupHeader(
val version: Int,
val isEncrypted: Boolean,
val hashData: HashData
)

internal data class HashData(
/**
* The hashed ID of the user that originally created this backup.
* This hash is calculated using Argon2, with this [salt], [operationsLimit] and [hashingMemoryLimit].
* This array is [HASHED_USER_ID_SIZE_IN_BYTES] long.
*/
val hashedUserId: UByteArray,

/**
* The salt used to create the [hashedUserId] and to derive the encryption key from the password when reading/writing the encrypted archive.
* This array is [SALT_SIZE_IN_BYTES] long.
*/
val salt: UByteArray,

/**
* Represents the maximum amount of computations to perform.
* Raising this number will make the function require more CPU cycles to compute a key.
* See [Libsodium's Documentation](https://libsodium.gitbook.io/doc/password_hashing/default_phf#key-derivation).
*/
val operationsLimit: UInt,

/**
* Memory used by the hashing algorithm.
* See [Libsodium's Documentation](https://libsodium.gitbook.io/doc/password_hashing/default_phf#key-derivation).
* This value must be at least [crypto_pwhash_MEMLIMIT_MIN].
*/
val hashingMemoryLimit: UInt
) {
init {
require(hashedUserId.size == HASHED_USER_ID_SIZE_IN_BYTES) {
"Hashed user ID has to be $HASHED_USER_ID_SIZE_IN_BYTES bytes long!"
}
require(salt.size == SALT_SIZE_IN_BYTES) { "Salt has to be $SALT_SIZE_IN_BYTES bytes long!" }
require(hashingMemoryLimit >= MINIMUM_MEMORY_LIMIT) {
"Memory Limit must be equal to or bigger than $MINIMUM_MEMORY_LIMIT!"
}
}

companion object {
const val HASHED_USER_ID_SIZE_IN_BYTES = 32
const val SALT_SIZE_IN_BYTES = 16
val MINIMUM_MEMORY_LIMIT = crypto_pwhash_MEMLIMIT_MIN.toUInt()
}

override fun equals(other: Any?): Boolean {
if (this === other) return true
if (other == null || this::class != other::class) return false

other as HashData

if (!hashedUserId.contentEquals(other.hashedUserId)) return false
if (!salt.contentEquals(other.salt)) return false
if (operationsLimit != other.operationsLimit) return false
if (hashingMemoryLimit != other.hashingMemoryLimit) return false

return true
}

override fun hashCode(): Int {
var result = hashedUserId.contentHashCode()
result = 31 * result + salt.contentHashCode()
result = 31 * result + operationsLimit.hashCode()
result = 31 * result + hashingMemoryLimit.hashCode()
return result
}

}
@@ -0,0 +1,115 @@
/*
* Wire
* Copyright (C) 2024 Wire Swiss GmbH
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see http://www.gnu.org/licenses/.
*/
package com.wire.backup.envelope.header

import okio.Buffer

internal interface BackupHeaderField<Format : Any> {
val sizeInBytes: Long
fun read(input: Buffer): Format
fun write(data: Format, output: Buffer)

abstract class ArbitrarySize<Format : Any>(override val sizeInBytes: Long) : BackupHeaderField<Format> {

abstract fun fromBytes(bytes: ByteArray): Format
abstract fun toBytes(data: Format): ByteArray

override fun read(input: Buffer): Format = fromBytes(input.readByteArray(sizeInBytes))

override fun write(data: Format, output: Buffer) {
output.write(toBytes(data))
}
}

class String private constructor(sizeInBytes: Long) : ArbitrarySize<kotlin.String>(sizeInBytes) {
override fun toBytes(data: kotlin.String): ByteArray = data.encodeToByteArray()
override fun fromBytes(bytes: ByteArray): kotlin.String = bytes.decodeToString()

companion object {
private const val FORMAT_SIZE_IN_BYTES = 4L
val format = String(FORMAT_SIZE_IN_BYTES)
}
}

@OptIn(ExperimentalUnsignedTypes::class)
class UByteArray private constructor(sizeInBytes: Long) : ArbitrarySize<kotlin.UByteArray>(sizeInBytes) {
override fun fromBytes(bytes: ByteArray): kotlin.UByteArray = bytes.toUByteArray()
override fun toBytes(data: kotlin.UByteArray): ByteArray = data.toByteArray()

companion object {
val salt = UByteArray(HashData.SALT_SIZE_IN_BYTES.toLong())
val hashedUserId = UByteArray(HashData.HASHED_USER_ID_SIZE_IN_BYTES.toLong())
}
}

class Boolean private constructor() : BackupHeaderField<kotlin.Boolean> {
override val sizeInBytes: Long
get() = 1L

override fun read(input: Buffer): kotlin.Boolean = input.readByte() != 0x00.toByte()

override fun write(data: kotlin.Boolean, output: Buffer) {
output.writeByte(if (data) 0x01 else 0x00)
}

companion object {
val isEncrypted = Boolean()
}
}

class UInt private constructor() : BackupHeaderField<kotlin.UInt> {
override val sizeInBytes: Long
get() = SIZE_IN_BYTES

override fun read(input: Buffer): kotlin.UInt = input.readInt().toUInt()

override fun write(data: kotlin.UInt, output: Buffer) {
output.writeInt(data.toInt())
}

companion object {
val opsLimit = UInt()
val memLimit = UInt()

/**
* Number of bytes used by an unsigned Integer when reading/writing to file
*/
private const val SIZE_IN_BYTES = 4L
}
}

class UShort private constructor() : BackupHeaderField<kotlin.UShort> {
override val sizeInBytes: Long
get() = SIZE_IN_BYTES

override fun read(input: Buffer): kotlin.UShort = input.readShort().toUShort()

override fun write(data: kotlin.UShort, output: Buffer) {
output.writeShort(data.toInt())
}

companion object {
val version = UShort()

/**
* Number of bytes used by an unsigned Short when reading/writing to file
*/
private const val SIZE_IN_BYTES = 2L
}
}
}
@@ -0,0 +1,123 @@
/*
* Wire
* Copyright (C) 2024 Wire Swiss GmbH
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see http://www.gnu.org/licenses/.
*/
package com.wire.backup.envelope.header

import okio.Buffer
import okio.Source

/**
* Reads and writes a [BackupHeader] to data streams.
*
* See [the file specifications in backup/README.md](https://github.com/wireapp/kalium/blob/develop/backup/README.md)
*/
internal interface BackupHeaderSerializer {
/**
* Converts a [BackupHeader] into a byte buffer format, which can be stored in the beginning of a Backup file.
*/
fun headerToBytes(header: BackupHeader): Buffer

/**
* Consumes the header bytes at the start of the [source], parses them, and returns a [HeaderParseResult].
*/
fun parseHeader(source: Source): HeaderParseResult

companion object {
/**
* The total amount of bytes reserved for the header in the beginning of the file.
* Although the current fields occupy only 64 bytes, we choose to reserve the first 1024 bytes for the header.
* This way we can add extra fields in the future without breaking the format and requiring a file format version bump.
*/
const val HEADER_SIZE = 1024L
}

object Default : BackupHeaderSerializer {
const val CURRENT_HEADER_VERSION = 4
private const val FORMAT_IDENTIFIER_MAGIC_NUMBER = "WBUX"
const val MINIMUM_SUPPORTED_VERSION = 4
const val MAXIMUM_SUPPORTED_VERSION = 4
val SUPPORTED_VERSIONS = MINIMUM_SUPPORTED_VERSION..MAXIMUM_SUPPORTED_VERSION

/**
* We leave a non-printable byte right after the magic number, so the file isn't identified as a text file by some software / OS
*/
private const val SIZE_OF_GAP_AFTER_FORMAT_FIELD = 1L

override fun headerToBytes(header: BackupHeader): Buffer {
val headerBytes = Buffer()
BackupHeaderField.String.format.write(FORMAT_IDENTIFIER_MAGIC_NUMBER, headerBytes)
repeat(SIZE_OF_GAP_AFTER_FORMAT_FIELD.toInt()) {
headerBytes.writeByte(0x00)
}
BackupHeaderField.UShort.version.write(header.version.toUShort(), headerBytes)
BackupHeaderField.UByteArray.salt.write(header.hashData.salt, headerBytes)
BackupHeaderField.UByteArray.hashedUserId.write(header.hashData.hashedUserId, headerBytes)
BackupHeaderField.UInt.opsLimit.write(header.hashData.operationsLimit, headerBytes)
BackupHeaderField.UInt.memLimit.write(header.hashData.hashingMemoryLimit, headerBytes)
BackupHeaderField.Boolean.isEncrypted.write(header.isEncrypted, headerBytes)

val remainingReservedSpaceSize = HEADER_SIZE - headerBytes.size
repeat(remainingReservedSpaceSize.toInt()) {
headerBytes.writeByte(0x00)
}

return headerBytes
}

override fun parseHeader(source: Source): HeaderParseResult {
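val headerBytes = Buffer()
// Source.read may return fewer bytes than requested, so keep reading until
// the whole header is buffered or the source is exhausted (-1).
while (headerBytes.size < HEADER_SIZE) {
if (source.read(headerBytes, HEADER_SIZE - headerBytes.size) == -1L) break
}
return if (headerBytes.size != HEADER_SIZE) {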
HeaderParseResult.Failure.UnknownFormat
} else {
val format = BackupHeaderField.String.format.read(headerBytes)
if (format != FORMAT_IDENTIFIER_MAGIC_NUMBER) return HeaderParseResult.Failure.UnknownFormat
headerBytes.skip(SIZE_OF_GAP_AFTER_FORMAT_FIELD)
val version = BackupHeaderField.UShort.version.read(headerBytes).toInt()
if (version !in SUPPORTED_VERSIONS) {
HeaderParseResult.Failure.UnsupportedVersion(version)
} else {
val salt = BackupHeaderField.UByteArray.salt.read(headerBytes)
val hashedUserId = BackupHeaderField.UByteArray.hashedUserId.read(headerBytes)
val opsLimit = BackupHeaderField.UInt.opsLimit.read(headerBytes)
val memLimit = BackupHeaderField.UInt.memLimit.read(headerBytes)
val isEncrypted = BackupHeaderField.Boolean.isEncrypted.read(headerBytes)

val hashData = HashData(hashedUserId, salt, opsLimit, memLimit)
val header = BackupHeader(version, isEncrypted, hashData)
HeaderParseResult.Success(header)
}
}
}
}

}

internal sealed interface HeaderParseResult {
data class Success(val header: BackupHeader) : HeaderParseResult
sealed interface Failure : HeaderParseResult {
/**
* The file does not follow the expected format: it either does not start with the correct magic number or is
* smaller than the minimum expected size.
*/
data object UnknownFormat : Failure

/**
* The [version] found in the backup is not supported. Either too old, or too new.
*/
data class UnsupportedVersion(val version: Int) : Failure
}
}