Serialisation and deserialisation facility for structured data
- Interoperability
  - Across programming languages
  - Across CPU architectures (e.g. endianness, memory alignment)
  - Across data representation formats (e.g. JSON, XML, CBOR)
- Transport over IPC mechanisms
- Persistent storage and retrieval
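Interoperability across CPU architectures usually comes down to fixing the byte order of the encoded data and never reading multi-byte values through misaligned pointers. A minimal sketch, assuming a little-endian wire order (the wire order and the helper names `put_u32_le` and `get_u32_le` are illustrative assumptions, not part of this facility):

```cpp
#include <cstdint>

// Hypothetical helper: store a 32-bit value as four bytes, least significant
// byte first. Only shifts and byte stores are used, so the output is identical
// on little- and big-endian hosts and the destination needs no alignment.
inline void put_u32_le(std::uint8_t* out, std::uint32_t value) {
    out[0] = static_cast<std::uint8_t>(value);
    out[1] = static_cast<std::uint8_t>(value >> 8);
    out[2] = static_cast<std::uint8_t>(value >> 16);
    out[3] = static_cast<std::uint8_t>(value >> 24);
}

// Hypothetical helper: rebuild the value from four little-endian bytes.
inline std::uint32_t get_u32_le(const std::uint8_t* in) {
    return static_cast<std::uint32_t>(in[0]) |
           (static_cast<std::uint32_t>(in[1]) << 8) |
           (static_cast<std::uint32_t>(in[2]) << 16) |
           (static_cast<std::uint32_t>(in[3]) << 24);
}
```

A reader in another language only needs to know the agreed byte order to decode the same bytes.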
Requirements, in decreasing order of priority:
- Interoperable across C++ and Python [Note 0]
- Fast encoding and decoding
- Usable without a schema [Note 1]
- Usable with no dynamic memory allocation, at least when encoding
- Simple API, even for complex data structures composed from other data structures. Ideally, one free function to encode and one to decode, as follows:
    template<typename SerialiserType, typename DataType> void serialize(SerialiserType&, const DataType&);
    template<typename SerialiserType, typename DataType> void deserialize(SerialiserType&, DataType&);
- Good documentation (specification and usage)
- Error handling preferably as return codes instead of exceptions
- Suitable for resource-constrained systems (e.g. bare-metal systems)
- Minimal additional software dependencies
- (Optional) Self-describing [Note 2] (i.e. metadata is contained within the data)
[Note 0] Use-case limited to using Python-based tools to analyse data generated by C++ applications
[Note 1] Schema-based methods require pre-processing IDL files to autogenerate code. This is an abomination. I want to write native code that meets project-specific language and coding guidelines. Schemaless encoding does require encoding and decoding each element of a data structure by hand, but this is a minor inconvenience. The incidental complexity of integrating schemas, plus a mechanism to autogenerate and integrate non-conforming code, is a compromise I am not willing to make.
[Note 2] Enables type-safe deserialisation, introspection (as text or JSON, for instance), version control and backwards compatibility. However, this comes at the cost of reduced performance and larger encoded data.
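The free-function API and the manual, element-by-element encoding described in Note 1 can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `BufferWriter`, `BufferReader`, `Vec3`, `Pose` and `round_trip_example` are hypothetical names, errors are latched in a flag queried with `ok()` rather than thrown, and arithmetic values are copied in host byte order for brevity (a portable encoder would fix the byte order).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical serialiser over a caller-supplied buffer: no heap allocation.
// An overflow latches an error flag instead of throwing.
class BufferWriter {
public:
    BufferWriter(std::uint8_t* data, std::size_t capacity)
        : data_(data), capacity_(capacity) {}
    void write(const void* src, std::size_t n) {
        if (!ok_ || n > capacity_ - size_) { ok_ = false; return; }
        std::memcpy(data_ + size_, src, n);
        size_ += n;
    }
    bool ok() const { return ok_; }
    std::size_t size() const { return size_; }
private:
    std::uint8_t* data_;
    std::size_t capacity_;
    std::size_t size_ = 0;
    bool ok_ = true;
};

// Hypothetical deserialiser: reading past the end latches the error flag.
class BufferReader {
public:
    BufferReader(const std::uint8_t* data, std::size_t size)
        : data_(data), size_(size) {}
    void read(void* dst, std::size_t n) {
        if (!ok_ || n > size_ - pos_) { ok_ = false; return; }
        std::memcpy(dst, data_ + pos_, n);
        pos_ += n;
    }
    bool ok() const { return ok_; }
private:
    const std::uint8_t* data_;
    std::size_t size_;
    std::size_t pos_ = 0;
    bool ok_ = true;
};

// Arithmetic types: raw copy in host byte order (a simplification).
template <typename S, typename T>
typename std::enable_if<std::is_arithmetic<T>::value>::type
serialize(S& s, const T& value) { s.write(&value, sizeof value); }

template <typename S, typename T>
typename std::enable_if<std::is_arithmetic<T>::value>::type
deserialize(S& s, T& value) { s.read(&value, sizeof value); }

// Composite types list their members by hand, in a fixed order.
struct Vec3 { float x, y, z; };
struct Pose { std::uint32_t id; Vec3 position; };

template <typename S> void serialize(S& s, const Vec3& v) {
    serialize(s, v.x); serialize(s, v.y); serialize(s, v.z);
}
template <typename S> void deserialize(S& s, Vec3& v) {
    deserialize(s, v.x); deserialize(s, v.y); deserialize(s, v.z);
}
template <typename S> void serialize(S& s, const Pose& p) {
    serialize(s, p.id); serialize(s, p.position);
}
template <typename S> void deserialize(S& s, Pose& p) {
    deserialize(s, p.id); deserialize(s, p.position);
}

// Encode a Pose into a stack buffer, decode it again, and compare the fields.
inline bool round_trip_example() {
    std::uint8_t buffer[32];
    const Pose in{7u, {1.0f, 2.0f, 3.0f}};
    BufferWriter w(buffer, sizeof buffer);
    serialize(w, in);
    assert(w.ok());
    Pose out{};
    BufferReader r(buffer, w.size());
    deserialize(r, out);
    return r.ok() && out.id == in.id && out.position.x == in.position.x &&
           out.position.y == in.position.y && out.position.z == in.position.z;
}
```

Nested structures compose by calling the same free functions on their members, so a new type needs only its own `serialize`/`deserialize` pair. Nothing about the layout is recorded in the data, which is the trade-off against the self-describing option in Note 2.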
A few third-party libraries were considered initially. Benchmarking showed that none of them outperformed a simple data packing/unpacking scheme, so that is what is implemented. Additionally, the first-principles approach meant that every requirement stated above could be met.