README: serdes

Brief

Serialisation and deserialisation facility for structured data

Detailed description

Why serialize data?

  • Interoperability
    • Across programming languages
    • Across CPU architectures (e.g. endianness, memory alignment)
    • Across data representation formats (e.g. JSON, XML, CBOR)
  • Transport over IPC mechanisms
  • Persistent storage and retrieval

Requirements

In decreasing order of priority:

  • Interoperable across C++ and Python [Note 0]
  • Fast encoding and decoding
  • Usable without a schema [Note 1]
  • Usable with no dynamic memory allocation, at least for encoding
  • Simple API, even for complex data structures composed from other data structures. Ideally, a free function to encode and to decode as follows:
    template<typename SerialiserType, typename DataType>
    void serialize(SerialiserType&, const DataType&);

    template<typename SerialiserType, typename DataType>
    void deserialize(SerialiserType&, DataType&);
  • Good documentation (spec., usage)
  • Error handling preferably as return codes instead of exceptions
  • Suitable for resource-constrained systems (e.g. bare-metal systems)
  • Minimal additional software dependencies
  • (Optional) Self-describing [Note 2] (i.e. meta information is contained within data)

[Note 0] The use case is limited to Python-based tools that analyse data generated by C++ applications.

[Note 1] Schema-based methods require pre-processing IDLs to autogenerate code. This is an abomination. I want to write native code that meets project-specific language and coding guidelines. However, schemaless encoding does require encoding and decoding the elements in the data structures manually. This is a minor inconvenience. The incidental complexity of integrating schemas and a mechanism to autogenerate and integrate non-conforming code is a compromise I am not willing to make.

[Note 2] Self-description enables type-safe deserialisation, introspection (as text or JSON, for instance), version control and backwards compatibility. However, this comes at the cost of reduced performance and larger encoded data.
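
The free-function API listed in the requirements could look roughly like the sketch below. It is only an illustration under stated assumptions, not this library's actual code: `BufferWriter`, `Point`, `Segment` and the two demo functions are hypothetical names, values are copied in host byte order for brevity, and errors are recorded as a status flag on the writer because the suggested `serialize` signature returns `void`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-capacity writer: caller-owned storage, no dynamic
// allocation, and a sticky error flag instead of exceptions.
class BufferWriter {
public:
    BufferWriter(std::uint8_t* buf, std::size_t cap) : buf_(buf), cap_(cap) {}

    // Append n raw bytes; on overflow, set the error flag and write nothing.
    void write(const void* src, std::size_t n) {
        if (failed_ || n > cap_ - size_) { failed_ = true; return; }
        std::memcpy(buf_ + size_, src, n);
        size_ += n;
    }
    bool ok() const { return !failed_; }
    std::size_t size() const { return size_; }

private:
    std::uint8_t* buf_;
    std::size_t cap_;
    std::size_t size_ = 0;
    bool failed_ = false;
};

// Arithmetic types: copied as-is (assumption: host byte order is acceptable here).
template <typename SerialiserType, typename DataType,
          std::enable_if_t<std::is_arithmetic_v<DataType>, int> = 0>
void serialize(SerialiserType& s, const DataType& value) {
    s.write(&value, sizeof(value));
}

// Hypothetical user types, encoded manually member by member (schemaless).
struct Point { std::int32_t x; std::int32_t y; };
struct Segment { Point a; Point b; std::uint8_t tag; };

template <typename SerialiserType>
void serialize(SerialiserType& s, const Point& p) {
    serialize(s, p.x);
    serialize(s, p.y);
}

// A composite type reuses the overloads of its members.
template <typename SerialiserType>
void serialize(SerialiserType& s, const Segment& seg) {
    serialize(s, seg.a);
    serialize(s, seg.b);
    serialize(s, seg.tag);
}

// Encodes one Segment into a stack buffer; returns bytes written, or 0 on error.
std::size_t encode_segment_demo() {
    std::uint8_t storage[32];
    BufferWriter w(storage, sizeof(storage));
    const Segment seg{{1, 2}, {3, 4}, 7};
    serialize(w, seg);
    return w.ok() ? w.size() : 0;
}

// Shows that a too-small buffer is reported through the status flag.
bool overflow_is_reported() {
    std::uint8_t storage[8];
    BufferWriter w(storage, sizeof(storage));
    const Segment seg{{1, 2}, {3, 4}, 7};
    serialize(w, seg);
    return !w.ok();
}
```

Because members are written one at a time, struct padding never reaches the output, and nesting data structures only requires one more overload per type.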

Development

A few third-party libraries were considered initially. Benchmarking showed that none of them outperformed a simple data packing/unpacking scheme, so that is what is implemented. Additionally, the first-principles approach meant that every requirement stated above could be met.
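
As an illustration of what a simple packing/unpacking scheme can mean, here is a minimal sketch. It assumes a fixed little-endian wire order and boolean return codes; `pack_u32`, `unpack_u32`, `Sample` and `round_trip_demo` are hypothetical names and are not taken from this library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack a 32-bit value byte by byte in little-endian order, so the encoded
// bytes do not depend on the CPU's endianness. Returns false if it does not fit.
bool pack_u32(std::uint8_t* buf, std::size_t cap, std::size_t& pos, std::uint32_t v) {
    if (pos > cap || cap - pos < 4) return false;
    for (int i = 0; i < 4; ++i) {
        buf[pos++] = static_cast<std::uint8_t>(v >> (8 * i));
    }
    return true;
}

// Inverse of pack_u32. Returns false if fewer than 4 bytes remain.
bool unpack_u32(const std::uint8_t* buf, std::size_t len, std::size_t& pos, std::uint32_t& v) {
    if (pos > len || len - pos < 4) return false;
    v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(buf[pos++]) << (8 * i);
    }
    return true;
}

// Hypothetical record made of two packed fields.
struct Sample { std::uint32_t id; std::uint32_t value; };

bool pack(std::uint8_t* buf, std::size_t cap, std::size_t& pos, const Sample& s) {
    return pack_u32(buf, cap, pos, s.id) && pack_u32(buf, cap, pos, s.value);
}

bool unpack(const std::uint8_t* buf, std::size_t len, std::size_t& pos, Sample& s) {
    return unpack_u32(buf, len, pos, s.id) && unpack_u32(buf, len, pos, s.value);
}

// Encodes a Sample into a stack buffer and decodes it again.
bool round_trip_demo() {
    std::uint8_t buf[8];
    std::size_t wpos = 0;
    const Sample in{0x01020304u, 42u};
    if (!pack(buf, sizeof(buf), wpos, in)) return false;

    Sample out{0, 0};
    std::size_t rpos = 0;
    if (!unpack(buf, wpos, rpos, out)) return false;

    // The least significant byte is stored first on every platform.
    return buf[0] == 0x04 && out.id == in.id && out.value == in.value && rpos == 8;
}
```

With a fixed layout like this, a Python analysis tool could read the same eight bytes with the standard `struct` module, for example `struct.unpack("<II", data)`.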