diff --git a/doc/associations.md b/doc/associations.md
new file mode 100644
index 000000000..68717a912
--- /dev/null
+++ b/doc/associations.md
@@ -0,0 +1,233 @@
+# Associating unrelated objects with each other
+Sometimes it is necessary to build relations between objects whose datatypes are
+not related via a `OneToOneRelation` or a `OneToManyRelation`. These *external
+relations* are called *Associations* in podio, and they are implemented as a
+templated version of the code that would be generated by the following yaml
+snippet (in this case between generic `FromT` and `ToT` datatypes):
+
+```yaml
+Association:
+  Description: "A weighted association between a FromT and a ToT"
+  Author: "P. O. Dio"
+  Members:
+    - float weight // the weight of the association
+  OneToOneRelations:
+    - FromT from // reference to the FromT
+    - ToT to // reference to the ToT
+```
+
+## `Association` basics
+`Association`s are implemented as templated classes that follow a structure
+similar to other podio generated classes, with several layers of which users
+only ever interact with the *User layer*. This layer has the following basic classes:
+```cpp
+/// The collection class that forms the basis of I/O and also is the main entry point
+template <typename FromT, typename ToT>
+class AssociationCollection;
+
+/// The default (immutable) class that one gets after reading a collection
+template <typename FromT, typename ToT>
+class Association;
+
+/// The mutable class for creating associations before writing them
+template <typename FromT, typename ToT>
+class MutableAssociation;
+```
+
+Although the names of the template parameters, `FromT` and `ToT`, imply a
+direction of the association, from a technical point of view nothing actually
+enforces this direction, unless `FromT` and `ToT` are both of the same type.
+Hence, associations can effectively be treated as bi-directional, and one
+combination of `FromT` and `ToT` should be enough for all use cases (see also
+the [usage section](#how-to-use-associations)).
+
+For a more detailed explanation of the internals and the actual implementation
+see [the implementation details](#implementation-details).
+
+## How to use `Association`s
+Using `Association`s is quite simple. In line with other datatypes that are
+generated by podio, all the functionality becomes available by including the
+corresponding `Collection` header. After that it is generally recommended to
+introduce a type alias for easier usage. **As a general rule `Association`s need
+to be declared with the default (immutable) types.** Trying to instantiate them
+with `Mutable` types will result in a compilation error.
+
+```cpp
+#include "podio/AssociationCollection.h"
+
+#include "edm4hep/MCParticleCollection.h"
+#include "edm4hep/ReconstructedParticleCollection.h"
+
+// declare a new association type
+using MCRecoParticleAssociationCollection = podio::AssociationCollection<edm4hep::MCParticle, edm4hep::ReconstructedParticle>;
+```
+
+This can now be used exactly as any other podio generated collection, i.e.
+```cpp
+edm4hep::MCParticle mcParticle{};
+edm4hep::ReconstructedParticle recoParticle{};
+
+auto mcRecoAssocs = MCRecoParticleAssociationCollection{};
+auto assoc = mcRecoAssocs.create(); // create an association
+assoc.setFrom(mcParticle);
+assoc.setTo(recoParticle);
+assoc.setWeight(1.0); // This is also the default value!
+```
+
+and similarly for getting the associated objects
+```cpp
+auto mcP = assoc.getFrom();
+auto recoP = assoc.getTo();
+auto weight = assoc.getWeight();
+```
+
+In the above examples the `From` and `To` in the method names imply a direction,
+but it is also possible to use a templated `get` and `set` method to retrieve
+the associated objects via their type:
+
+```cpp
+assoc.set(mcParticle);
+assoc.set(recoParticle);
+
+auto mcP = assoc.get<edm4hep::MCParticle>();
+auto recoP = assoc.get<edm4hep::ReconstructedParticle>();
+auto weight = assoc.getWeight();
+```
+
+It is also possible to access the elements of an association via an index based
+`get` (similar to `std::tuple`).
+In this case `0` corresponds to `getFrom`, `1`
+corresponds to `getTo` and `2` corresponds to the weight. The main purpose of
+this feature is to enable structured bindings:
+
+```cpp
+const auto& [mcP, recoP, weight] = assoc;
+```
+
+The above three examples are three equivalent ways of retrieving the same things
+from an `Association`. **The templated `get` and `set` methods are only available
+if `FromT` and `ToT` are not the same type** and will lead to a compilation
+error otherwise.
+
+### Enabling I/O capabilities for `Association`s
+
+`Association`s do not have I/O support out of the box. This has to be enabled via
+the `PODIO_DECLARE_ASSOCIATION` macro (defined in the `AssociationCollection.h`
+header). If you simply want to be able to read / write `Association`s in a
+standalone executable, it is enough to use this macro somewhere in the
+executable. For example, enabling I/O capabilities for the
+`MCRecoParticleAssociation`s used above looks like this:
+
+```cpp
+PODIO_DECLARE_ASSOCIATION(edm4hep::MCParticle, edm4hep::ReconstructedParticle)
+```
+
+The macro will also enable SIO support if `PODIO_ENABLE_SIO=1` is passed to
+the compiler. This is done by default when linking against the
+`podio::podioSioIO` library in CMake.
+
+To enable I/O support for shared datamodel libraries, all required
+combinations of types have to be declared via `PODIO_DECLARE_ASSOCIATION`
+and compiled into the library. This is necessary if you want to use
+the python bindings, since they rely on dynamically loading the datamodel
+libraries.
+
+## Implementation details
+
+In order to give a slightly easier entry to the details of the implementation
+and also to make it easier to find things in the generated documentation,
+we give a brief description of the main ideas and design choices here.
+With those it should be possible to dive deeper if necessary or to understand the
+template structure that is visible in the documentation, but should be fairly
+invisible in usage. We will focus mainly on the user facing classes, as those
+deal with the most complexity; the underlying layers are more or less what could
+be obtained by generating them via the yaml snippet above and sprinkling some
+`<FromT, ToT>` templates where necessary.
+
+### File structure
+
+The user facing `"podio/AssociationCollection.h"` header essentially just
+defines the `PODIO_DECLARE_ASSOCIATION` macro (depending on whether SIO support
+is desired and possible or not). All the actual implementation is done in the
+following files:
+
+- [`"podio/detail/AssociationCollectionImpl.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/AssociationCollectionImpl.h):
+  for the collection functionality
+- [`"podio/detail/Association.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/Association.h):
+  for the functionality of a single association
+- [`"podio/detail/AssociationCollectionIterator.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/AssociationCollectionIterator.h):
+  for the collection iterator functionality
+- [`"podio/detail/AssociationObj.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/AssociationObj.h):
+  for the object layer functionality
+- [`"podio/detail/AssociationCollectionData.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/AssociationCollectionData.h):
+  for the collection data functionality
+- [`"podio/detail/AssociationFwd.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/AssociationFwd.h):
+  for some type helper functionality and some forward declarations that are used
+  throughout the other headers
+- [`"podio/detail/AssociationSIOBlock.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/AssociationSIOBlock.h):
+  for defining the
+  SIOBlocks that are necessary to use SIO
+
+As is visible from this structure, we did not introduce an `AssociationData`
+class, since that would effectively just be a `float` wrapped inside a `struct`.
+
+### Default and `Mutable` `Association` classes
+
+A quick look into the `AssociationFwd.h` header will reveal that the default and
+`Mutable` `Association` classes are in fact just alias templates of the
+`AssociationT` class that takes a `bool Mutable` as third template argument. The
+same approach is also followed by the `AssociationCollectionIterator`s:
+
+```cpp
+template <typename FromT, typename ToT, bool Mutable>
+class AssociationT;
+
+template <typename FromT, typename ToT>
+using Association = AssociationT<FromT, ToT, false>;
+
+template <typename FromT, typename ToT>
+using MutableAssociation = AssociationT<FromT, ToT, true>;
+```
+
+All podio generated datatypes have the following three public typedefs
+- `DefT` yields the default type
+- `MutT` yields the mutable type
+- `CollT` yields the collection type
+
+There are corresponding template helpers to retrieve these types in
+`AssociationFwd.h`. Note that these are not
+[*SFINAE*](https://en.cppreference.com/w/cpp/language/sfinae) friendly. However,
+since they are only used internally in non-SFINAE contexts, this
+doesn't really matter.
+
+Throughout the implementation it is assumed that `FromT` and `ToT` always are the
+default types. This is ensured through `static_assert`s, or through usage of
+the aforementioned template helpers at the possible entry points. With this in
+mind, effectively all mutating operations on `Association`s are constrained via
+SFINAE with the following template structure (taking here `setFrom` as an
+example)
+
+```cpp
+template <typename FromU,
+          typename = std::enable_if_t<Mutable && std::is_same_v<detail::GetDefT<FromU>, FromT>>>
+void setFrom(FromU value);
+```
+
+This is a SFINAE friendly way to ensure that this definition is only viable if
+the following conditions are met
+- The object this method is called on has to be `Mutable`. (first part inside the `std::enable_if`)
+- The passed in `value` is either a `Mutable` or default class of type `FromT`.
+  (second part inside the `std::enable_if`)
+
+In some cases the template signature looks like this
+
+```cpp
+template <bool Mut = Mutable, typename = std::enable_if_t<Mut && Mutable>>
+void setWeight(float weight);
+```
+
+The reason to have a defaulted `bool` template parameter here is the same as the
+one for having a `typename FromU` template parameter above: SFINAE only applies
+when the condition depends on a template parameter of the method itself, not
+only on those of the enclosing class. Using `Mut && Mutable` in the
+`std::enable_if` makes sure that users cannot bypass the immutability by
+specifying a template parameter themselves.
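
As a rough illustration of the two SFINAE patterns described above, the following standalone sketch applies them to a toy class. It is not podio code: `LinkT`, `Link`, `MutableLink`, `Particle`, `MutableParticle`, `Hit`, the local `GetDefT` helper, the `Can...` detection traits and `weightAfterSet` are all hypothetical names chosen for this example, and a C++17 compiler is assumed.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for generated datatypes (NOT the real podio classes)
struct Particle { using DefT = Particle; int id{0}; };
struct MutableParticle { using DefT = Particle; int id{0}; };
struct Hit { using DefT = Hit; int id{0}; };

// Toy helper that maps a type to its default type via the DefT typedef
template <typename T>
using GetDefT = typename T::DefT;

template <typename FromT, bool Mutable>
class LinkT {
public:
  // Viable only on mutable links and only for arguments whose default type is FromT
  template <typename FromU,
            typename = std::enable_if_t<Mutable && std::is_same_v<GetDefT<FromU>, FromT>>>
  constexpr void setFrom(FromU value) { m_fromId = value.id; }

  // The defaulted bool makes the condition depend on a parameter of the method;
  // `Mut && Mutable` still fails if a caller passes Mut = true explicitly
  template <bool Mut = Mutable, typename = std::enable_if_t<Mut && Mutable>>
  constexpr void setWeight(float weight) { m_weight = weight; }

  constexpr float getWeight() const { return m_weight; }
  constexpr int getFromId() const { return m_fromId; }

private:
  float m_weight{1.0f};
  int m_fromId{-1};
};

template <typename FromT>
using Link = LinkT<FromT, false>;
template <typename FromT>
using MutableLink = LinkT<FromT, true>;

// Detection traits: true if the corresponding call expression is well-formed
template <typename T, typename = void>
struct CanSetWeight : std::false_type {};
template <typename T>
struct CanSetWeight<T, std::void_t<decltype(std::declval<T&>().setWeight(1.0f))>> : std::true_type {};

template <typename T, typename = void>
struct CanForceSetWeight : std::false_type {};
template <typename T>
struct CanForceSetWeight<T, std::void_t<decltype(std::declval<T&>().template setWeight<true>(1.0f))>>
    : std::true_type {};

template <typename T, typename U, typename = void>
struct CanSetFrom : std::false_type {};
template <typename T, typename U>
struct CanSetFrom<T, U, std::void_t<decltype(std::declval<T&>().setFrom(std::declval<U>()))>>
    : std::true_type {};

// Mutating a mutable link works as usual (evaluated at compile time here)
constexpr float weightAfterSet() {
  MutableLink<Particle> link{};
  link.setWeight(0.5f);
  return link.getWeight();
}

static_assert(weightAfterSet() == 0.5f);
static_assert(CanSetWeight<MutableLink<Particle>>::value);
static_assert(!CanSetWeight<Link<Particle>>::value);
// Explicitly passing Mut = true does not bypass the immutability
static_assert(!CanForceSetWeight<Link<Particle>>::value);
static_assert(CanSetFrom<MutableLink<Particle>, MutableParticle>::value);
static_assert(!CanSetFrom<MutableLink<Particle>, Hit>::value);
```

In this sketch the setters are removed from overload resolution instead of triggering a hard error, which is what allows the detection traits to report `false` for the immutable `Link`. The same constraints could presumably be expressed with a `static_assert` inside the method bodies, at the cost of losing this detectability.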