A quick write-up of some key ideas developed in a discussion we had here at NetSI today.
For classification problems, we need to calculate the likelihood of paths that include nodes or transitions not observed in the training data. This is the case in the Wikispeedia data set, where some of the possible transitions appear in some subsets of the data but not in others. Without a correction, any such path gets a likelihood of zero.
We can fix this using Laplace smoothing, i.e. we add a small epsilon probability to events that are possible, but that have not been observed.
For this purpose, we need to be able to tell the MultiOrderModel class what events are possible, independent of the transitions that have been observed in the Paths object used to fit the model. This requires an additional parameter in which we can set the "topological constraint" A (e.g. in terms of a binary adjacency matrix) for the underlying model. In general, this constraint can be given at any order, but for now we restrict ourselves to a first-order topology.
In the calculation of transition probabilities, this constraint is taken into account in different ways:
- In the zero-order model, we add a transition start -> x with probability epsilon whenever the constraint topology contains a node x that has not been observed in the paths used to fit the model.
- In the first-order model, we add a transition x -> y with probability epsilon whenever the topology contains a link x -> y that has not been observed in any path.
- In the higher-order models, we add transitions with probability epsilon whenever such a transition is theoretically possible in the given topology.
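
To make the first-order case concrete, here is a minimal sketch in plain NumPy. It is not the MultiOrderModel implementation: the function name `smoothed_first_order`, its parameters, and the toy data are all hypothetical. It also assumes one detail the discussion left open, namely that epsilon is added as a weight to allowed-but-unobserved links and each row is then renormalised so that it sums to one.

```python
import numpy as np

def smoothed_first_order(paths, nodes, adjacency, epsilon=1e-3):
    """Hypothetical helper: first-order transition matrix with epsilon smoothing.

    paths     -- list of node sequences used to fit the model
    nodes     -- list of all nodes in the constraint topology
    adjacency -- binary matrix A, A[i][j] == 1 if the link i -> j is possible
    epsilon   -- weight given to possible but unobserved links (assumption:
                 rows are renormalised afterwards)
    """
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)

    # Count observed transitions x -> y over all paths.
    counts = np.zeros((n, n))
    for path in paths:
        for x, y in zip(path, path[1:]):
            counts[index[x], index[y]] += 1.0

    # Give weight epsilon to links that the topology allows but that were never observed.
    allowed = np.asarray(adjacency, dtype=bool)
    weights = counts.copy()
    weights[allowed & (counts == 0)] = epsilon

    # Row-normalise; rows without any allowed or observed link stay all zero.
    row_sums = weights.sum(axis=1, keepdims=True)
    return np.divide(weights, row_sums, out=np.zeros_like(weights), where=row_sums > 0)

# Toy example (made-up data): the link a -> c is possible but never observed.
nodes = ["a", "b", "c"]
adjacency = [[0, 1, 1],
             [0, 0, 1],
             [1, 0, 0]]
paths = [["a", "b", "c"], ["a", "b"]]
T = smoothed_first_order(paths, nodes, adjacency, epsilon=0.01)

# The log-likelihood of the unseen transition a -> c is now finite.
log_p = np.log(T[0, 2])
```

Transitions that the topology forbids keep probability zero, so only paths that are impossible in A still get zero likelihood. One side effect of this particular normalisation is that a node with no observed outgoing transitions, like c in the toy data, spreads all of its probability evenly over its allowed links.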
Here's a scribble of the basic solution in the Wikispeedia case: