Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entropy is NaN for MatrixDirichlet with 0 entries #137

Open
MagnusKoudahl opened this issue Apr 11, 2023 · 12 comments
Open

Entropy is NaN for MatrixDirichlet with 0 entries #137

MagnusKoudahl opened this issue Apr 11, 2023 · 12 comments
Assignees

Comments

@MagnusKoudahl
Copy link

When calculating the entropy of a MatrixDirichlet with 0 entries, the result is NaN.

This follows from the entropy function making an elementwise call to SpecialFunctions.loggamma on dist.a which evaluates to Inf on each 0 entry.

This means that when learning parameters of an HMM for example, a marginal that evaluates to I will break FE calculation.

@bvdmitri
Copy link
Member

@ismailsenoz @ThijsvdLaar WDYT?

@ThijsvdLaar
Copy link

Good catch. In the entropy for the MatrixDirichlet, for a column with a zero entry (e.g. [2.0; 0.0]), the NaN value stems from a -Inf+Inf evaluation (first summation term plus last summation term). Introduced NaN values are notoriously hard to debug, so it would be good to prevent returning NaN.

Numerically, for a parameter matrix with an entry approaching zero, the entropy appears to approach -Inf. This would be good to verify mathematically. If this holds then we could return -Inf for the entropy; similar to how other Julia functions handle singularities.

I'd say it's up to the user to choose vague priors with appropriate epsilon approximations to prevent singularities in their algorithm.

@ismailsenoz
Copy link
Collaborator

Passing $[2,0]$ as a parameter vector to the Dirichlet distribution or any parameters vector containing $0$ is not allowed. Dirichlet distribution is defined for an $\alpha$ where each $\alpha_i > 0$. What @MagnusKoudahl is trying to do is a violation of the theory. Please see the definition of a Dirichlet distribution. I do not see the point of this as an issue since there is nothing wrong with the computation. As a way around SpecialFunctions, implement the xlogx function that defines $0\log(0) \triangleq 0$, and for discrete random variables, this will avoid inf and nans in the entropy computations. However, the same trick won't work in the case of continuum. But there is no need for such a convention here because the case of $\alpha$ with zero entries violates domain specification and is not an issue of ReactiveMP.

@ThijsvdLaar
Copy link

Good point, in that case I think it might be good to throw a DomainError to prevent confusion.

@ismailsenoz
Copy link
Collaborator

Yes, we can throw an assertion statement that checks the parameter vector does not contain 0. Or we can add a jitter assuming that the user is trying to pass a small value and is not aware of the domain specifications for distributions. I am not a big fan of jitters, but I also understand that getting a NaN error is annoying, and avoiding that might also be beneficial. @MagnusKoudahl Which one would have been more useful in your experience? To get an assertion error or injection of tinies automatically? @ThijsvdLaar @bvdmitri I am fine either way or open to other alternatives.

@MagnusKoudahl
Copy link
Author

I would be in favour of throwing an error then. Adding tinies can also lead to unexpected behaviours and hiding it from the user makes it really hard to debug. With an error the user is still in control over how they want to set up their model and can decide if adding tinies is the right call

@ismailsenoz
Copy link
Collaborator

But Dirichlet distribution is from Distributions.jl, and I think it is impossible to modify their structure. Maybe they already implement this kind of check.

@bvdmitri
Copy link
Member

Well as far as I remember the inference function actually should check for NaN/Infs already. It basically says something like Failed to compute node bound free energy component. The result is NaN` and same for entropies here: https://github.com/biaslab/RxInfer.jl/blob/main/src/score/bfe.jl#L57. Do you see this error or it is something different for you @MagnusKoudahl ?

@MagnusKoudahl
Copy link
Author

@bvdmitri That's the same error I see

@bvdmitri
Copy link
Member

MatrixDirichlet is our structure and we can add check for zero entries in the constructor. Technically diagonal prior should not be allowed and we should warn users about that.

In the meantime another workaround could be the functional form constraints, which would allow you to add jitter @MagnusKoudahl . See the corresponding section in the documentation: https://biaslab.github.io/ReactiveMP.jl/stable/custom/custom-functional-form/#custom-functional-form-example .

@albertpod
Copy link
Member

@MagnusKoudahl do you have time to handle this?

@MagnusKoudahl MagnusKoudahl self-assigned this Jun 19, 2023
@MagnusKoudahl
Copy link
Author

Related to this issue, we also allow for other nonvalid parameterisations, ex

using ReactiveMP
my_var = NormalMeanVariance(0,-1)

does not throw an error. Is ensuring correct parameterisation something worth spending time on more generally?

@bvdmitri bvdmitri transferred this issue from ReactiveBayes/ReactiveMP.jl Oct 5, 2023
@albertpod albertpod moved this to 🤔 Ideas in RxInfer Jan 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🤔 Ideas
Development

No branches or pull requests

5 participants