
add exponential simplex transform #2

Open
wants to merge 2 commits into base: main

Conversation

@spinkney (Collaborator)

After some algebra and cancellation, this transform is ready. By preliminary wall-clock time and ESS, it performs really well.

@spinkney (Collaborator, Author)

@bob-carpenter @sethaxen this is updated from the emails I sent out. It seems to work quite well.

@sethaxen (Collaborator) left a comment

Thanks for adding! Is the derivation written down somewhere? I may have missed it in one of the e-mails, but can you say something about the rationale behind this transform? So far as I can tell, it's very similar to Expanded-Softmax. In particular, if $f$ is the Expanded-Softmax constraining transform, then as $y_i \to \infty$, this becomes $f(y)$, and as $y_i \to -\infty$, it becomes $f(2y)$. Due to metric adaptation, we wouldn't expect $f(2y)$ to perform any differently than $f(y)$, so really it's just in $\sim(-3, 3)$ when it smoothly transitions between the two that we expect a difference.

On that note, is there a more descriptive name than "Exponential"?

Can you also add a jax implementation and add it to the list of transforms in the tests so that the Jacobian determinant is automatically checked?

@spinkney (Collaborator, Author)

Thanks for adding! Is the derivation written down somewhere?

I'll write it out and send it.

So far as I can tell, it's very similar to Expanded-Softmax. In particular, if $f$ is the Expanded-Softmax constraining transform, then as $y_i \to \infty$, this becomes $f(y)$, and as $y_i \to -\infty$, it becomes $f(2y)$. Due to metric adaptation, we wouldn't expect $f(2y)$ to perform any differently than $f(y)$, so really it's just in $\sim(-3, 3)$ when it smoothly transitions between the two that we expect a difference.

I was surprised that exp(y) / (1 + exp(-y)) had higher ESS than exp(y) for a positive transform. The difference probably only shows up in the range of the inits, (-2, 2), as you note. I'm just thinking that a more slowly increasing function may perform better.
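
For reference, a quick check of the asymptotics, assuming the elementwise map $e^{y}/(1 + e^{-y})$ is applied before normalization as in Expanded-Softmax:

$$
\frac{e^{y}}{1 + e^{-y}} = \frac{e^{2y}}{1 + e^{y}} \;\sim\;
\begin{cases}
e^{y}, & y \to \infty, \\
e^{2y}, & y \to -\infty,
\end{cases}
$$

which is consistent with the $f(y)$ / $f(2y)$ limits noted above.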

On that note, is there a more descriptive name than "Exponential"?

It's more like a generalized logit function I guess.

Can you also add a jax implementation and add it to the list of transforms in the tests so that the Jacobian determinant is automatically checked?

Ooph. Yeah, I'll have to use an LLM to help me with that, but I can try!

@bob-carpenter (Owner)

exp(y) / (1 + exp(-y)) had higher ESS than exp(y) for a positive transform.

That's interesting. I found that log(1 + exp(y)) did not perform better than exp(y) for the lognormal, gamma, inverse gamma, or Weibull distributions. It did a little better for the half-normal distribution. We should've started with an analysis of positive transforms.
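
A minimal sketch, not taken from the PR, of the three elementwise positive maps under discussion and their log-Jacobian terms, written with JAX so the hand-coded derivatives can be cross-checked against autodiff (all function names here are illustrative):

```python
import jax
import jax.numpy as jnp

# Three candidate elementwise maps from R to (0, inf):
#   exp:           g(y) = exp(y)
#   softplus:      g(y) = log(1 + exp(y))
#   exp * sigmoid: g(y) = exp(y) / (1 + exp(-y))  (the map discussed above)

def g_exp(y):
    return jnp.exp(y)

def g_softplus(y):
    return jax.nn.softplus(y)              # log(1 + exp(y)), computed stably

def g_exp_sigmoid(y):
    return jnp.exp(y) * jax.nn.sigmoid(y)  # exp(y) / (1 + exp(-y))

# Log absolute derivatives (each coordinate's log-Jacobian contribution).
def log_jac_exp(y):
    return y                               # d/dy exp(y) = exp(y)

def log_jac_softplus(y):
    return jax.nn.log_sigmoid(y)           # d/dy softplus(y) = sigmoid(y)

def log_jac_exp_sigmoid(y):
    # d/dy [exp(y) * sigmoid(y)] = exp(y) * sigmoid(y) * (2 - sigmoid(y))
    s = jax.nn.sigmoid(y)
    return y + jnp.log(s) + jnp.log(2.0 - s)

# Spot-check the hand-written log-derivatives against autodiff.
y = jnp.linspace(-3.0, 3.0, 7)
for g, log_jac in [(g_exp, log_jac_exp),
                   (g_softplus, log_jac_softplus),
                   (g_exp_sigmoid, log_jac_exp_sigmoid)]:
    auto = jnp.log(jax.vmap(jax.grad(g))(y))
    assert jnp.allclose(auto, log_jac(y), atol=1e-4)
```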

@sethaxen (Collaborator)

Ooph. Yeah, I'll have to use an LLM to help me with that, but I can try!

If it helps, check out the last 3 commits on https://github.com/bob-carpenter/transforms/tree/ilr2. Following the same approach would work here.
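
As a rough illustration of the kind of automated Jacobian-determinant check being requested, here is a sketch assuming a hypothetical `constrain(y) -> (x, logJ)` signature, where `logJ` is the hand-coded log absolute determinant; the repository's actual test harness and naming may differ:

```python
import jax
import jax.numpy as jnp

def check_log_det_jacobian(constrain, y):
    # Recompute the log |det J| of the constraining map with autodiff and
    # compare it to the value returned by the transform itself.
    x_only = lambda u: constrain(u)[0]
    jac = jax.jacfwd(x_only)(y)
    # Assumes a square Jacobian; simplex transforms that carry a redundant
    # coordinate would compare against the relevant square sub-Jacobian.
    _, expected = jnp.linalg.slogdet(jac)
    _, actual = constrain(y)
    assert jnp.allclose(expected, actual, atol=1e-4)
```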
