
xnet #790

Closed · wants to merge 2 commits into from

Conversation

@brucefan1983 (Owner) commented Nov 19, 2024

usage in nep.in:

use_xnet # change to use the Cauchy activation function instead of tanh

The original activation function in the paper is $f(x) = \frac{\lambda_1 x + \lambda_2}{x^2 + d^2}$, where $\lambda_1$, $\lambda_2$, and $d$ are trainable parameters.

I modified it to $f(x) = \frac{\lambda_1 x + \lambda_2}{x^2 + d^2 + 0.01}$ to avoid a singularity when both $x$ and $d$ are close to zero. Is this necessary?
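For concreteness, here is a minimal standalone sketch of the modified activation and its derivative with respect to the input (not the actual NEP/GPUMD code path; the struct and parameter names are made up for illustration):

```cpp
// Sketch of the Cauchy activation behind use_xnet (illustration only):
// f(x) = (lambda1 * x + lambda2) / (x^2 + d^2 + 0.01),
// where lambda1, lambda2, and d are trainable per-neuron parameters.
#include <cstdio>

struct CauchyActivation {
  double lambda1;  // numerator slope (trainable)
  double lambda2;  // numerator offset (trainable)
  double d;        // width parameter (trainable)

  // forward pass, with the +0.01 term guarding against a vanishing denominator
  double value(double x) const
  {
    const double denom = x * x + d * d + 0.01;
    return (lambda1 * x + lambda2) / denom;
  }

  // derivative with respect to x, as needed for the backward pass / forces
  double derivative(double x) const
  {
    const double denom = x * x + d * d + 0.01;
    return (lambda1 * (d * d + 0.01 - x * x) - 2.0 * lambda2 * x) / (denom * denom);
  }
};

int main()
{
  const CauchyActivation act{1.0, 0.5, 0.2};
  const double xs[] = {-2.0, -1.0, 0.0, 1.0, 2.0};
  for (const double x : xs) {
    std::printf("x = %5.2f  f(x) = %8.5f  f'(x) = %8.5f\n", x, act.value(x), act.derivative(x));
  }
  return 0;
}
```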

Ref: Cauchy activation function and XNet

Does not seem to be useful, will give up soon...

@BBBuZHIDAO (Contributor)

It's great that there is a new activation function and that you implemented it so fast 🤩. But I have a small question about the modification of the denominator: it means the effective $d$ can never be smaller than $0.1$, so this setting loses the functions whose peaks lie close to $x = 0$.
I tried plotting the function $\frac{x}{x^2+d^2}$ (i.e., $\lambda_1 = 1$, $\lambda_2 = 0$) for different $d$. Here is the result:
[Figure: $\frac{x}{x^2+d^2}$ plotted for several values of $d$]
Although $\lambda_1$ can rescale the function, the information about the inflection points may be lost.
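To put a number on it (a quick back-of-the-envelope check, taking $\lambda_2 = 0$): the peak of $\frac{x}{x^2+d^2}$ sits at $x = d$ with height $\frac{1}{2d}$, and with the $+0.01$ floor the effective width is at least $\sqrt{0.01} = 0.1$, so the peak height is capped at $\frac{1}{2 \times 0.1} = 5$ and sharper peaks near $x = 0$ cannot be reached.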

@brucefan1983 (Owner, Author)

So far the feedback is negative :(
