
Hyperparameter optimisation #3

Open
GMCobraz opened this issue May 15, 2021 · 4 comments
Labels: question (Further information is requested)

Comments

@GMCobraz

Dear caponetto and others,

Good day. I am working on hyperparameter optimisation and have some questions.
What does g refer to in prior.py?

scatter_matrix = (data_matrix_cov / g).T

Thank you very much.

@caponetto added the "question" label on May 15, 2021
@caponetto (Owner)

Hi @GMCobraz,
The way the prior is created here comes from the original MATLAB code, where g scales the empirical covariance to form the prior scale matrix S.
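In code, the idea is roughly the following. This is a simplified sketch, not the exact prior.py: it assumes a Gaussian model whose prior scale matrix S is built from the empirical covariance of the data.

```python
import numpy as np

def prior_scatter_matrix(data: np.ndarray, g: float) -> np.ndarray:
    # Empirical covariance of the data (rows are observations).
    data_matrix_cov = np.cov(data, rowvar=False)
    # Dividing by g rescales the empirical covariance; the result
    # plays the role of the prior scale matrix S.
    return (data_matrix_cov / g).T

# Example: 18 observations (as in the thread); 2 features chosen arbitrarily.
X = np.random.default_rng(0).normal(size=(18, 2))
S = prior_scatter_matrix(X, g=0.2)
```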

@GMCobraz (Author)

Hi @caponetto,
Thanks for the reply.
Based on my understanding, cov_matrix = g * scatter_matrix, where g is 1/(N-1).
I do not understand why g differs between BHC and BRT in the example, since N = 18 in both cases.
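As a quick sanity check of that relationship (my own sketch, not the repository's code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(18, 3))       # N = 18 samples, 3 features (arbitrary)
centered = X - X.mean(axis=0)
scatter = centered.T @ centered    # unnormalised scatter matrix
cov = scatter / (X.shape[0] - 1)   # g = 1/(N-1) as the scale factor
assert np.allclose(cov, np.cov(X, rowvar=False))
```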

May I know why, in prior.py, degrees_of_freedom = data.shape[1] + 1?

Can the scale factor be understood as a learning rate in this case?

Thank you very much.

@GMCobraz (Author) commented May 16, 2021

Hi @caponetto,

Do you have any ideas on how to implement the hyperparameter optimisation?
I am trying to apply this clustering method in my master's thesis, and I am stuck halfway through implementing Brent's method to optimise gamma for the rose tree.
I would appreciate it very much if you could give me some guidelines for the optimisation; a minimal sketch of the direction I am taking is below.
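Something like this is what I have in mind. The log-likelihood below is a toy surrogate so the snippet runs on its own; in the real version it would rebuild the rose tree and return its log marginal likelihood for a given gamma (that part is not in this repository's API):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def log_marginal_likelihood(gamma: float) -> float:
    # Toy unimodal surrogate standing in for the rose-tree model's
    # log marginal likelihood as a function of gamma.
    return -(np.log(gamma) - 0.3) ** 2

def objective(log_gamma: float) -> float:
    # Work on a log scale so gamma stays positive during the search.
    return -log_marginal_likelihood(np.exp(log_gamma))

# Brent's method via scipy; the bracket values are initial guesses.
result = minimize_scalar(objective, bracket=(-3.0, 0.0), method="brent")
print("best gamma:", np.exp(result.x))
```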

Thank you very much.

@caponetto (Owner)

Hi @GMCobraz,

Unfortunately, I don't have details about the behavior of the hyperparameters.
Maybe this is a good subject to explore as part of your research?
If you do so, please share the results. :)

Also, notice that the example was taken from the original paper.
So I suppose that the hyperparameters are optimized for BHC.
For BRT, on the other hand, they are not optimized.
My goal was to use the same example for both strategies and provide an intuition.

Regarding the hyperparameter optimization itself, I suggest you take a look at Random Search [1].
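
As a rough illustration of random search (the hyperparameter names and the scoring function here are made up for the sketch and are not tied to this repository's API):

```python
import numpy as np

rng = np.random.default_rng(42)

def score(alpha: float, gamma: float) -> float:
    # Placeholder: in practice, run the clustering with these
    # hyperparameters and return a quality measure, e.g. the
    # model's log marginal likelihood on the data.
    return -(np.log(alpha) ** 2 + (np.log(gamma) - 0.5) ** 2)

best = None
for _ in range(100):
    # Sample each hyperparameter on a log scale (a common choice).
    alpha = 10 ** rng.uniform(-3, 1)
    gamma = 10 ** rng.uniform(-3, 1)
    s = score(alpha, gamma)
    if best is None or s > best[0]:
        best = (s, alpha, gamma)

print("best score %.4f with alpha=%.4g, gamma=%.4g" % best)
```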

[1] J. Bergstra and Y. Bengio, "Random Search for Hyper-Parameter Optimization", JMLR 13 (2012). https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
