Replies: 2 comments
-
Hi @DavidZyy, this is simply an empirical correction; there is no science behind it (and it was amusing to observe people trying to make scientific sense out of it). From the pre-imatrix days we have learned that it is better to assign higher weights (importance) to model weights with larger magnitudes in a weighted RMSE minimization. As there is no precise science behind that, it was just a matter of experimentation to determine what this higher importance should look like.
Why the need for correcting the Hessian in the first place?
-
Thanks for taking the time to answer this question and share this information. I learned a lot from your answers.
-
Hi @ikawrakow, your work on quantization is amazing and I really admire it. Recently I have been reading the code and have some questions.
For example, in the function quantize_row_q4_0_impl and in other places, weight[j] is computed as:
weight[j] = qw[j] * sqrtf(sigma2 + xb[j]*xb[j]);
I have already seen some discussions here, but I still don't quite understand. Can you give me some guidance? Why not use the following directly?