Hi, I'd like to ask: what is the actual meaning of the 'metaparameter' floats that can be given to the training script? Is there a guideline for manually choosing reasonable default 'metaparameters', say when training a phone-based trigram LM from a single corpus? (This can be handy when the automatic method fails...) Thanks! K.
Hm. They are various discounting-related parameters appearing in a formula,
and right now I can't recall the exact meaning of each one.
I assume the amount of data you have is pretty small. In this case it may
be easier to just write a script to estimate a Kneser-Ney LM. Maybe we
could have someone here help with that, it would be a good exercise for
some of the students. It's a shame that pocolm doesn't handle these
corner cases very well. Are you avoiding SRILM because of license reasons?
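For a sense of what such a script could look like, here is a minimal sketch of an interpolated Kneser-Ney estimator in Python. It is not pocolm's method and has nothing to do with its metaparameters: it assumes a single fixed discount (0.75, an arbitrary but common choice), does no unknown-word handling, and the names `train_kn` and `kn_prob` are made up for illustration.

```python
from collections import Counter

def train_kn(sentences, order=3, discount=0.75):
    """Collect counts for an interpolated Kneser-Ney model (fixed discount, assumed)."""
    counts = {n: Counter() for n in range(1, order + 1)}
    # Highest order: raw n-gram counts, with <s> padding and a </s> end symbol.
    for sent in sentences:
        toks = ["<s>"] * (order - 1) + list(sent) + ["</s>"]
        for i in range(order - 1, len(toks)):
            counts[order][tuple(toks[i - order + 1:i + 1])] += 1
    # Lower orders: continuation counts, i.e. the number of distinct
    # one-word left extensions seen for each shorter n-gram.
    for n in range(order, 1, -1):
        for gram in counts[n]:
            counts[n - 1][gram[1:]] += 1
    # Per-context totals and number of distinct following words.
    totals = {n: Counter() for n in counts}
    types = {n: Counter() for n in counts}
    for n, table in counts.items():
        for gram, c in table.items():
            totals[n][gram[:-1]] += c
            types[n][gram[:-1]] += 1
    return {"order": order, "D": discount,
            "counts": counts, "totals": totals, "types": types}

def kn_prob(model, word, history):
    """P(word | history); words never seen in training get probability 0."""
    order, D = model["order"], model["D"]
    hist = (["<s>"] * (order - 1) + list(history))[-(order - 1):] if order > 1 else []

    def p(n):
        ctx = tuple(hist[len(hist) - (n - 1):])  # last n-1 words of the history
        total = model["totals"][n][ctx]
        if n == 1:
            return model["counts"][1][(word,)] / total
        if total == 0:  # unseen context: use the lower order directly
            return p(n - 1)
        c = model["counts"][n][ctx + (word,)]
        backoff_weight = D * model["types"][n][ctx] / total
        return max(c - D, 0.0) / total + backoff_weight * p(n - 1)

    return p(order)

# Toy phone-like corpus, invented for illustration only.
corpus = [["a", "b", "a"], ["b", "a", "c"], ["a", "c"]]
model = train_kn(corpus, order=3)
print(kn_prob(model, "a", ["a", "b"]))
```

With a phone-level vocabulary every unit is normally seen in training, so the lack of unknown-word handling matters less here than it would for a word-level LM. Writing the result out in ARPA format would be a separate step.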
On Tue, Jul 11, 2017 at 10:17 AM, Karel Vesely ***@***.***> wrote:
Hi, i'd like to ask, what would be the actual meaning of the
'metaparameter' floats which can be given to the training script? Is there
some guideline how to manually choose reasonable default 'metaparameters',
say when training a phone-based trigram LM from a single corpus? (can be
handy when the automatic method fails...) Thanks! K.
Are you talking about the --bypass-metaparameter-optimization option provided by train_lm.py?
If so, I don't think you can choose default values manually. To get appropriate numbers, you have to run train_lm.py once without that option and find the numbers in the log.
I think this is just an option to speed up training when someone else wants to reproduce a model on the same dataset. If the dataset is small, you can ignore this option and just run the training; it won't take much time.