About interpolation approach #98

Open
pehonnet opened this issue May 13, 2019 · 0 comments

@pehonnet

Hi,

I understand from the motivation doc that the idea is to interpolate at the n-gram count level, with the weights tuned on a dev set. So, to build a new LM from, for example, three sources (trainA, trainB, trainC), you would collect the counts for each source, find the optimal weights on the dev set, and then build the LM (lm_1). What would you suggest as the best way to build a new LM when a new source becomes available, so that you have the initial three sources plus a new one (trainA, trainB, trainC, and trainD)? Should we simply reuse the counts from the previous training, add the counts from the new set, and interpolate again based on the (probably new) dev set?
If I understood correctly, is this the only way to do the interpolation with this tool? In other words, we can't take the previously built LM (lm_1) and somehow interpolate it with the new data?
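
To make the question concrete, here is a toy sketch of the workflow I have in mind. It is not this tool's API or its actual estimation method: it uses unigram counts, add-one smoothing, and a coarse grid search purely as simplifying assumptions, and all function names are hypothetical. The point is only that the counts for trainA, trainB, and trainC are reused as-is, the counts for trainD are computed once, and the weights are re-tuned over all four sources on the dev set.

```python
from collections import Counter
from itertools import product
import math


def unigram_counts(tokens):
    """Count tokens for one source (stand-in for real n-gram counts)."""
    return Counter(tokens)


def weighted_counts(count_tables, weights):
    """Merge per-source counts, scaling each source by its weight."""
    merged = Counter()
    for table, weight in zip(count_tables, weights):
        for word, count in table.items():
            merged[word] += weight * count
    return merged


def dev_log_likelihood(merged, dev_tokens, vocab_size, alpha=1.0):
    """Log-likelihood of the dev set under an add-alpha smoothed unigram model."""
    total = sum(merged.values())
    denom = total + alpha * vocab_size
    return sum(math.log((merged.get(w, 0.0) + alpha) / denom) for w in dev_tokens)


def best_weights(count_tables, dev_tokens, grid=(0.1, 0.5, 1.0)):
    """Coarse grid search for the per-source weights that maximize dev likelihood."""
    vocab = set(dev_tokens)
    for table in count_tables:
        vocab.update(table)

    def score(ws):
        return dev_log_likelihood(
            weighted_counts(count_tables, ws), dev_tokens, len(vocab)
        )

    best = max(product(grid, repeat=len(count_tables)), key=score)
    return best, score(best)


# Counts from the earlier run (trainA, trainB, trainC) are kept and reused.
train_a = "the cat sat on the mat".split()
train_b = "the dog chased the cat".split()
train_c = "a bird sang in the tree".split()
cached_counts = [unigram_counts(t) for t in (train_a, train_b, train_c)]

# Only the new source needs counting; weights are re-tuned over all four.
train_d = "stock prices rose sharply in the market".split()
dev = "prices rose in the market".split()
counts_d = unigram_counts(train_d)
weights, dev_ll = best_weights(cached_counts + [counts_d], dev)
print(weights, dev_ll)
```

In this sketch nothing from lm_1 itself is reused, only the stored counts, which is exactly the part I would like to confirm.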

Thanks

PS: there is a typo in motivation.md (search "estmiate")
