This repository has been archived by the owner on Mar 6, 2019. It is now read-only.

CRF Output #6

Open
prakhar21 opened this issue Jul 15, 2016 · 2 comments

@prakhar21

Hi, I am not able to understand what these tab-separated fields mean.

1            I1      L8      NoCAP  NoPAREN  B-QTY
cup          I2      L8      NoCAP  NoPAREN  B-UNIT
white        I3      L8      NoCAP  NoPAREN  B-NAME
wine         I4      L8      NoCAP  NoPAREN  I-NAME

Please, help me out.

Thanks

@ericagreene
Contributor

@prakhar21 Each row is a token (word) followed by its associated features. The associated code is here. The rightmost column is the tag that we're trying to predict.

Does that answer your question?
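
To make the column layout described above concrete, here is a minimal Python sketch that splits each row into token, features, and tag, then merges `B-`/`I-` tags into labelled spans. The names `TokenRow`, `parse_row`, and `group_by_tag` are hypothetical helpers written for this illustration and are not part of the repository. Judging by their names, the features look like a token index (`I1`), a length bucket (`L8`), a capitalization flag (`NoCAP`), and a parenthesis flag (`NoPAREN`), but that reading is an assumption the thread does not confirm.

```python
from typing import List, NamedTuple, Tuple


class TokenRow(NamedTuple):
    """One output row: the token, its feature columns, and the tag."""
    token: str
    features: List[str]
    tag: str


def parse_row(line: str) -> TokenRow:
    # Columns are whitespace/tab separated: first is the token,
    # last is the tag, everything in between is a feature.
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least a token and a tag: {line!r}")
    return TokenRow(fields[0], fields[1:-1], fields[-1])


def group_by_tag(rows: List[TokenRow]) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged tokens into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    for row in rows:
        prefix, sep, label = row.tag.partition("-")
        if not sep:  # tag without a B-/I- prefix
            prefix, label = "", row.tag
        if prefix == "I" and spans and spans[-1][0] == label:
            # Continuation of the previous span with the same label.
            spans[-1] = (label, spans[-1][1] + " " + row.token)
        else:
            spans.append((label, row.token))
    return spans


# The four rows quoted in the question above.
SAMPLE = """\
1            I1      L8      NoCAP  NoPAREN  B-QTY
cup          I2      L8      NoCAP  NoPAREN  B-UNIT
white        I3      L8      NoCAP  NoPAREN  B-NAME
wine         I4      L8      NoCAP  NoPAREN  I-NAME
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
spans = group_by_tag(rows)
print(spans)
```

For the sample rows this groups `white` and `wine` into a single `NAME` span, which is how the `B-` (begin) and `I-` (inside) prefixes are conventionally read.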

@prakhar21
Author

@ericagreene Thanks, that answers my question. There is one more thing I wanted to clarify.
When I train on all 180k examples and then validate on my own dataset, why are the predictions less accurate than those from the model trained on 20k examples? This seems to go against model training principles. My understanding is that more data is always good for training. Please share your thoughts on this.

Thanks
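
One way to put numbers on a comparison like the one above is to score both models on the same validation file. The sketch below is only an illustration: it assumes each model's output has the gold tag and the predicted tag as the last two columns of every row, the function name `token_accuracy` is hypothetical, and the two tiny outputs are made-up placeholder data, not results from the 20k or 180k models.

```python
from typing import Iterable


def token_accuracy(lines: Iterable[str]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag.

    Assumes (hypothetically) that each row ends with: gold_tag predicted_tag.
    Blank lines and lines starting with '#' are skipped.
    """
    correct = 0
    total = 0
    for line in lines:
        if line.lstrip().startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:  # need a token plus gold and predicted tags
            continue
        gold, predicted = fields[-2], fields[-1]
        total += 1
        if gold == predicted:
            correct += 1
    return correct / total if total else 0.0


# Made-up outputs for illustration only (NOT real model results).
MODEL_A_OUTPUT = """\
1      I1  L8  NoCAP  NoPAREN  B-QTY   B-QTY
cup    I2  L8  NoCAP  NoPAREN  B-UNIT  B-UNIT
white  I3  L8  NoCAP  NoPAREN  B-NAME  B-NAME
wine   I4  L8  NoCAP  NoPAREN  I-NAME  I-NAME
"""

MODEL_B_OUTPUT = """\
1      I1  L8  NoCAP  NoPAREN  B-QTY   B-QTY
cup    I2  L8  NoCAP  NoPAREN  B-UNIT  B-UNIT
white  I3  L8  NoCAP  NoPAREN  B-NAME  B-NAME
wine   I4  L8  NoCAP  NoPAREN  I-NAME  B-NAME
"""

acc_a = token_accuracy(MODEL_A_OUTPUT.splitlines())
acc_b = token_accuracy(MODEL_B_OUTPUT.splitlines())
print(f"model A: {acc_a:.2f}, model B: {acc_b:.2f}")
```

Scoring both models on an identical file, with identical tokenization and tag names, helps rule out differences that come from the evaluation setup rather than from the amount of training data.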
