convert conll dataset format #78

Open
ghost opened this issue Oct 31, 2019 · 2 comments

Comments

ghost commented Oct 31, 2019

Hi,
I would really appreciate your help with this question. I would like to convert the CoNLL dataset to an NLI dataset format: take one sentence, replace the pronoun with each of the two candidate antecedents, and label the sentence with the correct antecedent as entailment and the one with the incorrect antecedent as contradiction. I have three questions:

  • Which information in the CoNLL dataset does your code use? Do you also use the cluster information and speaker IDs? I am confused by all of this extra information and not sure whether it is part of your method.
  • Could you tell me how I can convert the CoNLL dataset to the NLI format? Is there any code for this?
  • If one trains on the CoNLL dataset in NLI form with a BERT model, do you think the performance could suffer? I am wondering which extra information your code uses and whether it has an impact.
    Thanks.
@henryhust

I can answer the first question: it uses clusters, speakers, and genres as features, but the speakers and genres are not necessary.

@henryhust

The second question may be solved by https://zhuanlan.zhihu.com/p/121786025
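
For readers who want a starting point, the conversion described in the issue (substitute the pronoun with each candidate antecedent, label the correct substitution as entailment and the incorrect one as contradiction) might look roughly like the sketch below. This is only an illustration under assumptions: it is not this repository's code and not the method from the linked article. The helper names `substitute` and `to_nli_pairs`, the inclusive `(start, end)` token spans, and the example sentence are all hypothetical. In practice the correct antecedent would come from the pronoun's coreference cluster, and the incorrect one from a mention outside that cluster.

```python
from typing import Dict, List, Tuple

# Assumption: spans are inclusive token indices (start, end).
Span = Tuple[int, int]


def substitute(tokens: List[str], target: Span, replacement: List[str]) -> str:
    """Return the sentence with the tokens in `target` replaced by `replacement`."""
    start, end = target
    return " ".join(tokens[:start] + replacement + tokens[end + 1:])


def to_nli_pairs(
    tokens: List[str], pronoun: Span, correct: Span, incorrect: Span
) -> List[Dict[str, str]]:
    """Build one entailment and one contradiction example for a pronoun.

    The premise is the original sentence; each hypothesis is the sentence
    with the pronoun replaced by one candidate antecedent.
    """
    premise = " ".join(tokens)
    pairs = []
    for span, label in ((correct, "entailment"), (incorrect, "contradiction")):
        mention = tokens[span[0]: span[1] + 1]
        pairs.append(
            {
                "premise": premise,
                "hypothesis": substitute(tokens, pronoun, mention),
                "label": label,
            }
        )
    return pairs


# Made-up example sentence, not taken from the CoNLL data.
tokens = "The trophy does not fit in the suitcase because it is too big".split()
pairs = to_nli_pairs(tokens, pronoun=(9, 9), correct=(0, 1), incorrect=(6, 7))
for pair in pairs:
    print(pair["label"], "|", pair["hypothesis"])
```

Note that a sketch like this keeps only the token text and the antecedent choice; the cluster structure is used just to pick the candidates, and speaker and genre information is dropped, which matches the comment above that those two features are not necessary.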
