Evaluate function not right #6
Comments
This computes the score incorrectly: if the model predicts a wrong entity before all the correct ones, the predictions are no longer aligned with the gold arguments and the score is 0, as shown in this example: |
Hi @airkid @DorianKodelja, I reached the same conclusion as you. According to the DMCNN paper:
for item, item_ in zip(arguments, arguments_):
The code above in this repo does not match that idea, so I replaced that line with:
ct += len(set(arguments) & set(arguments_))  # count every predicted argument that is also in the golden set
# for item, item_ in zip(arguments, arguments_):
#     if item[2] == item_[2]:
#         ct += 1 |
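To make the difference concrete, here is a minimal sketch comparing the two counting strategies. It assumes each argument is a `(start, end, role)` tuple, which the `item[2]` comparison suggests but the thread does not confirm. The function names and sample data are hypothetical and not taken from the repo. Note that the positional loop compares only the role, while the set intersection compares the whole tuple.

```python
def count_zip(gold, pred):
    """Positional matching, as in the original loop: compares roles pairwise."""
    ct = 0
    for item, item_ in zip(gold, pred):
        if item[2] == item_[2]:
            ct += 1
    return ct


def count_set(gold, pred):
    """Set-based matching: counts predictions that appear anywhere in gold."""
    return len(set(gold) & set(pred))


def f1(ct, n_gold, n_pred):
    """F1 from a match count and the sizes of the gold and predicted lists."""
    p = ct / n_pred if n_pred else 0.0
    r = ct / n_gold if n_gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical data: one spurious prediction precedes two correct ones.
gold = [(0, 1, "Attacker"), (3, 4, "Target")]
pred = [(6, 7, "Place"), (0, 1, "Attacker"), (3, 4, "Target")]

zip_ct = count_zip(gold, pred)  # the spurious first item shifts every pair
set_ct = count_set(gold, pred)  # order no longer matters

print(zip_ct, f1(zip_ct, len(gold), len(pred)))
print(set_ct, f1(set_ct, len(gold), len(pred)))
```

With this data the positional count finds no matches, while the set-based count finds both correct arguments.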
Hi @mikelkl, I believe this is a reasonable way to calculate the F1 score for this task. |
Hi @airkid, I got a slightly higher result, but it is on my own randomly split test set, so I have no idea whether it effectively represents the paper's result. |
Hi @mikelkl, can you try it on the data split updated by the author? |
Hi @airkid, I'm afraid I cannot do that because I do not have the ACE2005 English data. |
Hi @airkid, would you please tell me the result you got? I got only F1 = 0.64 in Trigger Classification. |
Hi, if you have tried their code, would you tell me your reproduced results on trigger detection and argument detection? |
https://github.com/lx865712528/JMEE/blob/494451d5852ba724d273ee6f97602c60a5517446/enet/testing.py#L72
If I add the following line of code just before this line:
assert len(arguments) == len(arguments_)
an AssertionError is raised.
I believe this is because arguments holds the golden arguments while arguments_ holds only the predicted arguments, whose length changes dynamically during training.
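The length mismatch can be reproduced in isolation. This is a small sketch with hypothetical data, again assuming `(start, end, role)` tuples. It also shows why the existing loop does not crash: `zip` silently stops at the shorter list, so unmatched items are simply never compared.

```python
# Hypothetical data: two golden arguments, but the model predicted only one.
arguments = [(0, 1, "Attacker"), (3, 4, "Target")]
arguments_ = [(0, 1, "Attacker")]

# This is the condition the assertion checks; it is False here.
lengths_match = len(arguments) == len(arguments_)

# zip truncates to the shorter input, so the second golden argument is skipped.
pairs = list(zip(arguments, arguments_))

print(lengths_match, len(pairs))
```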