Performance on Ontonotes v5.0 #15

Open
hary00078 opened this issue Aug 1, 2020 · 0 comments
hary00078 commented Aug 1, 2020

Hi,

First of all, thanks for your last reply.
Following your instructions, I ran the model on OntoNotes v5.0.
Although the official F1-score is 88.16%, I consistently get about 85%.
When I run your model on UD, I get very good performance, so I suspect I have made a mistake somewhere.

Here is my command:
python main.py --learning_rate 0.01 --lr_decay 0.035 --dropout 0.5 --hidden_dim 400 --lstm_layer 4 --momentum 0.9 --whether_clip_grad True --clip_grad 5.0 --train_dir 'data/onto.train.txt' --dev_dir 'data/onto.development.txt' --test_dir 'data/onto.test.txt' --model_dir 'model/' --word_emb_dir 'glove.6B.100d.txt'
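
For reference, I read these flags as plain SGD with a per-epoch learning-rate decay. Below is a minimal sketch of the schedule I assume the code applies (based on the common NCRF++-style trainer; the function name and signature are my guess, not necessarily what this repo uses):

```python
# Minimal sketch of the per-epoch LR schedule I assume is implied by
# --learning_rate 0.01 --lr_decay 0.035 (assumption: NCRF++-style decay;
# please check main.py for the actual implementation).
def lr_decay(optimizer, epoch, decay_rate=0.035, init_lr=0.01):
    lr = init_lr / (1 + decay_rate * epoch)  # epoch 0 -> 0.0100, epoch 99 -> ~0.0022
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
    return optimizer
```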

Here is the data summary:
DATA SUMMARY START:
I/O:
Tag scheme: BIO
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: False
Word alphabet size: 69812
Char alphabet size: 119
Label alphabet size: 38
Word embedding dir: glove.6B.100d.txt
Char embedding dir: None
Word embedding size: 100
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: data/onto.train.txt
Dev file directory: data/onto.development.txt
Test file directory: data/onto.test.txt
Raw file directory: None
Dset file directory: None
Model file directory: model/
Loadmodel directory: None
Decode file directory: None
Train instance number: 115812
Dev instance number: 15679
Test instance number: 12217
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: False
Model word extractor: LSTM
Model use_char: True
Model char extractor: LSTM
Model char_hidden_dim: 50
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.035
Hyper HP_clip: 5.0
Hyper momentum: 0.9
Hyper l2: 1e-08
Hyper hidden_dim: 400
Hyper dropout: 0.5
Hyper lstm_layer: 4
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
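
For what it's worth, my understanding of the network this summary describes is roughly the following (a minimal PyTorch sketch, assuming a standard char-LSTM + word-BiLSTM tagger; the class and variable names are illustrative, not from your code):

```python
import torch
import torch.nn as nn

class CharWordTagger(nn.Module):
    """Illustrative sketch matching the summary above: 100d word emb,
    30d char emb, char-LSTM (hidden 50), 4-layer word BiLSTM (hidden 400),
    dropout 0.5, no CRF (plain softmax classifier)."""
    def __init__(self, word_vocab=69812, char_vocab=119, num_labels=38):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, 100)  # Word embedding size: 100
        self.char_emb = nn.Embedding(char_vocab, 30)   # Char embedding size: 30
        # char_hidden_dim 50 -> 25 per direction
        self.char_lstm = nn.LSTM(30, 25, bidirectional=True, batch_first=True)
        # hidden_dim 400, lstm_layer 4, bilstm True -> 200 per direction
        self.word_lstm = nn.LSTM(100 + 50, 200, num_layers=4,
                                 bidirectional=True, batch_first=True)
        self.dropout = nn.Dropout(0.5)                 # Hyper dropout: 0.5
        self.classifier = nn.Linear(400, num_labels)   # use_crf False -> softmax

    def forward(self, word_ids, char_ids):
        # word_ids: [B, T]; char_ids: [B, T, L] (L = max word length)
        B, T, L = char_ids.shape
        chars = self.char_emb(char_ids.view(B * T, L))      # [B*T, L, 30]
        _, (h_n, _) = self.char_lstm(chars)                 # h_n: [2, B*T, 25]
        char_feat = h_n.transpose(0, 1).reshape(B, T, 50)   # [B, T, 50]
        x = torch.cat([self.word_emb(word_ids), char_feat], dim=-1)
        h, _ = self.word_lstm(self.dropout(x))              # [B, T, 400]
        return self.classifier(self.dropout(h))             # [B, T, num_labels]
```

Please correct me if the actual model differs from this.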

I believe I have set the hyperparameters exactly as written in your paper.
Is there any mistake on my side?

Thanks for reading.
