There is a logical error in the file eval_trec.py, which evaluates skipthoughts on classifying questions into one of six question types. The program uses a label-to-number mapping for classification; however, these mappings (i.e. dictionaries) are generated completely independently for training and testing. To be precise, the lines below are executed for both training and testing:
d = {}
count = 0
setlabels = set(labels)
for w in setlabels:
    d[w] = count
    count += 1
idxlabels = np.array([d[w] for w in labels])
This means:
If the test set has a different set of labels than the training set, because one or more labels are absent, the program will almost certainly fail; it can even produce 0% classification accuracy (the sketch below makes this mismatch concrete).
If the test set has the full set of labels, the program happens to work only by coincidence: CPython iterates over both sets in the same order, so the training and test dictionaries coincide. However, sets make no ordering guarantee, so it is poor programming practice to rely on this.
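To make the first failure mode concrete, here is a small self-contained sketch (the label names are just the TREC coarse classes, used purely for illustration) that rebuilds the mapping independently on two label lists, exactly as the snippet above does:

def build_mapping(labels):
    # Same construction as in eval_trec.py: enumerate the set of labels.
    d = {}
    count = 0
    for w in set(labels):
        d[w] = count
        count += 1
    return d

train_labels = ['ABBR', 'DESC', 'ENTY', 'HUM', 'LOC', 'NUM']
test_labels  = ['DESC', 'ENTY', 'HUM', 'LOC', 'NUM']   # 'ABBR' never appears

print(build_mapping(train_labels))
print(build_mapping(test_labels))
# Nothing guarantees that, say, 'LOC' gets the same integer in both
# dictionaries, so test predictions are scored against mismatched indices.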
FIX: The dictionary should be learned only once, during training, and re-used at test time. It ought to be treated as a learned parameter, along with the Logistic Regression coefficients.
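A minimal sketch of that fix, using hypothetical helper names of our own (fit_label_map, encode_labels) rather than anything in eval_trec.py:

import numpy as np

def fit_label_map(train_labels):
    # Learn the label -> index dictionary once, from the training labels only.
    # Sorting makes the assignment deterministic instead of set-order dependent.
    return {w: i for i, w in enumerate(sorted(set(train_labels)))}

def encode_labels(labels, label_map):
    # Re-use the learned mapping; never rebuild it from the test labels.
    return np.array([label_map[w] for w in labels])

train_labels = ['ABBR', 'DESC', 'ENTY', 'HUM', 'LOC', 'NUM']
test_labels  = ['DESC', 'ENTY', 'HUM', 'LOC', 'NUM']

label_map = fit_label_map(train_labels)             # learned once, on train
train_y   = encode_labels(train_labels, label_map)
test_y    = encode_labels(test_labels, label_map)   # same dictionary as training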