Acquiring more labelled training images #81

davidwagner · 2020-12-18T09:08:10Z

Currently the 0.0.4 dataset provides 125 training images of each class. If we want to train on more images, are there any resources to make it easier to acquire more labelled images that are valid and unambiguous, or do we need to re-implement the tasker evaluation ourselves?

If we use the IDs in bird-or-bicycle/bird_or_bicycle/metadata/0.0.4/, it looks like we can get close to 1000 more images of birds that have been verified by taskers, but no more images of bicycles are available for training from there. Anything else I am missing?

carlini · 2020-12-18T09:17:40Z

I don't think we've collected any more high-quality labeled examples for the train split. The extra dataset has something like 27k more images, which we've found helpful for training a classifier. I've been able to train a single linear layer on top of ImageNet features using the extra dataset to get ~99% test accuracy. But as you say, they're not filtered correctly.
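
For readers unfamiliar with the "single linear layer on top of ImageNet features" setup, here is a minimal sketch of the idea. It is not the code used in this thread. It assumes the features have already been extracted by some pretrained ImageNet network and stored as a NumPy array; the function names (`train_linear_probe`, `predict`) are hypothetical, and random Gaussian clusters stand in for real features so the block runs on its own.

```python
import numpy as np

def train_linear_probe(features, labels, lr=0.1, epochs=200):
    """Fit one linear layer (logistic regression) on fixed features."""
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        logits = features @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
        grad = probs - labels                   # d(loss)/d(logits) for cross-entropy
        w -= lr * (features.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def predict(features, w, b):
    """Return 1 where the linear score is positive, else 0."""
    return (features @ w + b > 0).astype(int)

# ASSUMPTION: synthetic stand-in for precomputed ImageNet features.
# In practice `features` would come from a frozen pretrained network.
rng = np.random.default_rng(0)
d = 16
class0 = rng.normal(-1.0, 1.0, size=(200, d))   # e.g. "bird" features
class1 = rng.normal(1.0, 1.0, size=(200, d))    # e.g. "bicycle" features
features = np.vstack([class0, class1])
labels = np.concatenate([np.zeros(200), np.ones(200)])

w, b = train_linear_probe(features, labels)
accuracy = (predict(features, w, b) == labels).mean()
print(f"training accuracy on synthetic features: {accuracy:.3f}")
```

Only the weights `w` and bias `b` are trained; the feature extractor stays frozen, which is what keeps this approach cheap even with tens of thousands of extra images.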

davidwagner · 2020-12-18T21:35:02Z

Thank you. It seems that getting more images of bicycles will take the most work. In one random sample of bicycles from extras/, 1/20 (5%) looked to me like they met the requirements; in a second random sample, 4/34 (12%) did. However, tasker_labels_0.0.4.csv shows that about 289/1322 (22%) met the requirements. I'm not sure why those three estimates vary so much (perhaps you did some filtering before feeding images to taskers, or perhaps I just got unlucky in my random samples). So if we filter extras, I'm guessing we might obtain roughly 10,000 good training images of birds and between 800 and 3,000 good training images of bicycles, but we would have to do the filtering ourselves. Thanks for the information.
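
One way to check the "just got unlucky" explanation is to put a confidence interval around each of the three proportions. The sketch below uses the Wilson score interval at roughly 95% (z = 1.96). It assumes each sample is an independent random draw from the same pool of images, which is exactly the assumption that any pre-filtering before the tasker stage would break. The helper name `wilson_interval` is illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Counts taken from the comment above: (images meeting requirements, sample size)
samples = {
    "first random sample": (1, 20),
    "second random sample": (4, 34),
    "tasker_labels_0.0.4.csv": (289, 1322),
}

for name, (k, n) in samples.items():
    low, high = wilson_interval(k, n)
    print(f"{name}: {k}/{n} = {k / n:.1%}, interval {low:.1%} to {high:.1%}")
```

With samples of 20 and 34 images, both intervals are wide enough to include the 22% rate from the tasker labels, so sampling noise alone could account for the spread. That does not rule out pre-filtering; it only means these samples are too small to tell the two explanations apart.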
