
DAL query methods do not choose from the whole unlabelled set #1

erik-werner opened this issue Nov 8, 2018 · 9 comments

@erik-werner

unlabeled_idx = np.random.choice(unlabeled_idx, np.min([labeled_idx.shape[0]*10, unlabeled_idx.size]), replace=False)

I suppose the reason for this line is that you want a somewhat balanced training set for the discriminator. But when you do the selection, I assume that you want to use the whole unlabeled set. That's how it's described in the paper.
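For concreteness, something like the following is what I had in mind - a minimal sketch, assuming a hypothetical train_discriminator helper (the names here are illustrative, not the actual functions in this repo): subsample only when fitting the discriminator, then rank the entire unlabeled pool when choosing queries.

```python
import numpy as np

def select_queries(train_discriminator, X, labeled_idx, unlabeled_idx, query_size):
    # Subsample the unlabeled pool only to get a manageable training set
    # for the binary labeled-vs-unlabeled discriminator.
    sub_idx = np.random.choice(
        unlabeled_idx,
        np.min([labeled_idx.shape[0] * 10, unlabeled_idx.size]),
        replace=False)
    model = train_discriminator(X[labeled_idx], X[sub_idx])

    # ...but score the WHOLE unlabeled set when selecting points to label.
    # Assumes the model outputs P(example is "unlabeled") as a single column.
    p_unlabeled = model.predict(X[unlabeled_idx]).ravel()
    return unlabeled_idx[np.argsort(-p_unlabeled)[:query_size]]
```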

I looked into this because I found it hard to believe Fig. 3c of your paper. That the two selection algorithms should be perfectly independent seems inconceivable to me. When I tried to reproduce your results (after fixing the issue), I instead found a small anti-correlation between DAL and entropy selection. I have not tested this thoroughly, so it could be a mistake on my part. But if it holds up, I think this is even more interesting than no correlation.

@dsgissin
Owner

Hey Erik, I'm sorry but for some reason I only saw your issue now... Thanks for looking into this.
Figure 3 was produced using different code which I didn't upload here, and the problem you mentioned isn't present there. I tried re-running these experiments following your comments and was unable to reproduce a result which has an anti-correlation (the results remained the same as in the figure).

I'd like to think that my experiments were correct, but if you happen to still have the code you ran to reproduce this experiment, I would be very happy to look into it.

Thanks

@d12306

d12306 commented Sep 17, 2019

@erik-werner, @dsgissin, hi, thank you for raising this question. But I am wondering how you fixed the issue. Training a discriminator on such an unbalanced dataset will lead to significantly biased predictions: the discriminator is prone to assigning all samples to one class (the class with far more samples) in order to drive the loss down. What do you think about this issue?
Hopefully you can help me with it.

Thanks,

@dsgissin
Owner

What I did originally was to weight the classes according to the number of examples in the dataset (so the under-represented labeled class gets a proportionally larger weight), and use a large batch size to make sure that every batch has examples from the class that is under-represented.
I also sub-sample the unlabeled set so that the under-representation of the labeled set wouldn't be too large.

These are relatively simple ways of dealing with the problem of unbalanced classes, but they were enough to reach the results in the paper.
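Roughly, the weighting looks something like this - a minimal Keras-style sketch under the assumptions above (the architecture and names are illustrative, not the exact code in this repo):

```python
import numpy as np
from tensorflow.keras import layers, models

def train_binary_discriminator(X_labeled, X_unlabeled_sub, batch_size=1024, epochs=100):
    # Binary task: labeled set -> class 1, (sub-sampled) unlabeled set -> class 0.
    X = np.concatenate([X_labeled, X_unlabeled_sub])
    y = np.concatenate([np.ones(len(X_labeled)), np.zeros(len(X_unlabeled_sub))])

    model = models.Sequential([
        layers.Dense(256, activation='relu', input_shape=(X.shape[1],)),
        layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy')

    # Up-weight the small "labeled" class so it isn't drowned out by the unlabeled class.
    class_weight = {0: 1.0, 1: len(X_unlabeled_sub) / max(len(X_labeled), 1)}

    # Large batches make it very likely that every batch contains labeled examples.
    model.fit(X, y, batch_size=batch_size, epochs=epochs,
              class_weight=class_weight, verbose=0)
    return model
```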

@d12306

d12306 commented Sep 17, 2019

Hi @dsgissin, thanks, but what I mean is about the discriminator. The number of images in the labeled set is too small compared to that of the unlabeled set - say 1000 labeled and 40000 unlabeled - so the discriminator is biased. Also, in the code you randomly sample 10 times as many unlabeled images as labeled images for training the discriminator. So how do you weight the classes? We do not know the labels of the unlabeled images, after all. Actually, I am not quite sure what "under-representation" means here.
Maybe I am asking a naive question; I am sorry if I misunderstood you.

Thanks,

@dsgissin
Owner

No problem, I'll try to explain more clearly.

What I do in the code (which is only one option that happened to work OK and can probably be improved) is to subsample the unlabeled set so that it only has ten times as many examples as the labeled set. This is still very unbalanced, but it is manageable.
Then, I weight the labeled examples ten times as much as the unlabeled examples (the loss is ten times as large for them).
Finally, I use large batch sizes to ensure that the labeled examples appear in all of the batches with high probability, so I don't get optimization issues.

Some clarifications:

  1. The discriminator doesn't care what the actual labels of the unlabeled set are - it simply tries to discriminate between the labeled set (labeled as "1", for example) and the unlabeled set (labeled as "0", for example). So in this case, the weight of the "1" class is ten times that of the "0" class. The upside of this algorithm is in fact that you are agnostic to the ML task you are running active learning for - you don't care about the actual labels of the data, only that the labeled set will resemble the unlabeled set as much as possible.
  2. Even with all the things I do to balance out the binary classification task, it's true that the discriminator is probably biased towards the unlabeled set. However, that shouldn't hurt us too badly, since we are assuming it will still be most confident for examples which are most different from the labeled set. We only care about taking the top-K examples from the current unlabeled batch for labeling, and since we use early stopping in the training, we should expect the discriminator to be most confident about the examples which are easiest to differentiate from the labeled examples - these are the examples most different from the current labeled set, which diversify it the most and hopefully make the labeled set similar to the unlabeled set (see the small example below).
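To make the top-K rule concrete, here is a tiny toy example with made-up numbers (not output from the actual code): the discriminator gives each unlabeled example a probability of belonging to the labeled set, and we query the K examples it is most confident are not in it.

```python
import numpy as np

# Hypothetical discriminator outputs P("labeled") for five unlabeled examples.
p_labeled = np.array([0.90, 0.05, 0.40, 0.02, 0.70])

K = 2
# Query the K examples that look least like the labeled set
# (equivalently, the highest P("unlabeled") = 1 - P("labeled")).
query_idx = np.argsort(p_labeled)[:K]
print(query_idx)  # -> [3 1]
```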

I hope that's more clear,
Daniel

@d12306

d12306 commented Sep 17, 2019

Thanks, I got what you mean, @dsgissin. One more quick question: it seems like the training of the discriminator will gradually become adversarial-training-flavored. For example, there are some dogs in the labeled set, and there are more dogs in the unlabeled set as the labeled set grows. The representations are intuitively similar, but we still try to distinguish between them. How do you think that will affect the model?

@dsgissin
Owner

I'm not quite sure what you mean.
In real data we don't expect the representations to be really similar for the entire unlabeled set - there should always be examples that are different enough to be singled out by the discriminator. If the representations really are similar, then you're in good shape, because the labeled set is similar to the unlabeled set (which is what we wanted).

@d12306

d12306 commented Sep 18, 2019

Thanks for your clarification.

@d12306

d12306 commented Jan 9, 2021

@dsgissin, hi, sorry for bothering you again, but how do you select the hyperparameters, such as the number of iterations, the number of starting samples, etc., in the experiments? It seems like the validation dataset is only used while training the model, to decide which checkpoint to use, but the other hyperparameters are chosen arbitrarily?

Would it be better to use the same validation set for tuning these hyperparameters?

Thanks,
