DAL query methods do not choose from the whole unlabelled set #1
Hey Erik, I'm sorry, but for some reason I only saw your issue just now... Thanks for looking into this. I'd like to think that my experiments were correct, but if you happen to still have the code you ran to reproduce this experiment, I would be very happy to look into it. Thanks
@erik-werner, @dsgissin, hi, thank you for posting this question. I am wondering how you fixed the issue. Training a discriminator on such an unbalanced dataset will lead to significantly biased predictions: the discriminator is prone to predicting all samples as the class with many more examples in order to decrease the loss. What do you think of this issue? Thanks
What I did originally was to weight the classes inversely to their number of examples in the dataset, and to use a large batch size to make sure that every batch has examples from the under-represented class. These are relatively simple ways of dealing with the problem of unbalanced classes, but they were enough to reach the results in the paper.
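A minimal sketch of that kind of weighting, assuming a Keras-style binary discriminator where class 1 is "labeled" and class 0 is "unlabeled" (the numbers and the `model.fit` call are illustrative, not the repository's actual code):

```python
# Hypothetical sizes for the two discriminator classes.
n_labeled, n_unlabeled = 1000, 10000
total = n_labeled + n_unlabeled

# "Balanced" weights (scikit-learn's convention): each class is
# weighted by total / (n_classes * n_class_examples), so the small
# labeled class is not ignored by the loss.
class_weight = {
    0: total / (2.0 * n_unlabeled),  # unlabeled (majority) class
    1: total / (2.0 * n_labeled),    # labeled (minority) class
}

# A large batch size makes it likely that every batch contains at
# least some labeled examples despite the imbalance.
# model.fit(X, y, batch_size=1024, class_weight=class_weight)
```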
Hi @dsgissin, thanks, but what I mean is the discriminator itself. The number of images in the labeled set is very small compared to that of the unlabeled set, say 1000 labeled and 40000 unlabeled, so the discriminator is biased. Also, in the code you randomly sample ten times as many unlabeled images as labeled images for training the discriminator. So how do you weight classes? We do not know the labels of the unlabeled images. Actually, I am not quite sure about the meaning of "under-represented" here. Thanks
No problem, I'll try to explain more clearly. What I do in the code (which is only one option that happened to work OK and can probably be improved) is subsample the unlabeled set so that it contains only ten times as many examples as the labeled set. This is still very unbalanced, but it is manageable. Some clarifications:

- The discriminator's task is only to tell labeled examples from unlabeled ones, so its binary targets are known for every example: they encode set membership, not the original image classes.
- The "classes" being weighted are these two membership classes, with the labeled class given the larger weight because it is the smaller one.
- "Under-represented" refers to the labeled class, which contributes far fewer examples to each training batch.

I hope that's more clear,
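To make the membership labels concrete, here is a hedged sketch of how the discriminator's training set could be assembled (the function name and arguments are illustrative, not the repository's API):

```python
import numpy as np

def build_discriminator_data(X, labeled_idx, unlabeled_idx, ratio=10):
    """Subsample the unlabeled pool to at most `ratio` times the size
    of the labeled set, and label every example by membership only."""
    sub = np.random.choice(
        unlabeled_idx,
        min(labeled_idx.size * ratio, unlabeled_idx.size),
        replace=False)
    X_train = np.concatenate([X[labeled_idx], X[sub]])
    # 1 = labeled, 0 = unlabeled -- known for every example, with no
    # need for the original image classes.
    y_train = np.concatenate([np.ones(labeled_idx.size), np.zeros(sub.size)])
    return X_train, y_train
```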
Thanks, I got what you mean, @dsgissin. One more quick question: it seems like the training of the discriminator gradually takes on an adversarial-training flavor. For example, there are some dogs in the labeled set, and as the labeled set grows there are more dogs in the unlabeled set; their representations are intuitively similar, but we still try to distinguish them. How do you think that will affect the model?
I'm not quite sure what you mean. |
Thanks for your clarification. |
@dsgissin, hi, sorry for bothering you again, but how did you select the hyperparameters, such as the number of iterations, the number of starting samples, etc., in the experiments? It seems like the validation set is only used for training the model and deciding which checkpoint to use, while the other hyperparameters are set arbitrarily. Would it be better to use the same validation set for tuning these hyperparameters? Thanks
```python
# Subsample the unlabeled pool to at most 10x the size of the labeled set:
unlabeled_idx = np.random.choice(unlabeled_idx, np.min([labeled_idx.shape[0]*10, unlabeled_idx.size]), replace=False)
```
I suppose the reason for this line is that you want a somewhat balanced training set for the discriminator. But when you do the selection, I assume you want to use the whole unlabeled set; that's how it's described in the paper.
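A sketch of the fix being suggested (hypothetical names; the repository's actual query code differs): keep the subsampling for training, but score and rank the entire unlabeled pool at selection time:

```python
import numpy as np

def dal_query(discriminator, X, unlabeled_idx, query_size):
    """Select from the WHOLE unlabeled pool, regardless of any
    subsampling used while training the discriminator."""
    # Assumes a two-way softmax output where column 0 is the
    # predicted probability of being "unlabeled".
    p_unlabeled = discriminator.predict(X[unlabeled_idx])[:, 0]
    # Query the examples the model is most confident are unlabeled.
    top = np.argsort(-p_unlabeled)[:query_size]
    return unlabeled_idx[top]
```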
I looked into this because I found it hard to believe Fig. 3c of your paper: that the two selection algorithms should be perfectly independent seems inconceivable to me. When I tried to reproduce your results (after fixing the issue), I instead found a small anti-correlation between DAL and entropy selection. I have not tested this rigorously, so it could be a mistake on my part, but if it holds up, I think this is even more interesting than no correlation.
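One simple way to quantify such (anti-)correlation, as a hedged sketch (an illustrative helper, not from the paper or the repository): compare the observed overlap of the two methods' query sets with the overlap expected if they selected independently at random from the same pool:

```python
import numpy as np

def selection_overlap(queries_a, queries_b, pool_size):
    """Observed overlap fraction vs. the fraction expected under
    independent uniform selection from a pool of `pool_size`."""
    observed = np.intersect1d(queries_a, queries_b).size / len(queries_a)
    expected = len(queries_b) / pool_size
    # observed < expected suggests anti-correlation between methods.
    return observed, expected
```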