Intuition of feature-wise sorting? #4
Another question: why does restoring the order in the decoder eliminate the need for the assignment-based loss? In this way, the decoder would output the elements in the same arbitrary order as the input elements; however, the order of the ground truth is fixed, I think.

I am also a little confused by Figure 1 (the colors and the dashed box). I think a concrete example, including input coordinate numbers and a simple network transformation, would be more demonstrative.
The point you quoted is somewhat separate from the bottleneck problem. In general, you can't entirely eliminate the bottleneck when going from a set of vectors to a single vector. By making the pooling operation learnable, the idea in FSPool is that we can reduce the bottleneck by learning what information is relevant and being able to throw out information we don't care about. With that sentence, I'm referring to the following: some people might argue that because each feature is sorted independently, relationships between features within each element are lost. My argument is that an MLP before the pooling can learn to decorrelate the feature dimensions, so that this isn't a problem.
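As a rough illustration (a simplified sketch for fixed-size sets, not the repository's `FSPool` class, which also handles variable set sizes via piecewise-linear weights), feature-wise sort pooling can be written as: an MLP per element, an independent sort of each feature channel across the set, and a learned weight per (feature, sorted position):

```python
# Minimal sketch of feature-wise sort pooling for fixed-size sets.
import torch
import torch.nn as nn

class SimpleFSPool(nn.Module):
    def __init__(self, in_dim, hidden_dim, set_size):
        super().__init__()
        # MLP applied per element; it can learn to decorrelate the feature dimensions
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # one learnable weight per feature channel and sorted position
        self.weight = nn.Parameter(torch.randn(hidden_dim, set_size))

    def forward(self, x):
        # x: (batch, set_size, in_dim)
        h = self.mlp(x)                                   # (batch, set_size, hidden_dim)
        h = h.transpose(1, 2)                             # (batch, hidden_dim, set_size)
        sorted_h, perm = h.sort(dim=2, descending=True)   # sort each feature independently
        pooled = (sorted_h * self.weight).sum(dim=2)      # (batch, hidden_dim)
        return pooled, perm                               # perm can be reused for unpooling

pool = SimpleFSPool(in_dim=2, hidden_dim=16, set_size=5)
x = torch.randn(3, 5, 2)
out, perm = pool(x)
# permuting the set elements leaves the pooled vector unchanged
out_shuffled, _ = pool(x[:, torch.randperm(5)])
print(torch.allclose(out, out_shuffled))  # True
```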
Correct, the output is in the same arbitrary order as the input elements. I don't know what you mean by the order of the ground truth being fixed. The point is that regardless of what this ordering is, because the "first" element in the output set corresponds to the "first" element in the input set when you use FSUnpool, we can just use a normal pairwise mean squared error as the loss. There is no need for assignment-based losses anymore, since we essentially have a sequence regression problem now. For a concrete example, have a look at the video for ICLR 2020: https://iclr.cc/virtual_2020/poster_HJgBA2VYwH.html
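To make the "no assignment needed" point concrete, here is a rough sketch (assumed names, not the repository's API): the decoder output is scattered back into the input's arbitrary order using the permutation recorded during pooling, so element i of the output lines up with element i of the input and plain MSE applies.

```python
import torch
import torch.nn.functional as F

def unpool_with_perm(decoded_sorted, perm):
    # decoded_sorted: (batch, channels, set_size), in per-feature sorted order
    # perm: sort indices recorded in the encoder, same shape
    restored = torch.empty_like(decoded_sorted)
    restored.scatter_(2, perm, decoded_sorted)  # undo the per-feature sort
    return restored

# toy usage: pretend the decoder reproduced the sorted encoder features exactly
x = torch.randn(3, 16, 5)                       # (batch, channels, set_size)
sorted_x, perm = x.sort(dim=2, descending=True)
decoded = sorted_x                              # stand-in for a decoder output
restored = unpool_with_perm(decoded, perm)
loss = F.mse_loss(restored, x)                  # ordinary pairwise MSE, no matching step
print(loss.item())                              # 0.0 in this toy case
```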
Hi, could you please shed some light on the feature-wise sorting?
Though this operation is permutation-invariant, I'm still having trouble understanding it.
In the paper it says "A transformation (such as with an MLP) prior to the pooling can ensure that the features being sorted are mostly independent so that little information is lost by treating the features independently."
Why does this operation help with the significant bottleneck that arises when compressing a set of any size down to a single feature vector?
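(For intuition, a concrete toy example of feature-wise sorting, assumed here for illustration only: each feature column is sorted independently across the set elements, so the result does not depend on the order in which the elements are listed.)

```python
import torch

set_a = torch.tensor([[3.0, 0.1],
                      [1.0, 0.9],
                      [2.0, 0.5]])           # 3 elements, 2 features
set_b = set_a[[2, 0, 1]]                      # same set, different element order

sorted_a, _ = set_a.sort(dim=0)               # sort each feature column separately
sorted_b, _ = set_b.sort(dim=0)
print(sorted_a)
# tensor([[1.0000, 0.1000],
#         [2.0000, 0.5000],
#         [3.0000, 0.9000]])
print(torch.equal(sorted_a, sorted_b))        # True: invariant to element order
```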