
On translation and rotation equivariance #96

Open
JLuij opened this issue Oct 8, 2024 · 4 comments

Comments

@JLuij

JLuij commented Oct 8, 2024

This may be a naive question, but I'd like a clear answer, since this property is not discussed in the paper even though it is in other point cloud literature.

The question is: is PTv3 a rotation- and translation-equivariant architecture?

Perhaps it's not entirely trivial because of the space-filling-curve neighbour search, but nothing suggests that the architecture relies on absolute point locations by default.

@Gofinge
Member

Gofinge commented Oct 10, 2024

Hi, in my opinion, model-wise rotation and translation equivariance is not necessary. As long as we add rotation and translation augmentation to the training pipeline, you will find that the model output is relatively consistent regardless of the orientation of the point cloud. Open for discussion :).
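
A minimal sketch of how one could measure that consistency empirically. Here `model` is a hypothetical callable mapping an (N, 3) point array to (N, num_classes) per-point logits, not PTv3's actual interface:

```python
import numpy as np

def random_z_rotation(points):
    """Rotate an (N, 3) point cloud by a random angle about the z-axis."""
    theta = np.random.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T

def prediction_consistency(model, points, trials=10):
    """Fraction of points whose predicted class survives a random rotation.

    `model` is a placeholder: any function from (N, 3) points to
    (N, num_classes) per-point logits.
    """
    base = model(points).argmax(axis=-1)
    agreement = []
    for _ in range(trials):
        pred = model(random_z_rotation(points)).argmax(axis=-1)
        agreement.append((pred == base).mean())
    return float(np.mean(agreement))
```

A model that is rotation-equivariant by construction would score exactly 1.0 here; a model trained with rotation augmentation typically scores close to, but below, 1.0.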

@JLuij
Author

JLuij commented Oct 10, 2024

Sure, that's valid, but it's not the point of my question. My question is: is PTv3 a rotation- and translation-equivariant architecture, is that property limited to particular layers, or is no part of PTv3 rotation/translation-equivariant?

@adithyamurali

I believe the architecture is not rotation and translation equivariant by design

@Gofinge
Member

Gofinge commented Oct 24, 2024

> I believe the architecture is not rotation and translation equivariant by design

I agree, it is not designed to be rotation- and translation-equivariant, because I don't think it matters much. For example, consider an image transformer: when we rotate or translate an image, the positional encoding changes, which means that attention on the image is also not rotation- or translation-equivariant, unlike convolution, which is at least translation-equivariant.
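
To make the serialization point concrete, here is a hypothetical sketch (plain NumPy, not Pointcept code) showing that a space-filling-curve order computed from grid-quantized coordinates, e.g. a simple Morton/Z-order key, changes when the point cloud is rotated, so any attention windows defined over that serialized order are not equivariant by construction:

```python
import numpy as np

def morton_key(points, grid_size=0.05):
    """Interleave the bits of quantized x/y/z coordinates (a simple Z-order key)."""
    coords = np.floor((points - points.min(axis=0)) / grid_size).astype(np.uint64)
    key = np.zeros(len(points), dtype=np.uint64)
    for bit in range(21):  # 21 bits per axis fit in a 64-bit key
        for axis in range(3):
            bit_val = (coords[:, axis] >> np.uint64(bit)) & np.uint64(1)
            key |= bit_val << np.uint64(3 * bit + axis)
    return key

rng = np.random.default_rng(0)
pts = rng.uniform(0, 1, size=(1024, 3))

theta = np.pi / 6  # rotate 30 degrees about z
c, s = np.cos(theta), np.sin(theta)
rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

order_original = np.argsort(morton_key(pts))
order_rotated = np.argsort(morton_key(pts @ rot.T))

# The serialized orders differ, so serialized neighbourhoods (and the
# attention patterns built on them) depend on the pose of the input.
print((order_original == order_rotated).mean())
```

Rotation alone already scrambles the serialized order, which is enough to break equivariance of any window partition built on top of it.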
