Hello, I’m interested in adapting the GBT for my research. In the paper, it’s mentioned that "the AWMA-based transformer module selectively removes components of the attention weight matrix with smaller singular values." However, I haven’t been able to locate the specific part of the code where this modification to the weight matrix occurs.
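For context, this is the kind of SVD-based truncation I expected to find in the code (a minimal sketch of my reading of the paper, not the authors' implementation; the function name and the `rank` parameter are my own):

```python
import torch

def truncate_attention_weights(attn: torch.Tensor, rank: int) -> torch.Tensor:
    """Drop the components of a 2-D attention weight matrix that
    correspond to its smaller singular values, keeping the top `rank`."""
    U, S, Vh = torch.linalg.svd(attn, full_matrices=False)
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]
```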
Additionally, from my understanding of the implementation, it seems that the LRR loss is applied directly to the raw node features. I’d appreciate it if you could confirm whether I’m interpreting this correctly or if there’s a part of the code I might have missed.
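In case it helps pinpoint where my reading diverges from the implementation, this is how I currently understand the LRR term (a rough sketch assuming it is a nuclear-norm penalty on the feature matrix; the function name is my own):

```python
import torch

def lrr_loss(features: torch.Tensor) -> torch.Tensor:
    """Nuclear-norm penalty (sum of singular values) that encourages
    a low-rank structure in the node feature matrix."""
    return torch.linalg.matrix_norm(features, ord="nuc")
```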
Thank you very much for your time and assistance!