
About args use_tvm #24

Open
htg17 opened this issue Feb 20, 2023 · 3 comments

Comments


htg17 commented Feb 20, 2023

In long_range_main.py, the use_tvm arg defaults to False, and the sample scripts never set it. But if this arg is False, it seems that pyramidal attention is not used anywhere in the model, even though it is the main contribution of the paper.

So should this arg be set to True when I want to use pyramidal attention to save computation cost?


Zhazhan commented Feb 20, 2023

We provide two implementations of pyramidal attention: a naive version and a TVM version. The naive version does not reduce the time and space complexity. Because the TVM version may require users to compile TVM themselves, we set use_tvm=False by default to make our results easier to reproduce.

If you want to use the TVM implementation without compiling TVM yourself, set use_tvm=True and make sure that (1) the operating system is Ubuntu and (2) the CUDA version is 11.1. Otherwise, you can compile TVM 0.8.0 according to the official guide: https://tvm.apache.org/docs/.

If compiling feels like too much trouble, you can instead use a pre-built TVM Docker image from https://tvm.apache.org/docs/install/docker.html#docker-source. Then delete the files under 'pyraformer/lib' and run the code again.
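As a quick sanity check of the two constraints above before setting use_tvm=True, something like the following can be run first; this snippet is illustrative and not part of the repository:

```python
# Illustrative environment check (not from the Pyraformer repo): per the note
# above, the pre-compiled kernels expect Ubuntu and CUDA 11.1.
import platform
import torch

print(platform.system())   # expect 'Linux' (Ubuntu)
print(torch.version.cuda)  # expect '11.1' to match the pre-built TVM kernels
```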


htg17 commented Feb 20, 2023

Thanks for answering. I just wonder whether the naive option still uses pyramidal attention.

If use_tvm=False, the MultiHeadAttention module in SubLayers.py is used for self-attention. But it seems that MultiHeadAttention is just vanilla attention.


Zhazhan commented Feb 20, 2023

The naive implementation realizes pyramidal attention by applying an attention mask to the attention score matrix. The 'MultiHeadAttention' module is indeed vanilla attention; the difference lies in the 'Encoder' module. Please refer to lines 19-22 and 51-54 of pyraformer/Pyraformer_LR.py.
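To make this concrete, below is a minimal sketch of how a pyramidal attention pattern can be expressed as a boolean mask over a vanilla attention score matrix. The two-scale node layout and the window_size/stride parameters here are illustrative assumptions, not the repository's actual mask construction; see the lines referenced above for the real one:

```python
# Minimal sketch of pyramidal attention via masking (NOT the repo's exact code).
# Layout assumption: seq_len fine-scale nodes followed by seq_len // stride
# coarse-scale nodes. True in the mask = attention blocked.
import torch

def two_scale_pyramid_mask(seq_len: int, window_size: int, stride: int) -> torch.Tensor:
    n_coarse = seq_len // stride
    total = seq_len + n_coarse
    mask = torch.ones(total, total, dtype=torch.bool)

    # Intra-scale links: each node attends to neighbors within window_size
    # at its own scale (self included).
    for i in range(seq_len):
        mask[i, max(0, i - window_size):min(seq_len, i + window_size + 1)] = False
    for j in range(n_coarse):
        lo = seq_len + max(0, j - window_size)
        hi = seq_len + min(n_coarse, j + window_size + 1)
        mask[seq_len + j, lo:hi] = False

    # Inter-scale links: each coarse node and its `stride` children see each other.
    for j in range(n_coarse):
        children = slice(j * stride, (j + 1) * stride)
        mask[seq_len + j, children] = False
        mask[children, seq_len + j] = False
    return mask

def masked_vanilla_attention(q, k, v, mask):
    # Vanilla scaled dot-product attention with the pyramid mask applied,
    # mirroring the "mask the score matrix" idea described above.
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    scores = scores.masked_fill(mask, float('-inf'))
    return torch.softmax(scores, dim=-1) @ v

# Usage: 16 fine nodes + 4 coarse nodes, hidden size 8.
m = two_scale_pyramid_mask(seq_len=16, window_size=1, stride=4)
x = torch.randn(20, 8)
out = masked_vanilla_attention(x, x, x, m)
```

Under this scheme each node attends only to its intra-scale neighbors and its parent/children, which is the pyramid pattern; the full score matrix is still materialized, which is why the naive version does not reduce time or space complexity.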
