Add Get/Set for vectors and use them to implement Concat* operators for RVV #2362

Draft: wants to merge 1 commit into base: master


@lsrcz lsrcz commented Oct 23, 2024

This pull request

  1. implements the Get and Set operators for RVV, which correspond to the __riscv_vget* and __riscv_vset* intrinsics, respectively.
  2. optimizes the Concat* operators with Get and Set.

This PR introduces three new type categories (_GET_SET, _GET_SET_VIRT, _GET_SET_SMALLEST). They are needed for the following reasons:

  1. RVV only provides vget and vset intrinsics for non-fractional vector types, as the whole-register move instructions are only available for LMUL >= 1. This corresponds to the _GET_SET category. We cannot simply reuse the _TRUNC category as it contains fractional vector types.
  2. The _GET_SET_VIRT category simulates the intrinsics using instructions other than whole-register moves. The result is a register with half the LMUL.
  3. Similar to LowerHalf, it is useful to have Get and Set also for the smallest LMUL for each SEW. This is the _GET_SET_SMALLEST category, and the result would be the same as the original register type for this category.
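One way to read the three categories is as a partition of the (SEW, LMUL) pairs of the source vector type. The sketch below is only an illustration of that reading, not the macros from this PR: the `Category` enum, `MinLog2Lmul`, and `Classify` are hypothetical names, LMUL is represented by its base-2 logarithm (mf8 = -3, ..., m8 = 3), and it assumes ELEN = 64 so that the smallest LMUL for a given SEW is SEW/64.

```cpp
#include <cassert>

// Hypothetical labels mirroring _GET_SET, _GET_SET_VIRT, _GET_SET_SMALLEST.
enum class Category { kGetSet, kGetSetVirt, kGetSetSmallest };

// log2 of the smallest LMUL for a given SEW, assuming ELEN = 64
// (SEW=8 -> mf8, SEW=16 -> mf4, SEW=32 -> mf2, SEW=64 -> m1).
// Any other SEW value falls through to 0 in this sketch.
constexpr int MinLog2Lmul(int sew_bits) {
  return sew_bits == 8 ? -3 : sew_bits == 16 ? -2 : sew_bits == 32 ? -1 : 0;
}

// Classifies the source vector type (the one being split in half).
constexpr Category Classify(int sew_bits, int log2_lmul) {
  // Smallest LMUL for this SEW: there is no half-sized type, so the
  // result type is the same as the source type.
  if (log2_lmul == MinLog2Lmul(sew_bits)) return Category::kGetSetSmallest;
  // The half-LMUL part is still at least one whole register (LMUL >= 1),
  // so the native vget/vset intrinsics apply.
  if (log2_lmul - 1 >= 0) return Category::kGetSet;
  // The half-LMUL part is fractional: emulate with other instructions.
  return Category::kGetSetVirt;
}
```

Under these assumptions, `Classify(32, 1)` (SEW=32, m2) yields the native category, `Classify(32, 0)` (m1, whose half is mf2) yields the emulated one, and `Classify(32, -1)` (mf2) yields the smallest-LMUL one.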

For the Get operator for fractional vector types, we use Trunc, which is effectively a no-op, to extract the lower part, and we use SlideDown for the upper parts.

For the Set operator for fractional vector types, we use vmv with the tail undisturbed configuration to set the lower part. We use SlideUp for the upper parts.
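The fractional-type emulation described in the two paragraphs above can be modelled on plain arrays. This is a portable scalar sketch, not RVV code and not the implementation in this PR: `Vec`, `Trunc`, `SlideDown`, `SlideUp`, and `MoveTailUndisturbed` are hypothetical stand-ins for the corresponding vector operations, and the half-sized part is modelled as a vector with half as many lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;  // scalar stand-in for a vector register

// Models Trunc: keep the first half of the lanes (a no-op on hardware).
Vec Trunc(const Vec& v) {
  return Vec(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(v.size() / 2));
}

// Models a slide-down: result lane i is v[i + offset]; lanes that would
// read past the end are zero in this model.
Vec SlideDown(const Vec& v, std::size_t offset) {
  Vec r(v.size(), 0);
  for (std::size_t i = 0; i + offset < v.size(); ++i) r[i] = v[i + offset];
  return r;
}

// Models a slide-up into dest: lanes below offset keep dest's values.
Vec SlideUp(Vec dest, const Vec& src, std::size_t offset) {
  for (std::size_t i = offset; i < dest.size() && i - offset < src.size(); ++i)
    dest[i] = src[i - offset];
  return dest;
}

// Models a vector move with vl = src.size() and tail undisturbed:
// lanes at or beyond vl keep dest's values.
Vec MoveTailUndisturbed(Vec dest, const Vec& src) {
  for (std::size_t i = 0; i < src.size() && i < dest.size(); ++i) dest[i] = src[i];
  return dest;
}

// Get: lower half via Trunc, upper half via SlideDown then Trunc.
Vec Get(const Vec& v, std::size_t index) {
  return index == 0 ? Trunc(v) : Trunc(SlideDown(v, v.size() / 2));
}

// Set: lower half via tail-undisturbed move, upper half via SlideUp.
Vec Set(const Vec& dest, std::size_t index, const Vec& part) {
  return index == 0 ? MoveTailUndisturbed(dest, part)
                    : SlideUp(dest, part, part.size());
}
```

In this model, `Set(v, i, Get(v, i))` returns `v` unchanged for both `i = 0` and `i = 1`, which is a quick sanity check that the two operators address the same lanes.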

The Concat* operators can then be optimized with Get and Set. Operators other than ConcatLowerUpper need one Get and one Set, and the resulting code should be optimal for all SEW/LMUL combinations. ConcatLowerUpper needs two Gets and two Sets. This is optimal only when the Get operation is always a no-op, meaning that the LMUL of the extracted part must be greater than or equal to 1.
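The Get/Set counts can be seen in a scalar model of the four operators. This sketch assumes Highway's documented argument convention (`hi` supplies the upper half of the result, `lo` the lower half, and the operator name lists which half of `hi` and then of `lo` is taken). `Vec`, `Get`, and `Set` here are simplified hypothetical helpers on arrays, not the RVV implementation from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;  // scalar stand-in for a vector register

// Extracts half number `index` (0 = lower, 1 = upper) of v.
Vec Get(const Vec& v, std::size_t index) {
  const std::size_t half = v.size() / 2;
  const auto first = v.begin() + static_cast<std::ptrdiff_t>(index * half);
  return Vec(first, first + static_cast<std::ptrdiff_t>(half));
}

// Returns dest with half number `index` replaced by part.
Vec Set(Vec dest, std::size_t index, const Vec& part) {
  const std::size_t offset = index * part.size();
  for (std::size_t i = 0; i < part.size(); ++i) dest[offset + i] = part[i];
  return dest;
}

// One Get and one Set each: one input already holds a half in its final
// position, so only the other half has to be extracted and inserted.
Vec ConcatUpperLower(const Vec& hi, const Vec& lo) {
  return Set(hi, 0, Get(lo, 0));  // hi's upper half stays in place
}
Vec ConcatLowerLower(const Vec& hi, const Vec& lo) {
  return Set(lo, 1, Get(hi, 0));  // lo's lower half stays in place
}
Vec ConcatUpperUpper(const Vec& hi, const Vec& lo) {
  return Set(hi, 0, Get(lo, 1));  // hi's upper half stays in place
}

// Two Gets and two Sets: neither input has a half in its final position.
Vec ConcatLowerUpper(const Vec& hi, const Vec& lo) {
  Vec result = Set(lo, 0, Get(lo, 1));  // lo's upper half -> lower half
  return Set(result, 1, Get(hi, 0));    // hi's lower half -> upper half
}
```

For example, with `hi = {10, ..., 17}` and `lo = {0, ..., 7}`, `ConcatLowerUpper(hi, lo)` gives `{4, 5, 6, 7, 10, 11, 12, 13}` in this model.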

@lsrcz lsrcz marked this pull request as draft October 25, 2024 22:48