CUDA: generalize FP16 fattn vec kernel (#7061) #33
Job run time |
---|
2m 26s |
1m 53s |
2m 33s |
2m 8s |
2m 48s |
2m 27s |
1m 52s |
2m 25s |
1m 16s |
1m 17s |
1m 36s |
4m 17s |
5m 54s |
2m 47s |
5m 47s |
3m 23s |
2m 1s |
5m 18s |
21m 59s |
4m 54s |
7m 45s |
24m 25s |
5m 44s |
6m 21s |
7m 28s |
19m 41s |
1m 48s |
14m 52s |
4m 25s |
6m 23s |
5m 44s |
5m 25s |
5m 45s |
1m 55s |
Total: 3h 16m 42s |