[LLM] Add quantize_kv optimization for yuan2 model #4094
| Job | Run time |
|---|---|
| | 1s |
| | 2s |
| | 2s |
| | 2s |
| | 2s |
| | 5s |
| | 2s |
| | 1m 2s |
| | 2m 15s |
| | 2m 3s |
| | 3m 28s |
| | 52s |
| | 50s |
| | 10m 39s |
| | 14m 46s |
| | 1s |
| | 1s |
| Total | 36m 13s |