
In the LLM output, how many speech tokens per second? #113

Open
zzchust opened this issue Nov 26, 2024 · 3 comments

Comments

@zzchust

zzchust commented Nov 26, 2024

No description provided.

@sixsixcoder

With the following hardware/software environment:

GPU: A800-SXM4-80GB
CUDA: 12.1
torch: 2.4.0
torchaudio: 2.4.0
transformers: 4.45.2
Python: 3.10
VRAM: 80 GB
Precision: BF16
Number of GPUs: 1
top_p = 1.0
temperature = 1.0
max_new_tokens = 256

My test results: I ran 3 iterations and computed the average first-token latency and the average decode throughput (personal test only, not official benchmark data).

Average First Token Time over 3 iterations: 0.0907 seconds
Average decode throughput over 3 iterations: 22.7574 tokens/second
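
For reference, a minimal sketch of how numbers like these could be measured with Hugging Face transformers. This is not the script used above; the model id and prompt are placeholders, and the sampling settings are taken from the environment listed in this comment.

```python
# Rough timing sketch (not the commenter's actual script); model id is a placeholder.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-speech-llm"  # placeholder, replace with the model under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "..."  # whatever prompt drives speech-token generation in your setup
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

def timed_generate(max_new_tokens):
    """Run one generation with the sampling settings above and return (output, seconds)."""
    torch.cuda.synchronize()
    start = time.perf_counter()
    out = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=1.0,
        temperature=1.0,
    )
    torch.cuda.synchronize()
    return out, time.perf_counter() - start

first_token_times, decode_rates = [], []
for _ in range(3):  # 3 iterations, as in the comment
    # First-token latency: time a generation capped at a single new token.
    _, t_first = timed_generate(1)
    first_token_times.append(t_first)

    # Decode throughput: time a full generation and divide new tokens by wall time.
    # (This folds the prefill into the total, so it slightly understates pure decode speed.)
    out, t_full = timed_generate(256)
    n_new = out.shape[1] - inputs["input_ids"].shape[1]
    decode_rates.append(n_new / t_full)

print(f"Average First Token Time over 3 iterations: {sum(first_token_times) / 3:.4f} seconds")
print(f"Average decode throughput over 3 iterations: {sum(decode_rates) / 3:.4f} tokens/second")
```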

@zzchust
Author

zzchust commented Nov 27, 2024

I see. The input averages 12.5 tokens per second; isn't the output speech codebook the same codebook as the input?
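
If the output speech tokens are also consumed at 12.5 tokens per second of audio (an assumption, extrapolating the input rate mentioned here), the decode throughput measured above would correspond to generating speech faster than real time:

```python
# Back-of-the-envelope check (assumption: output speech tokens map to
# 12.5 tokens per second of audio, the same rate as the input mentioned above).
audio_tokens_per_second = 12.5          # assumed speech-token frame rate
decode_tokens_per_second = 22.7574      # measured decode throughput from the earlier comment
real_time_factor = decode_tokens_per_second / audio_tokens_per_second
print(f"~{real_time_factor:.2f}x real time")  # ~1.82x
```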

@sunnnnnnnny


Hi, regarding "Average First Token Time over 3 iterations: 0.0907 seconds": does this token refer to the first speech token predicted by the LLM?
And does "22.7574 tokens/second" mean that flow matching can decode 22.7 speech tokens to mel per second?
