
In the LLM output, how many speech tokens per second? #113

Open
zzchust opened this issue Nov 26, 2024 · 3 comments

Comments

@zzchust

zzchust commented Nov 26, 2024

No description provided.

@sixsixcoder

With the following hardware/software environment:

GPU: A800-SXM4-80GB
CUDA: 12.1
torch: 2.4.0
torchaudio: 2.4.0
transformers: 4.45.2
Python: 3.10
VRAM: 80 GB
Precision: BF16
Number of GPUs: 1
top_p = 1.0
temperature = 1.0
max_new_tokens = 256

My test results: I ran 3 iterations and computed the average first-token latency and the average decode throughput (personal test only, not official benchmark data).

Average First Token Time over 3 iterations: 0.0907 seconds
Average decode throughput over 3 iterations: 22.7574 tokens/second
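
For reference, a minimal sketch of how numbers like these could be measured with Hugging Face transformers. This is not the script used above; the model id and prompt are placeholders, and the sampling settings are taken from the environment listed in this comment.

```python
# Rough timing sketch (not the commenter's actual script); model id is a placeholder.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-speech-llm"  # placeholder, replace with the model under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "..."  # whatever prompt drives speech-token generation in your setup
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

def timed_generate(max_new_tokens):
    """Run one generation with the sampling settings above and return (output, seconds)."""
    torch.cuda.synchronize()
    start = time.perf_counter()
    out = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=1.0,
        temperature=1.0,
    )
    torch.cuda.synchronize()
    return out, time.perf_counter() - start

first_token_times, decode_rates = [], []
for _ in range(3):  # 3 iterations, as in the comment
    # First-token latency: time a generation capped at a single new token.
    _, t_first = timed_generate(1)
    first_token_times.append(t_first)

    # Decode throughput: time a full generation and divide new tokens by wall time.
    # (This folds the prefill into the total, so it slightly understates pure decode speed.)
    out, t_full = timed_generate(256)
    n_new = out.shape[1] - inputs["input_ids"].shape[1]
    decode_rates.append(n_new / t_full)

print(f"Average First Token Time over 3 iterations: {sum(first_token_times) / 3:.4f} seconds")
print(f"Average decode throughput over 3 iterations: {sum(decode_rates) / 3:.4f} tokens/second")
```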

@zzchust
Author

zzchust commented Nov 27, 2024

I see. The input averages 12.5 tokens per second; isn't the output speech codebook the same codebook as the input?
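
If the output speech tokens are also consumed at 12.5 tokens per second of audio (an assumption, extrapolating the input rate mentioned here), the decode throughput measured above would correspond to generating speech faster than real time:

```python
# Back-of-the-envelope check (assumption: output speech tokens map to
# 12.5 tokens per second of audio, the same rate as the input mentioned above).
audio_tokens_per_second = 12.5          # assumed speech-token frame rate
decode_tokens_per_second = 22.7574      # measured decode throughput from the earlier comment
real_time_factor = decode_tokens_per_second / audio_tokens_per_second
print(f"~{real_time_factor:.2f}x real time")  # ~1.82x
```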

@sunnnnnnnny


Hi, regarding "Average First Token Time over 3 iterations: 0.0907 seconds": does this token refer to the first speech token predicted by the LLM?
And does "22.7574 tokens/second" mean that flow matching can decode 22.7 speech tokens to mel per second?
