The speech tokenizer outputs 16384 distinct tokens, but GLM only accepts 16383 audio tokens as input. Is this a bug?
Do you have an example?
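The speech tokenizer can output 16384 distinct audio tokens, and some audio is tokenized into a sequence that contains the last audio token. However, the audio tokens added to GLM only go up to <|audio_16382|>, which is one short.
Audio example: a segment of WenetSpeech/audio/train/youtube/B00000/Y0000000009_-0p8pYdlfjY.opus, read with torchaudio using frame_offset=88405920, num_frames=1920000.
Is this because the last audio token is used so rarely that it can be dropped?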
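To make the off-by-one concrete, here is a minimal sketch of the arithmetic, not the project's actual code. It assumes the tokenizer ids run from 0 to 16383 and that the GLM audio tokens are numbered starting at <|audio_0|>; the function name `missing_audio_ids` is hypothetical.

```python
# Numbers taken from the report above.
CODEBOOK_SIZE = 16384          # distinct ids the speech tokenizer can emit
LAST_GLM_AUDIO_INDEX = 16382   # highest <|audio_N|> token found in GLM


def missing_audio_ids(codebook_size, last_index):
    """Return speech-token ids that have no matching <|audio_N|> entry."""
    # Assumption: tokenizer ids are 0 .. codebook_size - 1.
    tokenizer_ids = set(range(codebook_size))
    # Assumption: GLM audio tokens are numbered from <|audio_0|>.
    vocab_ids = set(range(last_index + 1))
    return sorted(tokenizer_ids - vocab_ids)


missing = missing_audio_ids(CODEBOOK_SIZE, LAST_GLM_AUDIO_INDEX)
print([f"<|audio_{i}|>" for i in missing])  # prints ['<|audio_16383|>']
```

Under these assumptions, any audio whose token sequence contains id 16383 has no corresponding GLM token.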