(zyt-nerf) amax@amax:~/zyt/audio2face/EmoTalk_release$ python demo.py --wav_path "./audio/disgust.wav"
Some weights of Wav2Vec2Model were not initialized from the model checkpoint at jonatasgrosman/wav2vec2-large-xlsr-53-english and are newly initialized: ['wav2vec2.lm_head.bias', 'wav2vec2.lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of Wav2Vec2ForSpeechClassification were not initialized from the model checkpoint at r-f/wav2vec-english-speech-emotion-recognition and are newly initialized: ['wav2vec2.lm_head.bias', 'wav2vec2.lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/activation.py:1144: UserWarning: Converting mask without torch.bool dtype to bool; this will negatively affect performance. Prefer to use a boolean mask directly. (Triggered internally at ../aten/src/ATen/native/transformers/attention.cpp:150.)
  return torch._native_multi_head_attention(
Traceback (most recent call last):
  File "/home/amax/zyt/audio2face/EmoTalk_release/demo.py", line 111, in <module>
    main()
  File "/home/amax/zyt/audio2face/EmoTalk_release/demo.py", line 106, in main
    test(args)
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/amax/zyt/audio2face/EmoTalk_release/demo.py", line 30, in test
    prediction = model.predict(audio, level, person)
  File "/home/amax/zyt/audio2face/EmoTalk_release/model.py", line 140, in predict
    bs_out11 = self.transformer_decoder(hidden_states11, hidden_states_emo11_832, tgt_mask=tgt_mask11,
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 360, in forward
    output = mod(output, memory, tgt_mask=tgt_mask,
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 698, in forward
    x = self.norm1(x + self._sa_block(x, tgt_mask, tgt_key_padding_mask, tgt_is_causal))
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 707, in _sa_block
    x = self.self_attn(x, x, x,
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/amax/miniconda3/envs/zyt-nerf/lib/python3.10/site-packages/torch/nn/modules/activation.py", line 1144, in forward
    return torch._native_multi_head_attention(
RuntimeError: Mask shape should match input. mask: [4, 104, 104] input: [1, 4, 104, 104]
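For reference, the two shapes in the error message line up as a 3-D mask laid out as (batch*heads, L, S) = (4, 104, 104) versus the 4-D (batch, heads, L, S) = (1, 4, 104, 104) layout the fused attention kernel is checking against. A minimal sketch of that relationship (the helper name `to_4d_mask` is invented for illustration and is not part of EmoTalk or PyTorch):

```python
def to_4d_mask(mask_shape, batch_size):
    """Map a 3-D attention-mask shape (batch*heads, L, S) to the
    4-D (batch, heads, L, S) layout checked by the fused kernel."""
    bh, tgt_len, src_len = mask_shape
    if bh % batch_size:
        raise ValueError("mask dim 0 must be divisible by batch size")
    return (batch_size, bh // batch_size, tgt_len, src_len)

# The shapes from the traceback: batch=1, 4 heads, 104 frames.
print(to_4d_mask((4, 104, 104), 1))  # → (1, 4, 104, 104)
```

So the decoder's `tgt_mask` itself is a legal 3-D mask; it is the native fast-path kernel (taken here under `torch.no_grad()`) that insists on the 4-D layout.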