
Error during fine-tuning #83

Open
utnasun opened this issue Dec 11, 2022 · 4 comments

utnasun commented Dec 11, 2022

I have this code:

```python
from huggingsound import TrainingArguments, ModelArguments, SpeechRecognitionModel, TokenSet
from transformers import Wav2Vec2Processor

processor_ref = Wav2Vec2Processor.from_pretrained("/my/dir/wav2vec2-large-xlsr-53-kalmyk")
token_list = list(processor_ref.tokenizer.encoder.keys())
print(len(token_list))

model = SpeechRecognitionModel("/my/dir/wav2vec2-large-xlsr-53-kalmyk")
output_dir = "/my/dir/tuned"

token_set = TokenSet(token_list)

model.finetune(
    output_dir,
    train_data=train_data,  # defined below
    token_set=token_set
)
```

My train_data is a list of dicts like this:

```python
train_data = [
    {"path": "/path/to/sagan.mp3", "transcription": "extraordinary claims require extraordinary evidence"},
    {"path": "/path/to/asimov.wav", "transcription": "violence is the last refuge of the incompetent"},
]
```

Then I get the following errors. Can anyone help me with this?

```
size mismatch for lm_head.weight: copying a param with shape torch.Size([41, 1024]) from checkpoint, the shape in current model is torch.Size([45, 1024]).
size mismatch for lm_head.bias: copying a param with shape torch.Size([41]) from checkpoint, the shape in current model is torch.Size([45]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
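
The mismatch is between the checkpoint's `lm_head` (41 rows) and the model huggingsound rebuilds from the `TokenSet` (45 rows); the 4-row difference suggests special tokens (e.g. `<pad>`, `<s>`, `</s>`, `<unk>`) are being appended to the token list during fine-tuning. A minimal diagnostic sketch to confirm the two sizes, assuming the weights are stored as `pytorch_model.bin` (paths are placeholders):

```python
# Compare the checkpoint's lm_head size with the tokenizer vocabulary size.
import torch
from transformers import Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("/my/dir/wav2vec2-large-xlsr-53-kalmyk")
print("tokenizer vocab size:", len(processor.tokenizer))

# Assumption: the weights live in pytorch_model.bin; newer checkpoints may
# use model.safetensors instead.
state_dict = torch.load(
    "/my/dir/wav2vec2-large-xlsr-53-kalmyk/pytorch_model.bin", map_location="cpu"
)
print("lm_head rows in checkpoint:", state_dict["lm_head.weight"].shape[0])  # 41 here
```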
wanghh2000 commented Sep 14, 2023

Same issue found in 0.1.6:

```
09/14/2023 13:58:05 - INFO - huggingsound.trainer - Getting dataset stats...
09/14/2023 13:58:05 - INFO - huggingsound.trainer - Training dataset size: 4 samples, 0.006265000000000001 hours
09/14/2023 13:58:05 - INFO - huggingsound.trainer - Evaluation dataset size: 2 samples, 0.004876649305555555 hours
Traceback (most recent call last):
  File "finetune.py", line 45, in <module>
    model.finetune(
  File "C:\Users\.conda\envs\wbbbbb\lib\site-packages\huggingsound\speech_recognition\model.py", line 361, in finetune
    finetune_ctc(self.model_path, output_dir, processor, train_dataset, eval_dataset, self.device, training_args, model_args)
  File "C:\Users\.conda\envs\wbbbbb\lib\site-packages\huggingsound\trainer.py", line 586, in finetune_ctc
    model = AutoModelForCTC.from_pretrained(model_name_or_path, config=config)
  File "C:\Users\.conda\envs\wbbbbb\lib\site-packages\transformers\models\auto\auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "C:\Users\.conda\envs\wbbbbb\lib\site-packages\transformers\modeling_utils.py", line 3175, in from_pretrained
    ) = cls._load_pretrained_model(
  File "C:\Users\.conda\envs\wbbbbb\lib\site-packages\transformers\modeling_utils.py", line 3624, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for Wav2Vec2ForCTC:
        size mismatch for lm_head.weight: copying a param with shape torch.Size([5171, 1024]) from checkpoint, the shape in current model is torch.Size([5173, 1024]).
        size mismatch for lm_head.bias: copying a param with shape torch.Size([5171]) from checkpoint, the shape in current model is torch.Size([5173]).
        You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
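
Note that the traceback shows huggingsound itself calling `AutoModelForCTC.from_pretrained(model_name_or_path, config=config)` with no extra kwargs, so the `ignore_mismatched_sizes=True` hint from the error cannot be passed through `model.finetune()`. A hedged workaround sketch (this is not huggingsound's API; paths are placeholders) is to resize the checkpoint once with plain transformers and fine-tune from the resized copy:

```python
# Load the checkpoint with a freshly initialized lm_head sized to the
# tokenizer, then save a resized copy to fine-tune from.
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

src = "/my/dir/original-model"  # placeholder paths
dst = "/my/dir/resized-model"

processor = Wav2Vec2Processor.from_pretrained(src)
model = Wav2Vec2ForCTC.from_pretrained(
    src,
    vocab_size=len(processor.tokenizer),  # overrides config.vocab_size
    ignore_mismatched_sizes=True,  # lm_head is reinitialized at the new size
)
model.save_pretrained(dst)
processor.save_pretrained(dst)
```

The reinitialized `lm_head` is random, which is usually acceptable here since that layer is retrained during fine-tuning anyway.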

wanghh2000 commented Sep 15, 2023

Looks like it depends on the model; the error disappeared after I changed to another model.

@SteveSZF

> Looks like it depends on the model; the error disappeared after I changed to another model.

Hi, which model did you use to solve this problem?


wanghh2000 commented Feb 15, 2024

@SteveSZF https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn

Or edit vocab.json to add "<s>" and "</s>" so that the vocabulary length == 5173.
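
If you go the vocab.json route, a minimal sketch of that edit (the path is a placeholder; back up the file first):

```python
# Append "<s>" and "</s>" to vocab.json with the next free ids so the
# vocabulary size matches the model's lm_head (5173 in the report above).
import json

vocab_path = "/my/dir/model/vocab.json"  # placeholder path
with open(vocab_path, encoding="utf-8") as f:
    vocab = json.load(f)

for token in ("<s>", "</s>"):
    if token not in vocab:
        vocab[token] = len(vocab)  # next unused id

with open(vocab_path, "w", encoding="utf-8") as f:
    json.dump(vocab, f, ensure_ascii=False, indent=2)

print("vocab size:", len(vocab))  # expect 5173 for this checkpoint
```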
