
Possible Bug #1

Open
Hannibal046 opened this issue May 17, 2023 · 2 comments

Comments

@Hannibal046

Hi, thanks for the great work on ICL evaluation in NMT.

I encountered some problems when executing python test/test_flores101.py. Could you please double-check this? The code seems not fully prepared; for example, a Python linter flags some obvious problems:

https://github.com/OwenNJU/MMT-LLM/blob/36e275dcede8ac0ab4501d7173753e007fec4e3d/test/test_flores101.py#L26-L27

https://github.com/OwenNJU/MMT-LLM/blob/36e275dcede8ac0ab4501d7173753e007fec4e3d/test/test_flores101.py#L77

There are also some problems with the Accelerate version, as discussed in Shark-NLP/OpenICL#15.
Thanks again for this work!

@Lhtie
Collaborator

Lhtie commented Jun 7, 2023

Sorry for these mistakes. The evaluation code has been updated, along with fixes for other typos. Thanks for the reminder, and feel free to reach out whenever anything confusing is spotted. As for the Accelerate version, it was indeed outdated; requirements.txt has been updated to the newest version.

@Hannibal046
Author

Hi, thanks for the response! Could you please check this? I am using the updated code, and it still fails:

python test/test_flores101.py \
>   --lang_pair deu-eng \
>   --retriever random \
>   --ice_num 8 \
>   --prompt_template "</E></X>=</Y>" \
>   --model_name facebook/xglm-7.5B \
>   --tokenizer_name facebook/xglm-7.5B \
>   --output_dir output \
>   --output_file test \
>   --seed 43
Namespace(cross_lang=False, direction_order=None, disorder=False, ex_lang=None, ice_num=8, lang_order=None, lang_pair='deu-eng', model_name='facebook/xglm-7.5B', oracle=False, output_dir='output', output_file='test', prompt_template='</E></X>=</Y>', repeat=False, retriever='random', reverse_direction=False, seed=43, tokenizer_name='facebook/xglm-7.5B')
retrieve started
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 41282.52it/s]
retrieve finished
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 146.80it/s]
Average ice num:  8.0
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
  0%|                                                                                                                                                | 0/1 [00:00<?, ?it/s]You're using a XGLMTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
/anaconda/envs/mmt/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:2382: UserWarning: `max_length` is ignored when `padding`=`True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.
  warnings.warn(
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 576.93it/s]
Traceback (most recent call last):
  File "test/test_flores101.py", line 122, in <module>
    print(f"BLEU score = {test_flores(args)}")
  File "test/test_flores101.py", line 91, in test_flores
    score = infr.score(src_lang=src, tgt_lang=tgt)
  File "MMT-LLM/openicl/icl_inferencer/icl_base_inferencer.py", line 74, in score
    return self.metric.score(predictions, src_lang=src_lang, tgt_lang=tgt_lang)
  File "MMT-LLM/openicl/icl_evaluator/icl_bleu_evaluator.py", line 25, in score
    pred_dict[idx] = predictions[idx].split()
AttributeError: 'int' object has no attribute 'split'
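The AttributeError suggests that `predictions` contains integers (possibly raw token ids) rather than decoded strings when it reaches the BLEU evaluator's loop. A minimal defensive sketch of what the failing loop in icl_bleu_evaluator.py might look like with a guard; the helper name `to_pred_dict` and the `tokenizer.decode` fallback are assumptions for illustration, not the repository's actual fix:

```python
def to_pred_dict(predictions, tokenizer=None):
    """Build {index: token list}, decoding int / int-list entries first.

    Mirrors the failing loop (pred_dict[idx] = predictions[idx].split()),
    which assumed every entry is already a string. The tokenizer fallback
    is hypothetical; without a tokenizer, ids are stringified instead.
    """
    pred_dict = {}
    for idx, pred in enumerate(predictions):
        if isinstance(pred, int):  # a bare token id slipped through
            pred = tokenizer.decode([pred]) if tokenizer else str(pred)
        elif isinstance(pred, (list, tuple)):  # a sequence of token ids
            pred = tokenizer.decode(pred) if tokenizer else " ".join(map(str, pred))
        pred_dict[idx] = pred.split()  # original code assumed str here
    return pred_dict
```

The real fix likely belongs earlier in the pipeline (decoding model outputs before scoring), but a guard like this at least surfaces where non-string predictions enter.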
