Removing Truncation/Padding during Prefix Tokenization #83

Open
tsrikris opened this issue Sep 6, 2024 · 0 comments
tsrikris commented Sep 6, 2024

In `lwm/vision_generation.py`, `max_input_length` is capped at 128 tokens in the call `img_enc, img = generate_first_frame(prompts, max_input_length=128)`. As a result, any longer prompt provided via `scripts/run_sample_image.sh` is truncated to 128 tokens. The suggestion is to update the `generate_first_frame` function to use the tokenizer's `longest` padding mode, which sizes the batch dynamically to the longest prompt instead of padding/truncating to a fixed length, i.e. `inputs = prefix_tokenizer(prompts, padding='longest', return_tensors='np')`.
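For concreteness, a minimal sketch of the proposed change, assuming `generate_first_frame` builds its inputs with a Hugging Face-style call to `prefix_tokenizer` (the "before" form shown here is an assumption about the current code; only the `padding='longest'` call comes from the suggestion above):

```python
# Inside generate_first_frame (hypothetical surrounding code):

# Before (assumed): fixed-length padding/truncation caps every prompt
# at max_input_length tokens, silently cutting off longer prompts.
inputs = prefix_tokenizer(
    prompts,
    padding='max_length',
    truncation=True,
    max_length=max_input_length,
    return_tensors='np',
)

# After (suggested): pad dynamically to the longest prompt in the batch,
# with no truncation, so long prompts are passed through intact.
inputs = prefix_tokenizer(
    prompts,
    padding='longest',
    return_tensors='np',
)
```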
