Code not running as it is for st1 due to missing data file creation and wrong file paths #2

ameliesc · 2024-08-16T09:42:31Z

Currently it is not possible to run this code as there are several bugs.

train.py does not work in st1 for either adapters or transformers, because it loads data files that do not exist and are not created anywhere:

semeval2023-multilingual-news-detection/st1/adapters/train.py

Line 45 in 9dab618

train_val_df = load_df('../../data/st1_joined_data/training.tsv')

semeval2023-multilingual-news-detection/st1/adapters/train.py

Line 46 in 9dab618

test_df = load_df('../../data/st1_joined_data/dev.tsv')

semeval2023-multilingual-news-detection/st1/fft/fft_st1.py

Line 48 in 9dab618

training_set=pd.read_csv("./data/cleaned_original_train.tsv", skiprows=1,

semeval2023-multilingual-news-detection/st1/fft/fft_st1.py

Line 51 in 9dab618

satire_en=pd.read_csv("./data/satire_external_en.tsv", skiprows=1,

...

The README file path is wrong.

The README for st1 says to extract the files into ../data/articles/external_satire, but process_external_satire.py reads from ../data/ext_satire:

semeval2023-multilingual-news-detection/st1/process_external_satire.py

Line 19 in 9dab618

satire_files = glob.glob('../data/ext_satire/*.txt')

I suggest cloning this repo and trying to run train.py with the instructions given here.
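
To make the missing files easy to confirm, here is a minimal sketch of a check. The paths are copied from the lines quoted above; `EXPECTED_FILES` and `find_missing` are names made up for illustration, and the sketch assumes each script is run from its own directory (st1/adapters and st1/fft), which may not match how the authors run them.

```python
from pathlib import Path

# Paths copied from the quoted snippets. Assumption: each path is resolved
# relative to the directory of the script that loads it.
EXPECTED_FILES = {
    "st1/adapters": [
        "../../data/st1_joined_data/training.tsv",
        "../../data/st1_joined_data/dev.tsv",
    ],
    "st1/fft": [
        "./data/cleaned_original_train.tsv",
        "./data/satire_external_en.tsv",
    ],
}


def find_missing(repo_root, expected=EXPECTED_FILES):
    """Return (script_dir, relative_path) pairs whose target file does not exist."""
    root = Path(repo_root)
    missing = []
    for script_dir, rel_paths in expected.items():
        for rel in rel_paths:
            if not (root / script_dir / rel).is_file():
                missing.append((script_dir, rel))
    return missing
```

Calling `find_missing(".")` from the root of a fresh clone should list every file that the scripts expect but that is not present.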


freddyheppell · 2024-08-16T11:41:25Z

Hi, thanks for bringing these issues to our attention.

For now, as far as I remember:

The TSVs in the st1_joined_data dir contain a data frame of all the article texts along with their ID, language and label.
The satire_external_en file can be produced by using the process_external_satire script to load the data and output a single file.
The cleaned_original_en file is the result of running the functions in clean_text over the files.

But we’ll confirm all that and add the missing code as soon as we can.
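
In the meantime, the step of loading the external satire data and writing a single file could look roughly like the sketch below. This is not the actual process_external_satire script: the function name `combine_text_files`, the column names, the use of the file name as the ID, and the one-article-per-.txt layout are all assumptions made for illustration.

```python
import csv
import glob
import os


def combine_text_files(pattern, out_path, language="en", label="satire"):
    """Write one TSV row per matching .txt file and return the row count."""
    paths = sorted(glob.glob(pattern))
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        # Hypothetical header; the real column layout is not confirmed.
        writer.writerow(["id", "language", "label", "text"])
        for path in paths:
            with open(path, encoding="utf-8") as f:
                # Collapse all whitespace so each article stays on one TSV line.
                text = " ".join(f.read().split())
            # Assumption: the file name (without extension) serves as the ID.
            article_id = os.path.splitext(os.path.basename(path))[0]
            writer.writerow([article_id, language, label, text])
    return len(paths)
```

For example, `combine_text_files("../data/ext_satire/*.txt", "satire_external_en.tsv")` would use the glob pattern quoted in the issue. The header row is written because fft_st1.py reads its TSVs with skiprows=1.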
