Having trouble getting run_CLTrain.sh to execute #4

Elfinwang · 2024-11-17T16:07:25Z

I’m having trouble getting the run_CLTrain.sh script to execute.

Where can I get the file referenced by 'file_name="data/data_simcse/${train_file}_for_simcse.csv"'?
I would also appreciate some guidance on recommended training parameters, such as the number of epochs to use.
For now, I downloaded 'nli_for_simcse.csv' from 'https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/nli_for_simcse.csv', but I encountered an error at the following line of code (src/train.py, line 319):
examples[sent0_cname][idx] = conv_dict(ast.literal_eval(examples[sent0_cname][idx].replace('−inf', '−2e308')))
The error occurs when trying to parse the string with ast.literal_eval.

I would appreciate your help!!!


LZ12DH · 2024-11-26T06:55:41Z

Hi,

Thanks for the feedback!

The '${train_file}_for_simcse.csv' file is generated by running 'src/prepare_CL_dataset.py'. Sorry, my training data is over 2GB, so I could not upload it to the repo. You can run 'prepare_CL_dataset.py' on some query triplets to generate the files.

As for the error you encountered, the reason is that we use a tree-based query encoding, which is different from the plain text used in SimCSE. If you have trouble running the above-mentioned code, you can also contact me via email at [email protected], and I can share a small set of training data with you so you can check whether the error still happens.
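To illustrate why a plain-text SimCSE file would fail at that line, here is a minimal sketch. It assumes each cell is expected to hold a Python-literal string (the tree-like example and its field names below are hypothetical, not the repo's actual encoding), it uses the ASCII hyphen-minus in '-inf', and it leaves out 'conv_dict', which is not shown in this thread.

```python
import ast


def parse_cell(cell: str):
    """Parse one CSV cell roughly as the quoted line does (conv_dict omitted)."""
    # literal_eval cannot read the bare name inf, but it accepts -2e308,
    # which overflows to float('-inf') when parsed as a float literal.
    return ast.literal_eval(cell.replace("-inf", "-2e308"))


def is_literal(cell: str) -> bool:
    """Return True if the cell parses as a Python literal."""
    try:
        parse_cell(cell)
        return True
    except (ValueError, SyntaxError):
        return False


# A plain sentence, like the rows in nli_for_simcse.csv.
plain_sentence = "A man is playing a guitar."

# Hypothetical tree-like cell; the real field names may differ.
tree_like = "{'op': 'Seq Scan', 'cost': -inf, 'children': []}"

print(is_literal(plain_sentence))      # False
print(is_literal(tree_like))           # True
print(parse_cell(tree_like)["cost"])   # -inf
```

Running 'is_literal' over the first few cells of a CSV is a quick way to check whether a file is in the literal-string form this parsing step expects before starting training.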

Hope this reply clarifies your doubts!

MattCremeens · 2025-01-02T22:04:42Z

@LZ12DH, would you mind sharing a sample of the training triplets that are read from

dataset_file = '../data/data_simcse/' + dataset + '/training_triplets_' + dataset + '_total.csv'

Is each row just an id, a query, a query rewrite that increases efficiency, and a query rewrite that decreases efficiency? How did you create this set of triplets? Did you find them somewhere or construct them yourself?

Elfinwang changed the title ~~Having trouble getting run_CLTrain.sh~~ Having trouble getting run_CLTrain.sh to execute Nov 18, 2024
