The '${train_file}_for_simcse.csv' file is generated by running 'src/prepare_CL_dataset.py'. Sorry, my training data is over 2GB, so I could not upload it to the repo. You can run 'prepare_CL_dataset.py' on some query triplets to produce the files.
The error you encountered occurs because we use a tree-based query encoding, which differs from the plain text used in SimCSE. If you have trouble running the above-mentioned code, you can also contact me via email at [email protected], and I can share a small set of training data with you to see whether the error still happens.
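To illustrate the difference, here is a minimal sketch of why a plain-text sentence fails where a structured encoding parses. The helper `try_parse` and the dictionary shape are made up for illustration; the real tree encoding produced by 'prepare_CL_dataset.py' may look different.

```python
import ast

def try_parse(cell):
    """Return the parsed Python literal, or None if the cell is not a literal."""
    try:
        return ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return None

# A raw sentence, as SimCSE would expect in its CSV files.
plain_text = "select name from users where age > 30"
# Hypothetical tree-style encoding stored as a string (shape is an assumption).
tree_like = "{'op': 'Seq Scan', 'children': []}"

plain_result = try_parse(plain_text)  # not a Python literal, so parsing fails
tree_result = try_parse(tree_like)    # parses into a dict
print(plain_result, tree_result)
```

Running a check like this over the first few rows of the CSV may show quickly whether the file holds plain text or the encoded form.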
Is it just an id, a query, a query rewrite that increases efficiency, and a query rewrite that decreases efficiency? How were you able to create such a set of triplets? Did you find them somewhere or make them up?
I’m having trouble getting the run_CLTrain.sh script to execute.
examples[sent0_cname][idx] = conv_dict(ast.literal_eval(examples[sent0_cname][idx].replace('-inf', '-2e308')))
The error occurs when trying to parse the string with ast.literal_eval.
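For anyone hitting the same line, here is a small sketch of what the replace call appears to be for, assuming the cell is a string form of a dictionary that can contain negative infinity. `ast.literal_eval` rejects the bare name `inf`, while the literal `-2e308` overflows to negative infinity and parses fine. The key name `cost` is hypothetical.

```python
import ast
import math

# Hypothetical cell content; the real encoding may have a different shape.
raw = "{'cost': -inf}"

# literal_eval only accepts literals, and `inf` is a name, not a literal.
try:
    ast.literal_eval(raw)
    raw_parses = True
except (ValueError, SyntaxError):
    raw_parses = False

# 2e308 exceeds the largest double, so the literal evaluates to infinity.
fixed = ast.literal_eval(raw.replace('-inf', '-2e308'))

print(raw_parses, fixed, math.isinf(fixed['cost']))
```

If the string is not a literal at all (for example a plain SQL sentence), the replacement does not help and the parse still fails.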
I would appreciate your help!!!