Natural language processing course 2022/23: Sentence paraphrasing

Team members:

Nik Pirnat, [email protected]
Martin Bavčar, [email protected]
Anže Glušič, [email protected]

Group public acronym/name: TM9

Environment setup

conda create -n nlp-project python=3.8 -c conda-forge
conda activate nlp-project
pip install -r requirements.txt

Preprocessing

The ccKres dataset was preprocessed with preprocessing.py.

Back translation

The back-translated dataset was generated with back_translation.py, using a Slovene NMT model.
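For readers unfamiliar with the idea, back translation can be sketched roughly as follows. This is a minimal illustration, not the code in back_translation.py: the `Translator` type, the `back_translate` function, and the two translator callables are hypothetical names, and the choice of pivot language (English in the comments) is an assumption. A real NMT model would be wrapped in the two callables.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical type: any function mapping a batch of sentences to translations.
Translator = Callable[[List[str]], List[str]]


def back_translate(
    sentences: Iterable[str],
    to_pivot: Translator,
    from_pivot: Translator,
) -> List[Tuple[str, str]]:
    """Return (original, paraphrase) pairs from a round trip through a pivot language."""
    # Normalise whitespace and skip empty lines.
    originals = [s.strip() for s in sentences if s.strip()]

    pivot = to_pivot(originals)      # e.g. Slovene -> English (assumed pivot)
    round_trip = from_pivot(pivot)   # e.g. English -> Slovene

    pairs = []
    for src, para in zip(originals, round_trip):
        para = para.strip()
        # Drop empty outputs and exact copies, which carry no paraphrase signal.
        if para and para.lower() != src.lower():
            pairs.append((src, para))
    return pairs
```

Passing the translators in as plain callables keeps the sketch independent of any particular NMT toolkit, so the pairing and filtering logic can be tried out with simple stand-ins before a real model is plugged in.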

Training

Training and testing were run using run_train.py and run_test.py, respectively.

Inference

Our models can be downloaded here. Refer to inference.ipynb to run inference on the t5-sl-large and t5-sl-small models. Refer to baseline.ipynb for the baseline.
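Outside the notebook, inference with a downloaded checkpoint might look roughly like the sketch below. It assumes the models load through the Hugging Face transformers library as standard seq2seq checkpoints; `load_model`, `paraphrase`, the `model_dir` argument, and the generation settings are illustrative names and values, not taken from inference.ipynb. Any task prefix the models may expect is not shown.

```python
def load_model(model_dir):
    """Load a seq2seq checkpoint from a local directory (hypothetical path)."""
    # Imported lazily so the rest of the sketch does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
    return model, tokenizer


def paraphrase(model, tokenizer, sentence, max_length=128, num_beams=4):
    """Generate one paraphrase of `sentence` with beam search."""
    # Tokenize the input; long inputs are truncated to max_length tokens.
    inputs = tokenizer(
        sentence, return_tensors="pt", truncation=True, max_length=max_length
    )
    # Assumed generation settings; the notebook may use different ones.
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    # Decode the best hypothesis, dropping padding and end-of-sequence tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would then be `model, tokenizer = load_model(model_dir)` followed by `paraphrase(model, tokenizer, sentence)`, where `model_dir` points at the unpacked t5-sl-small or t5-sl-large download.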


UL-FRI-NLP-Course-2022-23/nlp-course-team-9

Folders and files

Latest commit

History

Repository files navigation

Natural language processing course 2022/23: Sentence paraphrasing

Enviroment setup

Preprocessing

Back translation

Training

Inference

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages