Paraphrase Detection with BiMPM

Description

This repository contains a PyTorch implementation of the Bilateral Multi-Perspective Matching (BiMPM) model described in the paper by Wang et al. The model is applied to paraphrase detection on the Quora Question Pairs dataset. To keep results comparable, we use the same train/dev/test partition as Wang et al. The program takes two phrases as input and predicts a value indicating whether they are paraphrases of each other.
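
The matching operation at the core of BiMPM, as defined by Wang et al., compares two vectors from several "perspectives": each perspective re-weights the dimensions of both vectors element-wise and then takes their cosine similarity. The sketch below illustrates that idea in plain Python. It is not the repository's PyTorch code, and the function names `cosine` and `multi_perspective_match` are hypothetical.

```python
import math


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero when a vector is all zeros.
    return dot / max(norm_u * norm_v, eps)


def multi_perspective_match(v1, v2, weights):
    """Return one cosine score per perspective.

    `weights` holds one weight vector per perspective. Each weight vector
    rescales the dimensions of v1 and v2 before the cosine is computed.
    In the real model these weights are learned; here they are given.
    """
    scores = []
    for w in weights:
        weighted_v1 = [wi * a for wi, a in zip(w, v1)]
        weighted_v2 = [wi * b for wi, b in zip(w, v2)]
        scores.append(cosine(weighted_v1, weighted_v2))
    return scores


# Two perspectives over 2-dimensional vectors (toy values, not learned).
scores = multi_perspective_match(
    v1=[1.0, 0.0],
    v2=[1.0, 1.0],
    weights=[[1.0, 1.0], [1.0, 0.0]],
)
```

With these toy weights, the first perspective sees both dimensions and the second sees only the first dimension, so the two scores differ even though the input vectors are the same.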

Requirements

python 3.5
torch 0.1.12

Train

To train the model with the settings described in the paper, run:

python trainer.py --embedding wordvec.txt --data quora_data/ --word-len 15 --seq-len 50 --perspectives 5 --batch-size 32 --cuda
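
For readers unfamiliar with the flags, the following sketch shows how a command line like the one above could be parsed with `argparse`. It is an illustration only: the actual parsing in `trainer.py` may differ, and the help texts and default values here are assumptions, not taken from the repository.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the training command (sketch)."""
    parser = argparse.ArgumentParser(description="Train BiMPM (illustrative sketch)")
    parser.add_argument("--embedding", required=True,
                        help="path to the pretrained word vector file")
    parser.add_argument("--data", required=True,
                        help="directory containing the dataset files")
    # The interpretations of the two length flags below are assumptions.
    parser.add_argument("--word-len", type=int, default=15,
                        help="assumed: maximum number of characters per word")
    parser.add_argument("--seq-len", type=int, default=50,
                        help="assumed: maximum number of words per sentence")
    parser.add_argument("--perspectives", type=int, default=5,
                        help="number of matching perspectives")
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--cuda", action="store_true",
                        help="run on GPU if set")
    return parser


# Parse the same arguments as the training command, passed as a list
# so the example does not depend on sys.argv.
args = build_parser().parse_args([
    "--embedding", "wordvec.txt",
    "--data", "quora_data/",
    "--word-len", "15",
    "--seq-len", "50",
    "--perspectives", "5",
    "--batch-size", "32",
    "--cuda",
])
```

Note that `argparse` converts dashes in flag names to underscores, so `--word-len` is read back as `args.word_len`.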

Test

To evaluate a trained model on the test dataset, run:

python test.py --embedding wordvec.txt --data quora_data/test.tsv --word-len 15 --seq-len 50 --perspectives 5 --batch-size 32 --model model.pth
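
As a rough illustration of what evaluating a paraphrase detector involves, the sketch below turns predicted scores into binary decisions and computes accuracy against gold labels. It assumes the model outputs a score in [0, 1] and that a 0.5 threshold is used; neither is confirmed by this README, and the `accuracy` helper is hypothetical rather than part of `test.py`.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the gold label.

    scores: predicted values, assumed to lie in [0, 1]
    labels: gold labels, 1 for paraphrase and 0 otherwise
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Toy example: four question pairs, one of which is misclassified.
result = accuracy(scores=[0.9, 0.2, 0.6, 0.4], labels=[1, 0, 0, 0])
```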

Issues

Please report any issues to me at [email protected].

Reference

Zhiguo Wang, Wael Hamza, Radu Florian. Bilateral Multi-Perspective Matching for Natural Language Sentences. IJCAI, 2017.
Quora Question Pairs dataset.

