This repository contains an implementation of a sequence-to-sequence model with attention for the Neural Machine Translation task. In machine translation, the goal is to convert a sentence from the source language (e.g. Spanish) to the target language (e.g. English).
The implementation is part of assignment #5 for Stanford's CS224n: Natural Language Processing with Deep Learning.
Assignment handout: https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1194/assignments/a5.pdf
Most work is concentrated in:
- highway.py: Highway network for the encoder.
- cnn.py: Convolutional network for the encoder.
- vocab.py: Vocabulary class, taking care of tokenization and pre-processing.
- model_embeddings.py: torch.nn.Module subclass responsible for producing CNN-based embeddings (training code isn't contained here).
- nmt_model.py: implements the entire encoder-decoder architecture.
- char_decoder.py: implements the character decoder.
See implementation requirements in handout/a5.pdf
- Create a new conda environment:
conda env create --file local_env.yml
- Generate the vocabulary:
sh run.sh vocab
- Train and test locally:
sh run.sh train_local_q2
sh run.sh test_local_q2
- If training on GPU, install additional packages:
pip install -r gpu_requirements.txt
- Run training on GPU:
sh run.sh train
The CNN-based embedding (model_embeddings.py) consists of the following steps:
- Converting words to character indices
- Padding and embedding lookup
- Convolutional network, summarising the results using max-pooling
- Highway layer and dropout, providing a skip-connection controlled by a dynamic gate (learned parameters)
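The steps above can be sketched in PyTorch as follows. This is an illustrative sketch, not the assignment's exact API: class names, hyperparameters (character embedding size, kernel width, dropout rate), and method signatures are assumptions.

```python
import torch
import torch.nn as nn


class Highway(nn.Module):
    """Highway layer: a skip-connection controlled by a learned, input-dependent gate."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        proj = torch.relu(self.proj(x))
        gate = torch.sigmoid(self.gate(x))
        # Gate interpolates between the transformed input and the raw input
        return gate * proj + (1 - gate) * x


class CharCNNEmbedding(nn.Module):
    """CNN-based word embedding built from character indices (sizes are assumptions)."""

    def __init__(self, num_chars, char_dim=50, word_dim=256,
                 kernel=5, pad_idx=0, dropout=0.3):
        super().__init__()
        self.char_emb = nn.Embedding(num_chars, char_dim, padding_idx=pad_idx)
        self.conv = nn.Conv1d(char_dim, word_dim, kernel_size=kernel)
        self.highway = Highway(word_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, char_ids):
        # char_ids: (batch, max_word_len) character indices for each padded word
        x = self.char_emb(char_ids)        # (batch, max_word_len, char_dim)
        x = x.transpose(1, 2)              # Conv1d expects (batch, channels, length)
        x = torch.relu(self.conv(x))       # (batch, word_dim, L_out)
        x, _ = x.max(dim=2)                # max-pooling over character positions
        return self.dropout(self.highway(x))  # (batch, word_dim)
```

The dynamic gate in the highway layer lets the network learn, per dimension, how much of the convolutional output to pass through versus copying the input unchanged.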
The character decoder (char_decoder.py) consists of the following steps:
- Forward computation of the character decoder
- Training of the character decoder
- Decoding from the character decoder
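A minimal sketch of these three steps with an LSTM over characters is shown below. Names, hyperparameters, and the greedy decoding loop are assumptions for illustration, not the assignment's required interface.

```python
import torch
import torch.nn as nn


class CharDecoder(nn.Module):
    """LSTM character decoder sketch (hyperparameters are illustrative assumptions)."""

    def __init__(self, num_chars, char_dim=50, hidden=256, pad_idx=0):
        super().__init__()
        self.emb = nn.Embedding(num_chars, char_dim, padding_idx=pad_idx)
        self.lstm = nn.LSTM(char_dim, hidden)
        self.out = nn.Linear(hidden, num_chars)
        self.pad_idx = pad_idx

    def forward(self, chars, state=None):
        # Forward computation: chars is (seq_len, batch) of character indices
        x = self.emb(chars)
        h, state = self.lstm(x, state)
        return self.out(h), state  # scores: (seq_len, batch, num_chars)

    def train_step(self, chars, state=None):
        # Training: cross-entropy between scores at step t and the gold char at t+1
        scores, _ = self.forward(chars[:-1], state)
        return nn.functional.cross_entropy(
            scores.view(-1, scores.size(-1)), chars[1:].reshape(-1),
            ignore_index=self.pad_idx, reduction="sum")

    def greedy_decode(self, state, start_idx, max_len=21):
        # Decoding: feed back the argmax character at each step
        batch = state[0].size(1)
        cur = torch.full((1, batch), start_idx, dtype=torch.long)
        outputs = []
        for _ in range(max_len):
            scores, state = self.forward(cur, state)
            cur = scores.argmax(dim=-1)
            outputs.append(cur)
        return torch.cat(outputs, dim=0)  # (max_len, batch)
```

In the full model, the initial LSTM state would come from the word-level decoder's hidden state, and decoding would stop at an end-of-word character; both details are omitted here for brevity.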