My implementation of the transformer architecture from the paper "Attention Is All You Need".
I wrote this code to learn the basics of PyTorch and practice my deep learning skills. There is a good chance that the implementation is wrong, so I do not recommend using it; it's just a student project.
I used the "Attention Is All You Need" paper, as well as several external resources:
- https://www.datacamp.com/tutorial/building-a-transformer-with-py-torch
- https://github.com/hyunwoongko/transformer
- https://pytorch.org/tutorials/beginner/transformer_tutorial.html
- https://gmongaras.medium.com/how-do-self-attention-masks-work-72ed9382510f
Feel free to take a look at these resources.
You just have to follow these two steps:
- Install modules:
  ```
  pip install -r requirements.txt
  ```
- Run the training script:
  ```
  python3 train.py
  ```
Things I still plan to do:
- Implement a learning rate scheduler (see the warmup-schedule sketch below)
- Add a script to run prediction (see the greedy-decoding sketch below)
- Use SentencePiece for tokenization (see the SentencePiece sketch below)
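For the learning rate scheduler, the paper prescribes a warmup schedule: lrate = d_model^(-0.5) * min(step^(-0.5), step * warmup_steps^(-1.5)). Below is a minimal sketch of wiring that formula into PyTorch's `LambdaLR`; the `model` here is a stand-in, and the d_model/warmup values are the paper's base settings, not this repo's actual code.

```python
import torch

# Warmup schedule from "Attention Is All You Need":
# lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
def noam_lr(step, d_model=512, warmup_steps=4000):
    step = max(step, 1)  # guard against step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

model = torch.nn.Linear(512, 512)  # stand-in for the transformer
# The paper pairs this schedule with Adam(beta2=0.98, eps=1e-9);
# base lr=1.0 so the lambda's value becomes the effective rate.
optimizer = torch.optim.Adam(model.parameters(), lr=1.0,
                             betas=(0.9, 0.98), eps=1e-9)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_lr)

for step in range(5):   # in train.py this would wrap the real loop
    optimizer.step()    # after loss.backward() in real training
    scheduler.step()    # advance the schedule once per step
```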
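For the prediction script, greedy decoding is the simplest starting point. A rough sketch, assuming a hypothetical `model(src, tgt)` forward that returns logits of shape (batch, tgt_len, vocab_size); the real forward signature in this repo may differ.

```python
import torch

@torch.no_grad()
def greedy_decode(model, src_ids, bos_id, eos_id, max_len=64):
    """Feed the target back in one token at a time, always picking argmax."""
    tgt = torch.tensor([[bos_id]])  # start with the beginning-of-sequence id
    for _ in range(max_len):
        logits = model(src_ids, tgt)             # assumed (1, tgt_len, vocab)
        next_id = logits[0, -1].argmax().item()  # most likely next token
        tgt = torch.cat([tgt, torch.tensor([[next_id]])], dim=1)
        if next_id == eos_id:                    # stop at end-of-sequence
            break
    return tgt.squeeze(0).tolist()
```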
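For SentencePiece, the library can train a subword model straight from a plain-text file. A minimal sketch; `corpus.txt`, the vocabulary size, and the special-token ids are placeholders to adapt to the actual dataset.

```python
import sentencepiece as spm

# Train a subword model on a raw text corpus (one sentence per line).
spm.SentencePieceTrainer.train(
    input="corpus.txt",    # placeholder path to the training text
    model_prefix="spm",    # writes spm.model and spm.vocab
    vocab_size=8000,       # placeholder; tune to the dataset
    pad_id=0, unk_id=1, bos_id=2, eos_id=3)

# Load the trained model and round-trip a sentence.
sp = spm.SentencePieceProcessor(model_file="spm.model")
ids = sp.encode("Attention is all you need.", out_type=int)
print(ids)             # token ids to feed the transformer
print(sp.decode(ids))  # back to text
```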