Jourdelune/Transformer

Repository files navigation

Transformer implementation

My implementation of the transformer architecture from the paper "Attention is all you need".
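The core building block of the paper is scaled dot-product attention. As a rough sketch of the idea (not code taken from this repository), it can be written in a few lines of PyTorch:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as defined in the paper.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 are excluded from attention.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v
```

The actual implementation in this repository may organize things differently (e.g. inside a multi-head attention module), but the formula is the same.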

Why another implementation?

I made this code to learn the basics of PyTorch and practice my deep learning skills. There is a good chance that the implementation is wrong, so I do not recommend using it; it's just a student project.

What I use to build the project

I used the "Attention is all you need" paper, but also a lot of external resources.

You can take a look at those resources.

Training

You just have to follow these two steps:

  1. Install the dependencies: `pip install -r requirements.txt`

  2. Run the training script: `python3 train.py`

TODO

  • Implement a learning rate scheduler
  • Add a script to run predictions
  • Use SentencePiece for tokenization
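For the first TODO item, the paper itself prescribes a warmup-then-decay schedule: lrate = d_model^(-0.5) · min(step^(-0.5), step · warmup_steps^(-1.5)). A minimal sketch of that formula (the function name and defaults are mine, not from this repository):

```python
def noam_lr(step, d_model=512, warmup_steps=4000):
    # Learning rate schedule from "Attention is all you need":
    # increases linearly for the first warmup_steps, then decays as step^-0.5.
    step = max(step, 1)  # avoid division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```

In PyTorch this could be plugged into `torch.optim.lr_scheduler.LambdaLR` as the multiplicative factor, with the optimizer's base learning rate set to 1.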
