The transformer architecture introduced in the "Attention Is All You Need" paper, implemented from scratch in PyTorch out of pure curiosity about how it works.
If you'd like to try it out, clone this repository and install Poetry for dependency management.
Change into the root directory of this project and run:

```
poetry shell
```
Then, from the same directory, run:

```
python main.py
```
This will use some predefined tokens in the main.py file as input to the transformer network and print the network's output.
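For a rough idea of what happens under the hood, here is a minimal, self-contained sketch of the kind of computation the network performs on those tokens: embed the token IDs, apply one scaled dot-product self-attention step, and print the result. The token IDs, dimensions, and projection layers below are illustrative assumptions, not the actual contents of main.py.

```python
import torch
import torch.nn.functional as F

# Hypothetical token IDs standing in for the predefined tokens in main.py.
tokens = torch.tensor([[3, 14, 15, 92, 6]])          # shape: (batch=1, seq_len=5)

vocab_size, embed_dim = 100, 32                       # illustrative sizes
embedding = torch.nn.Embedding(vocab_size, embed_dim)
x = embedding(tokens)                                 # (1, 5, 32)

# One scaled dot-product self-attention step, the core operation of the architecture.
to_q = torch.nn.Linear(embed_dim, embed_dim, bias=False)
to_k = torch.nn.Linear(embed_dim, embed_dim, bias=False)
to_v = torch.nn.Linear(embed_dim, embed_dim, bias=False)

q, k, v = to_q(x), to_k(x), to_v(x)
scores = q @ k.transpose(-2, -1) / embed_dim ** 0.5   # (1, 5, 5) attention scores
weights = F.softmax(scores, dim=-1)                   # rows sum to 1
out = weights @ v                                     # (1, 5, 32) attended output

print(out.shape)
print(out)
```

The real network stacks steps like this (plus multi-head attention, feed-forward layers, and normalization), but the shape of the computation is the same: token IDs in, a tensor of contextualized representations out.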
I plan to implement a full generative model and will update this README once I do.
- "Attention Is All You Need" by Vaswani et al.
- "Transformers from scratch" by Peter Bloem, which I took so much delight in reading.
- "Transformers from Scratch" by Brandon Rohrer, another lovely article.