This repository contains an implementation of the seminal paper "Attention Is All You Need" by Vaswani et al. (2017), which introduced the Transformer architecture. The Transformer revolutionized deep learning by relying entirely on attention mechanisms to draw global dependencies between input and output, leading to state-of-the-art performance on machine translation and many other NLP tasks.
The main goal of this project is to recreate the core ideas of the Transformer model from scratch using PyTorch, with a focus on implementing the scaled dot-product attention and multi-head attention mechanisms. This implementation follows the paper closely, providing insights into the fundamental building blocks of the architecture.
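As a rough illustration of the two mechanisms this project focuses on, here is a minimal PyTorch sketch of scaled dot-product attention (Eq. 1 in the paper, softmax(QK^T / sqrt(d_k))V) and the multi-head wrapper around it. The function and class names below are illustrative, not necessarily the identifiers used in this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / (d_k ** 0.5)
    if mask is not None:
        # Positions where mask == 0 receive ~zero attention weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)

class MultiHeadAttention(nn.Module):
    # Hypothetical module name; d_model=512 and num_heads=8 follow the paper's base model.
    def __init__(self, d_model=512, num_heads=8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, query, key, value, mask=None):
        batch = query.size(0)

        def split_heads(x):
            # (batch, seq_len, d_model) -> (batch, num_heads, seq_len, d_head)
            return x.view(batch, -1, self.num_heads, self.d_head).transpose(1, 2)

        q = split_heads(self.w_q(query))
        k = split_heads(self.w_k(key))
        v = split_heads(self.w_v(value))
        out = scaled_dot_product_attention(q, k, v, mask)
        # Concatenate heads back to (batch, seq_len, d_model), then apply the output projection.
        out = out.transpose(1, 2).contiguous().view(batch, -1, self.num_heads * self.d_head)
        return self.w_o(out)
```

A quick self-attention usage example (query, key, and value all come from the same sequence):

```python
x = torch.randn(2, 10, 512)   # (batch, seq_len, d_model)
mha = MultiHeadAttention()
print(mha(x, x, x).shape)     # torch.Size([2, 10, 512])
```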
References:
Original paper: "Attention Is All You Need", Vaswani et al. (2017)
Learning video: "Pytorch Transformers from Scratch (Attention is all you need)" by Aladdin Persson