
Recreating "Attention is All You Need"

This repository contains an implementation of the seminal paper "Attention is All You Need" by Vaswani et al. (2017), which introduced the Transformer architecture. The Transformer revolutionized deep learning by relying entirely on attention mechanisms to draw global dependencies between input and output, leading to state-of-the-art performance on machine translation and other natural language processing tasks.

The main goal of this project is to recreate the core ideas of the Transformer from scratch using PyTorch, with a focus on implementing the scaled dot-product attention and multi-head attention mechanisms. The implementation follows the paper closely, so it can serve as a reference for the fundamental building blocks of the architecture.
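
As a rough sketch of what these two building blocks look like, here is a minimal PyTorch version of scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, wrapped in a multi-head module. Names such as `MultiHeadAttention`, `embed_size`, and `num_heads` are illustrative and may differ from the code in this repository.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    # (batch, heads, seq_q, seq_k) attention scores, scaled by sqrt(d_k)
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)


class MultiHeadAttention(nn.Module):
    def __init__(self, embed_size, num_heads):
        super().__init__()
        assert embed_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_size // num_heads
        # Learned projections for queries, keys, values, and the output
        self.w_q = nn.Linear(embed_size, embed_size)
        self.w_k = nn.Linear(embed_size, embed_size)
        self.w_v = nn.Linear(embed_size, embed_size)
        self.w_o = nn.Linear(embed_size, embed_size)

    def forward(self, query, key, value, mask=None):
        batch = query.size(0)

        # Project, then split the embedding into num_heads subspaces:
        # (batch, seq, embed) -> (batch, heads, seq, head_dim)
        def split(x, proj):
            x = proj(x)
            return x.view(batch, -1, self.num_heads, self.head_dim).transpose(1, 2)

        q = split(query, self.w_q)
        k = split(key, self.w_k)
        v = split(value, self.w_v)
        out = scaled_dot_product_attention(q, k, v, mask)
        # Concatenate heads: (batch, heads, seq, head_dim) -> (batch, seq, embed)
        out = out.transpose(1, 2).contiguous()
        out = out.view(batch, -1, self.num_heads * self.head_dim)
        return self.w_o(out)
```

For example, calling the module with the same tensor as query, key, and value gives self-attention:

```python
mha = MultiHeadAttention(embed_size=512, num_heads=8)
x = torch.randn(2, 10, 512)   # (batch, seq_len, embed_size)
out = mha(x, x, x)
print(out.shape)              # torch.Size([2, 10, 512])
```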


References:

Original Paper: "Attention Is All You Need", Vaswani et al., 2017 (arXiv:1706.03762)
Learning Video: "Pytorch Transformers from Scratch (Attention is all you need)" by Aladdin Persson
