This project implements a transformer-based machine translation model using PyTorch. The model is designed to translate text from one language to another, demonstrating the power of attention mechanisms in sequence-to-sequence tasks.
- Implements a full transformer architecture for translation
- Includes positional encoding for sequence order preservation (see the sketch after this list)
- Uses multi-head attention mechanism
- Supports customizable model parameters (layers, heads, etc.)
- Includes a basic tokenization system
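As a quick illustration of the positional encoding mentioned in the feature list, here is a minimal sketch of the standard sinusoidal scheme from the original paper. The function name and implementation details are illustrative assumptions and may differ from the actual code in `main.py`.

```python
import math
import torch

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    """Return a (max_len, d_model) matrix of sinusoidal position encodings.

    Minimal sketch of the scheme from "Attention Is All You Need";
    the repository's own implementation may differ in detail.
    """
    position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)            # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))                        # (d_model/2,)
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
    return pe

# The encoding is added to the token embeddings before the first encoder layer:
# x = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```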
- Python 3.7+
- PyTorch 1.7+
- NumPy
- Clone the repository:

  git clone https://github.com/frost-head/translation.git
  cd translation

- Install the required packages:

  pip install torch numpy
To use the pre-trained weights for this model:
- Download the weights from this Google Drive link.
- Create a `checkpoints` folder in the root directory of the project if it doesn't exist:

  mkdir checkpoints

- Move the downloaded weight files into the `checkpoints` folder.
To train the model:
python main.py
To use the pre-trained weights, make sure they are in the `checkpoints` folder before running the script.
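As a rough sketch of how such pre-trained weights might be loaded, assuming a checkpoint saved with `torch.save`; the file name `checkpoints/model.pt` and the state-dict layout below are placeholders, not guaranteed to match what `main.py` produces.

```python
import torch

# Hypothetical file name; substitute the actual checkpoint downloaded above.
checkpoint = torch.load("checkpoints/model.pt", map_location="cpu")

# If the checkpoint is a plain state dict:
# model.load_state_dict(checkpoint)

# If it wraps the state dict in a larger dict (a common convention):
# model.load_state_dict(checkpoint["model_state_dict"])
```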
The transformer model consists of:
- An encoder stack with multiple layers of self-attention and feed-forward networks
- A decoder stack with self-attention, encoder-decoder attention, and feed-forward networks
- Multi-head attention mechanism for capturing different aspects of the input (sketched below this list)
- Positional encoding to maintain sequence order information
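For readers new to the architecture, the multi-head attention mechanism noted above can be sketched as follows. This is a generic, from-scratch illustration of scaled dot-product attention split across heads, not the exact module used in this repository; masking and dropout details in particular may differ.

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Minimal multi-head attention sketch (illustrative only)."""

    def __init__(self, d_model: int, nhead: int):
        super().__init__()
        assert d_model % nhead == 0
        self.d_head = d_model // nhead
        self.nhead = nhead
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, query, key, value, mask=None):
        batch, q_len, _ = query.shape
        k_len = key.shape[1]

        # Project and split into heads: (batch, nhead, seq_len, d_head)
        q = self.q_proj(query).view(batch, q_len, self.nhead, self.d_head).transpose(1, 2)
        k = self.k_proj(key).view(batch, k_len, self.nhead, self.d_head).transpose(1, 2)
        v = self.v_proj(value).view(batch, k_len, self.nhead, self.d_head).transpose(1, 2)

        # Scaled dot-product attention per head
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        if mask is not None:
            scores = scores.masked_fill(mask, float("-inf"))
        weights = scores.softmax(dim=-1)

        # Recombine heads and project back to d_model
        out = (weights @ v).transpose(1, 2).reshape(batch, q_len, -1)
        return self.out_proj(out)
```

Self-attention uses the same tensor for query, key, and value (`attn(x, x, x)`), while encoder-decoder attention passes the decoder states as the query and the encoder outputs as key and value.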
You can adjust various hyperparameters in `main.py` to experiment with different model configurations:
- `d_model`: Dimension of the model
- `nhead`: Number of attention heads
- `num_encoder_layers` and `num_decoder_layers`: Number of layers in the encoder and decoder
- `dim_feedforward`: Dimension of the feed-forward network
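These hyperparameter names match the arguments of PyTorch's built-in `torch.nn.Transformer`, so a configuration could plausibly be wired up as sketched below. The numeric values are the illustrative defaults from the original paper, not necessarily the ones set in `main.py`.

```python
import torch.nn as nn

# Illustrative values only; the actual defaults live in main.py.
d_model = 512
nhead = 8
num_encoder_layers = 6
num_decoder_layers = 6
dim_feedforward = 2048

# The same parameter names appear in torch.nn.Transformer, so a model built
# on that module could be configured like this:
model = nn.Transformer(
    d_model=d_model,
    nhead=nhead,
    num_encoder_layers=num_encoder_layers,
    num_decoder_layers=num_decoder_layers,
    dim_feedforward=dim_feedforward,
)
```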
Contributions to improve the model or extend its functionality are welcome. Please feel free to submit pull requests or open issues for any bugs or feature requests.
This project is open-source and available under the MIT License.
- This implementation is inspired by the original "Attention Is All You Need" paper by Vaswani et al.
- Thanks to the PyTorch team for their excellent framework and documentation.