Add a first example for the new Dlib layers to build a transform-type network #3041
This example demonstrates a minimal implementation of a Very Small Language Model (VSLM) using Dlib's Transformer architecture.
The code showcases key features of the new Transformer layers, including attention, positional embeddings, and a classification head, while keeping tokenization simple: each character is treated as a token (sketched below).
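To make the data path concrete, here is a minimal sketch of that character-based tokenization. The `sequence_len` value and the `tokenize` function name are illustrative placeholders, not the identifiers used in the actual example:

```cpp
#include <string>
#include <vector>
#include <dlib/matrix.h>

// Illustrative context window size; not necessarily the value the example uses.
const long sequence_len = 64;

// Slice the corpus into fixed-length windows of byte-level token ids.
std::vector<dlib::matrix<int, 0, 1>> tokenize(const std::string& text)
{
    std::vector<dlib::matrix<int, 0, 1>> windows;
    for (size_t i = 0; i + sequence_len <= text.size(); ++i)
    {
        dlib::matrix<int, 0, 1> w(sequence_len);
        for (long j = 0; j < sequence_len; ++j)
            w(j) = static_cast<unsigned char>(text[i + j]);
        windows.push_back(w);
    }
    return windows;
}
```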
Using Shakespeare's text as training data, the example illustrates both the training process and text generation capabilities, making it an excellent educational tool for understanding Transformer architecture basics.
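As a rough sketch of the training side, the snippet below uses dlib's standard `dnn_trainer`. Here `net_type` stands in for the transformer network defined in the example, and the hyperparameter values are placeholders rather than the ones the example actually uses:

```cpp
#include <dlib/dnn.h>
#include <vector>

// Train on (window, next-character) pairs until the learning rate bottoms out.
template <typename net_type>
void train_model(net_type& net,
                 const std::vector<dlib::matrix<int, 0, 1>>& samples,
                 const std::vector<unsigned long>& labels)
{
    dlib::dnn_trainer<net_type> trainer(net);
    trainer.set_learning_rate(1e-3);     // placeholder hyperparameters
    trainer.set_min_learning_rate(1e-6);
    trainer.set_mini_batch_size(32);
    trainer.be_verbose();
    trainer.train(samples, labels);
}
```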
The implementation is intentionally lightweight, with a small parameter count, so training and generation stay fast while the model still achieves perfect memorization of the training sequences. That behavior demonstrates the effectiveness of attention mechanisms in sequence learning tasks.
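Because the model memorizes its training sequences, greedy decoding is enough to reproduce the source text. A hypothetical generation loop, assuming the network returns the arg-max next-character label when invoked on a single window (as dlib networks with a multiclass classification loss do):

```cpp
#include <string>
#include <dlib/dnn.h>

// Greedy generation: predict the next character, append it, slide the window.
template <typename net_type>
std::string generate(net_type& net, dlib::matrix<int, 0, 1> window, size_t n_chars)
{
    std::string out;
    for (size_t i = 0; i < n_chars; ++i)
    {
        const unsigned long next = net(window);   // predicted next byte class
        out.push_back(static_cast<char>(next));
        for (long j = 0; j + 1 < window.size(); ++j)
            window(j) = window(j + 1);            // shift context left by one
        window(window.size() - 1) = static_cast<int>(next);
    }
    return out;
}
```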