TT-Forge is Tenstorrent's MLIR-based compiler. It integrates with various frameworks and compiler technologies, supporting AI/ML models from domain-specific compilers down to custom kernel generation. We are currently in developer preview, and early adopters are invited to check out what we've built and give it a try.
This repository is the central hub for the TT-Forge compiler project, bringing together its various sub-projects into a cohesive product. Here you'll find releases, demos, model support, roadmaps, and other key resources as the project evolves. While the project is in early developer preview, we'll provide frequent updates to keep you informed on the latest developments. Please file an issue with any questions or feedback you may have.
See the individual front-end documentation in the Front end section to get started running some tests.
- Provide an abstraction over many different front-end frameworks
- Generically compile many kinds of model architectures without modification and with good performance
- Abstract all Tenstorrent device architectures
TT-Forge is composed of various projects, ranging from front ends that support popular third-party AI frameworks, to the MLIR compiler, performance optimizations, and supporting tools. TT-Forge lowers to our tt-metalium project, adding functionality to our AI software ecosystem.
- tt-forge-fe – A TVM-based graph compiler designed to optimize and transform computational graphs for deep learning models. Supports ingestion of PyTorch, ONNX, TensorFlow, PaddlePaddle, and similar ML frameworks via TVM (tt-tvm). See the docs pages for an overview and getting started guide; an illustrative PyTorch sketch follows this list.
- tt-torch – An MLIR-native, open-source front end built on PyTorch 2.X and torch-mlir. It provides StableHLO (SHLO) graphs to tt-mlir. Supports ingestion of PyTorch models via PT2.X compile and ONNX models via torch-mlir (ONNX->SHLO). See the docs pages for an overview and getting started guide; an illustrative torch.compile sketch follows this list.
- tt-xla – Leverages a PJRT interface to integrate JAX (and in the future other frameworks), tt-mlir, and Tenstorrent hardware. Supports ingestion of JAX models via jit compile, providing StableHLO (SHLO) graphs to the tt-mlir compiler. See the README for an overview and getting started guide; an illustrative JAX sketch follows this list.
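To make the tt-forge-fe flow concrete, here is a minimal sketch of compiling a small PyTorch module through the TVM-based front end. The `forge` package name and the `forge.compile` signature are assumptions for illustration; consult the tt-forge-fe docs for the exact API.

```python
import torch
import forge  # tt-forge-fe Python package; name assumed for illustration


class TinyModel(torch.nn.Module):
    """A small PyTorch module to run through the TVM-based front end."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(32, 32)

    def forward(self, x):
        return torch.relu(self.linear(x))


sample_input = torch.randn(1, 32)

# forge.compile traces the module via TVM (tt-tvm) and lowers it toward
# tt-mlir. The exact signature (sample_inputs, etc.) is an assumption here.
compiled_model = forge.compile(TinyModel(), sample_inputs=[sample_input])

# Run the compiled model; outputs mirror the original module's outputs.
output = compiled_model(sample_input)
```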
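For tt-torch, the entry point is the standard PyTorch 2.X compile flow with a Tenstorrent backend. The import path and backend handle below are hypothetical placeholders; check the tt-torch docs for the actual backend registration.

```python
import torch

# Hypothetical import path for the tt-torch Dynamo backend; the real
# registration may differ, see the tt-torch docs.
from tt_torch.dynamo.backend import backend as tt_backend

model = torch.nn.Sequential(
    torch.nn.Linear(32, 32),
    torch.nn.ReLU(),
)

# torch.compile hands the captured graph to tt-torch, which produces a
# StableHLO (SHLO) graph and passes it to tt-mlir.
compiled = torch.compile(model, backend=tt_backend)
out = compiled(torch.randn(1, 32))
```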
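Finally, a sketch of the tt-xla path: once the PJRT plugin is installed and registered, ordinary jit-compiled JAX code is traced to StableHLO and handed to tt-mlir. The plugin discovery step is assumed to happen at install time and is not shown; see the tt-xla README for the exact setup.

```python
import jax
import jax.numpy as jnp


# With the tt-xla PJRT plugin registered, jit-compiled functions are traced
# to StableHLO (SHLO) and compiled through tt-mlir for Tenstorrent hardware.
@jax.jit
def layer(w, x):
    return jax.nn.relu(w @ x)


w = jnp.ones((32, 32))
x = jnp.ones((32, 1))
print(layer(w, x).shape)  # (32, 1)
```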
tt-mlir project
At its core, tt-mlir is our compiler, which interfaces with tt-metalium, our open-source, low-level AI hardware SDK. tt-mlir provides a solution for optimizing machine learning and other compute workloads for all Tenstorrent hardware, bridging the gap between the different ML frameworks and Tenstorrent hardware. tt-mlir is broken into different dialects:
- TTIR Dialect: Our common IR, which can then be lowered into multiple different backends
- TTNN Dialect: Our entry point into the TTNN library of ops
- TTMetalium Dialect: Our entry point into directly accessing tt-metalium kernels
The compiler employs various optimization passes, including layout transformation, operation fusing, decomposition, and sharding, to ensure efficient lowering to the target dialect.
Several tools are built around the compiler:
- ttmlir-opt – This tool is used to run the tt-mlir compiler passes on .mlir source files and is central to developing and testing the compiler.
- ttmlir-translate – Allows us to ingest something (e.g., code) into the MLIR compiler and produce something (e.g., an executable binary, or even code again) from the MLIR compiler.
- ttrt – A standalone runtime tool that can inspect and run compiler executable files without a front end.
- tt-explorer – Provides a “Human-In-Loop” interface so that compiler results can be actively tuned and understood by the person compiling the model.
- ttnn-standalone – A post-compile tuning/debugging tool for C++ TTNN generated code.