diff --git a/.github/FUNDING.yml b/.github/FUNDING.yml deleted file mode 100644 index 08ac23b5fc..0000000000 --- a/.github/FUNDING.yml +++ /dev/null @@ -1 +0,0 @@ -github: [nathanielsimard] diff --git a/README.md b/README.md index 7ca5fb3e58..f3cf72ddc1 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,5 @@
- + [![Discord](https://img.shields.io/discord/1038839012602941528.svg?color=7289da&&logo=discord)](https://discord.gg/uPEBbYYDB6) [![Current Crates.io Version](https://img.shields.io/crates/v/burn.svg)](https://crates.io/crates/burn) @@ -9,46 +9,417 @@ [![Rust Version](https://img.shields.io/badge/Rust-1.71.0+-blue)](https://releases.rs/docs/1.71.0) ![license](https://shields.io/badge/license-MIT%2FApache--2.0-blue) -This library strives to serve as a comprehensive **deep learning framework**, offering exceptional -flexibility and written in Rust. Our objective is to cater to both researchers and practitioners by -simplifying the process of experimenting, training, and deploying models. +--- + +**Burn is a new comprehensive dynamic Deep Learning Framework built using Rust
with extreme flexibility, compute efficiency and portability as its primary goals.** + +
+
-## Features +## Performance -- Customizable, intuitive and user-friendly neural network [module](https://burn-rs.github.io/book/building-blocks/module.html) 🔥 -- Comprehensive [training](https://burn-rs.github.io/book/building-blocks/learner.html) tools, including `metrics`, `logging`, and `checkpointing` - 📈 -- Versatile [Tensor](https://burn-rs.github.io/book/building-blocks/tensor.html) crate equipped with pluggable backends 🔧 - - [Torch](https://github.com/burn-rs/burn/tree/main/burn-tch) backend, supporting both CPU and GPU - 🚀 - - [Ndarray](https://github.com/burn-rs/burn/tree/main/burn-ndarray) backend with - [`no_std`](#support-for-no_std) compatibility, ensuring universal platform adaptability 👌 - - [WebGPU](https://github.com/burn-rs/burn/tree/main/burn-wgpu) backend, offering cross-platform, - browser-inclusive, GPU-based computations 🌐 - - [Candle](https://github.com/burn-rs/burn/tree/main/burn-candle) backend 🕯️ - - [Autodiff](https://github.com/burn-rs/burn/tree/main/burn-autodiff) backend that enables - differentiability across all backends 🌟 -- [Dataset](https://github.com/burn-rs/burn/tree/main/burn-dataset) crate containing a diverse range - of utilities and sources 📚 -- [Import](https://github.com/burn-rs/burn/tree/main/burn-import) crate that simplifies the - integration of pretrained models 📦 +
+ -## Get Started +Because we believe the goal of a deep learning framework is to convert computation into useful intelligence, we have made performance a core pillar of Burn. +We strive to achieve top efficiency by leveraging multiple optimization techniques described below. -### The Burn Book 🔥 +**Click on each section for more details** 👇 -To begin working effectively with `burn`, it is crucial to understand its key components and philosophy. -For detailed examples and explanations covering every facet of the framework, please refer to [The Burn Book 🔥](https://burn-rs.github.io/book/). +
-### Pre-trained Models +
-We keep an updated and curated list of models and examples built with Burn, see the [burn-rs/models](https://github.com/burn-rs/models) repository for more details. +
+ +Automatic kernel fusion 💥 + +
-### Examples +Using Burn means having your models optimized on any backend. +When possible, we provide a way to automatically and dynamically create custom kernels that minimize data relocation between different memory spaces, which is extremely useful when moving memory is the bottleneck. -Here is a code snippet showing how intuitive the framework is to use, where we declare a position-wise feed-forward module along with its forward pass. +As an example, you could write your own GELU activation function with the high-level tensor API (see Rust code snippet below). + +```rust +fn gelu_custom<B: Backend, const D: usize>(x: Tensor<B, D>) -> Tensor<B, D> { + let x = x.clone() * ((x / SQRT_2).erf() + 1); + x / 2 +} +``` + +Then, at runtime, a custom low-level kernel will be automatically created for your specific implementation and will rival a handcrafted GPU implementation. The kernel consists of about 60 lines of WGSL [WebGPU Shading Language](https://www.w3.org/TR/WGSL/), an extremely verbose lower-level shader language you probably don't want to program your deep learning models in! + +> As of now, our fusion strategy is only implemented for our own WGPU backend and supports only a subset of operations. +> We plan to add more operations very soon and extend this technique to other future in-house backends. + +
+ +
+ +Asynchronous execution ❤️‍🔥 + +
+ +For [backends developed from scratch by the Burn team](#backends), an asynchronous execution style is used, which makes various optimizations possible, such as the previously mentioned automatic kernel fusion. + +Asynchronous execution also ensures that the normal execution of the framework does not block the model computations, which implies that the framework overhead won't impact the speed of execution significantly. +Conversely, the intense computations in the model do not interfere with the responsiveness of the framework. +For more information about our asynchronous backends, see [this blog post](https://burn.dev/blog/creating-high-performance-asynchronous-backends-with-burn-compute). + +
+ +
+ +Thread-safe building blocks 🦞 + +
+ +Burn emphasizes thread safety by leveraging the [ownership system of Rust](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html). +With Burn, each module is the owner of its weights. It is therefore possible to send a module to another thread for computing the gradients, then send the gradients to the main thread that can aggregate them, and _voilà_, you get multi-device training. + +This is a very different approach from what PyTorch does, where backpropagation actually mutates the _grad_ attribute of each tensor parameter. +This is not a thread-safe operation and therefore requires lower level synchronization primitives, see [distributed training](https://pytorch.org/docs/stable/distributed.html) for reference. +Note that this is still very fast, but not compatible across different backends and quite hard to implement. + +
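The ownership-based pattern described above can be sketched with the standard library alone. This is a minimal illustration, not Burn's API: `TinyModule`, `gradients`, and `averaged_gradients` are hypothetical names, and the "model" is a single dot product with a squared-error loss. Each worker thread owns its copy of the module, computes gradients, and sends them back over a channel for aggregation.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a module that owns its weights (not a Burn type).
#[derive(Clone, Debug)]
pub struct TinyModule {
    pub weights: Vec<f32>,
}

impl TinyModule {
    /// Gradient of the loss 0.5 * (w . x - target)^2 with respect to the weights.
    pub fn gradients(&self, input: &[f32], target: f32) -> Vec<f32> {
        let prediction: f32 = self.weights.iter().zip(input).map(|(w, x)| w * x).sum();
        let error = prediction - target;
        input.iter().map(|x| error * x).collect()
    }
}

/// Moves one owned copy of the module into each worker thread, then averages
/// the gradients that come back over a channel.
pub fn averaged_gradients(module: &TinyModule, batches: Vec<(Vec<f32>, f32)>) -> Vec<f32> {
    let (sender, receiver) = mpsc::channel();
    let workers = batches.len();

    for (input, target) in batches {
        let local = module.clone(); // each thread owns its own module
        let sender = sender.clone();
        thread::spawn(move || {
            sender
                .send(local.gradients(&input, target))
                .expect("receiver alive");
        });
    }
    drop(sender); // the channel closes once every worker has dropped its sender

    // Aggregate on the calling thread; no locks or shared mutable state needed.
    let mut total = vec![0.0; module.weights.len()];
    for grads in receiver {
        for (sum, g) in total.iter_mut().zip(grads) {
            *sum += g;
        }
    }
    total.iter().map(|sum| sum / workers as f32).collect()
}
```

Because every value is either moved into a thread or sent back through the channel, the compiler rejects any attempt to mutate gradients from two threads at once, which is the property the paragraph above relies on.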
+ +
+ +Intelligent memory management 🦀 + +
+ +One of the main roles of a deep learning framework is to reduce the amount of memory necessary to run models. +The naive way of handling memory is that each tensor has its own memory space, which is allocated when the tensor is created then deallocated as the tensor goes out of scope. +However, allocating and deallocating data is very costly, so a memory pool is often required to achieve good throughput. +Burn offers an infrastructure that allows for easily creating and selecting memory management strategies for backends. +For more details on memory management in Burn, see [this blog post](https://burn.dev/blog/creating-high-performance-asynchronous-backends-with-burn-compute). + +Another very important memory optimization of Burn is that we keep track of when a tensor can be mutated in-place just by using the ownership system well. +Even though it is a rather small memory optimization on its own, it adds up considerably when training or running inference with larger models and helps reduce memory usage even further. +For more information, see [this blog post about tensor handling](https://burn.dev/blog/burn-rusty-approach-to-tensor-handling). + +
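The in-place idea above can be illustrated with reference counting from the standard library. This is a sketch under stated assumptions, not Burn's implementation: `ToyTensor` and `add_scalar` are hypothetical, and the rule shown is simply "reuse the buffer when this handle is the only owner, otherwise allocate".

```rust
use std::sync::Arc;

/// Hypothetical tensor handle (not a Burn type): the buffer is shared
/// through reference counting.
#[derive(Clone, Debug)]
pub struct ToyTensor {
    data: Arc<Vec<f32>>,
}

impl ToyTensor {
    pub fn new(values: Vec<f32>) -> Self {
        Self { data: Arc::new(values) }
    }

    pub fn values(&self) -> &[f32] {
        &self.data
    }

    /// Adds a scalar, consuming `self`. Returns the result and a flag that
    /// tells whether the original data buffer was reused.
    pub fn add_scalar(self, rhs: f32) -> (Self, bool) {
        match Arc::try_unwrap(self.data) {
            // No other handle exists: mutate the existing buffer in place.
            Ok(mut buffer) => {
                buffer.iter_mut().for_each(|v| *v += rhs);
                (Self::new(buffer), true)
            }
            // The buffer is still shared: leave it intact and allocate a new one.
            Err(shared) => {
                let buffer: Vec<f32> = shared.iter().map(|v| v + rhs).collect();
                (Self::new(buffer), false)
            }
        }
    }
}
```

Taking `self` by value is the design choice that matters here: the caller has to write `x.clone()` explicitly to keep the old tensor alive, so sharing is always visible in the code and the in-place path is taken whenever no clone exists.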
+ +
+ +Automatic kernel selection 🎯 + +
+ +A good deep learning framework should ensure that models run smoothly on all hardware. +However, not all hardware shares the same behavior in terms of execution speed. +For instance, a matrix multiplication kernel can be launched with many different parameters, which are highly sensitive to the size of the matrices and the hardware. +Using the wrong configuration could reduce the speed of execution by a large factor (10 times or even more in extreme cases), so choosing the right kernels becomes a priority. + +With our home-made backends, we run benchmarks automatically and choose the best configuration for the current hardware and matrix sizes with a reasonable caching strategy. + +This adds a small overhead by increasing the warmup execution time, but stabilizes quickly after a few forward and backward passes, saving lots of time in the long run. +Note that this feature isn't mandatory, and can be disabled when cold starts are a priority over optimized throughput. + +
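The benchmark-then-cache strategy described above can be sketched as follows. Everything here is an assumption made for illustration: `Autotuner`, `select`, and the two toy "kernels" are hypothetical, the cache key is a plain string standing in for hardware and shape information, and a single timed run stands in for a real benchmark.

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical autotuner (not Burn's implementation): remembers the index of
/// the fastest candidate for each key.
#[derive(Default)]
pub struct Autotuner {
    cache: HashMap<String, usize>,
    pub benchmarks_run: usize,
}

impl Autotuner {
    /// Returns the index of the fastest candidate for `key`. Candidates are
    /// only benchmarked the first time a key is seen.
    pub fn select(
        &mut self,
        key: &str,
        candidates: &[fn(&[f32]) -> f32],
        sample: &[f32],
    ) -> usize {
        if let Some(&best) = self.cache.get(key) {
            return best; // warm path: no benchmarking at all
        }
        let mut best = 0;
        let mut best_time = Duration::MAX;
        for (index, kernel) in candidates.iter().enumerate() {
            let start = Instant::now();
            black_box(kernel(black_box(sample)));
            let elapsed = start.elapsed();
            self.benchmarks_run += 1;
            if elapsed < best_time {
                best = index;
                best_time = elapsed;
            }
        }
        self.cache.insert(key.to_string(), best);
        best
    }
}

/// Two toy "kernels" computing the same sum with different strategies.
pub fn sum_iter(values: &[f32]) -> f32 {
    values.iter().sum()
}

pub fn sum_chunked(values: &[f32]) -> f32 {
    values.chunks(4).map(|c| c.iter().sum::<f32>()).sum()
}
```

The first call for a given key pays the warmup cost of running every candidate; later calls with the same key return the cached choice immediately, which mirrors the overhead profile described above.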
+ +
+ +Hardware specific features 🔥 + +
+ +It is no secret that deep learning relies mostly on matrix multiplication as its core operation, since this is how fully-connected neural networks are modeled. + +More and more, hardware manufacturers optimize their chips specifically for matrix multiplication workloads. +For instance, Nvidia has its _Tensor Cores_ and today most cellphones have AI specialized chips. +As of this moment, we support Tensor Cores with our LibTorch and Candle backends, but not other accelerators yet. +We hope [this issue](https://github.com/gpuweb/gpuweb/issues/4195) gets resolved at some point to bring support to our WGPU backend. + +
+ +
+ +Custom Backend Extension 🎒 + +
+ +Burn aims to be the most flexible deep learning framework. +While it's crucial to maintain compatibility with a wide variety of backends, Burn also provides the ability to extend the functionalities of a backend implementation to suit your personal modeling requirements. + +This versatility is advantageous in numerous ways, such as supporting custom operations like flash attention or manually writing your own kernel for a specific backend to enhance performance. +See [this section](https://burn.dev/book/advanced/backend-extension/index.html) in the Burn Book 🔥 for more details. + +
+ +
+ +## Training & Inference + +
+ + +The whole deep learning workflow is made easy with Burn, as you can monitor your training progress with an ergonomic dashboard, and run inference everywhere from embedded devices to large GPU clusters. + +Burn was built from the ground up with training and inference in mind. It's also worth noting how Burn, in comparison to frameworks like PyTorch, simplifies the transition from training to deployment, eliminating the need for code changes. + +
+ +
+ +
+ + + Burn Train TUI + +
+ +
+ +**Click on the following sections to expand 👇** + +
+ +Training Dashboard 📈 + +
+ +As you can see in the previous video (click on the picture!), a new terminal UI dashboard based on the [Ratatui](https://github.com/ratatui-org/ratatui) crate allows users to follow their training with ease without having to connect to any external application. + +You can visualize your training and validation metrics updating in real-time and analyze the lifelong progression or recent history of any registered metrics using only the arrow keys. +Break from the training loop without crashing, allowing potential checkpoints to be fully written or important pieces of code to complete without interruption 🛡 + +
+ +
+ +ONNX Support 🍬 + +
+ +ONNX (Open Neural Network Exchange) is an open-standard format that exports both the architecture and the weights of a deep learning model. + +Burn supports the importation of models that follow the ONNX standard so you can easily port a model you have written in another framework like TensorFlow or PyTorch to Burn to benefit from all the advantages our framework offers. + +Our ONNX support is further described in [this section of the Burn Book 🔥](https://burn.dev/book/import/onnx-model.html). + +> **Note**: This crate is in active development and currently supports a +> [limited set of ONNX operators](./burn-import/SUPPORTED-ONNX-OPS.md). + +
+ +
+ +Inference in the Browser 🌐 + +
+ +Several of our backends can compile to Web Assembly: Candle and NdArray for CPU, and WGPU for GPU. This means that you can run inference directly within a browser. +We provide several examples of this: + +- [MNIST](./examples/mnist-inference-web) where you can draw digits and a small convnet tries to find which one it is! 2️⃣ 7️⃣ 😰 +- [Image Classification](./examples/image-classification-web) where you can upload images and classify them! 🌄 + +
+ +
+ +Embedded: no_std support ⚙️ + +
+ +Burn's core components support [no_std](https://docs.rust-embedded.org/book/intro/no-std.html). This means it can run in bare-metal environments such as embedded devices without an operating system. + +> As of now, only the NdArray backend can be used in a _no_std_ environment. + +
+ +
+ +## Backends + +
+ +Burn strives to be as fast as possible on as much hardware as possible, with robust implementations. +We believe this flexibility is crucial for modern needs where you may train your models in the cloud, then deploy on customer hardware, which varies from user to user. +
+ +
+ +Compared to other frameworks, Burn has a very different approach to supporting many backends. +By design, most code is generic over the Backend trait, which allows us to build Burn with swappable backends. +This makes composing backends possible, augmenting them with additional functionalities such as autodifferentiation and automatic kernel fusion. + +**We already have many backends implemented, all listed below 👇** + +
+ +WGPU (WebGPU): Cross-Platform GPU Backend 🌐 + +
+ +**The go-to backend for running on any GPU.** + +Based on the most popular and well-supported Rust graphics library, [WGPU](https://wgpu.rs), this backend automatically targets Vulkan, OpenGL, Metal, DirectX 11/12, and WebGPU, by using the WebGPU shading language [WGSL](https://www.w3.org/TR/WGSL/). +It can also be compiled to Web Assembly to run in the browser while leveraging the GPU, see [this demo](https://antimora.github.io/image-classification/). +For more information on the benefits of this backend, see [this blog post](https://burn.dev/blog/cross-platform-gpu-backend). + +The WGPU backend is our first "in-house backend", which means we have complete control over its implementation details. +It is fully optimized with the [performance characteristics mentioned earlier](#performance), as it serves as our research playground for a variety of optimizations. + +See the [WGPU Backend README](./burn-wgpu/README.md) for more details. + +
+ +
+ +Candle: Backend using the Candle bindings 🕯 + +
+ +Based on [Candle by Hugging Face](https://github.com/huggingface/candle), a minimalist ML framework for Rust with a focus on performance and ease of use, this backend can run on CPU with support for Web Assembly or on Nvidia GPUs using CUDA. + +See the [Candle Backend README](./burn-candle/README.md) for more details. + +> _Disclaimer:_ This backend is not fully completed yet, but can work in some contexts like inference. + +
+ +
+ +LibTorch: Backend using the LibTorch bindings 🎆 + +
+ +PyTorch doesn't need an introduction in the realm of deep learning. +This backend leverages [PyTorch Rust bindings](https://github.com/LaurentMazare/tch-rs), enabling you to use LibTorch C++ kernels on CPU, CUDA and Metal. + +See the [LibTorch Backend README](./burn-tch/README.md) for more details. + +
+ +
+ +NdArray: Backend using the NdArray primitive as data structure 🦐 + +
+ +This CPU backend is admittedly not our fastest backend, but offers extreme portability. + +It is our only backend supporting _no_std_. + +See the [NdArray Backend README](./burn-ndarray/README.md) for more details. + +
+ +
+ +Autodiff: Backend decorator that brings backpropagation to any backend 🔄 + +
+ +Contrary to the aforementioned backends, Autodiff is actually a backend _decorator_. +This means that it cannot exist by itself; it must encapsulate another backend. + +The simple act of wrapping a base backend with Autodiff transparently equips it with autodifferentiation support, making it possible to call backward on your model. + +```rust +use burn::backend::{Autodiff, Wgpu}; +use burn::tensor::{Distribution, Tensor}; + +fn main() { + type Backend = Autodiff<Wgpu>; + + let x: Tensor<Backend, 2> = Tensor::random([32, 32], Distribution::Default); + let y: Tensor<Backend, 2> = Tensor::random([32, 32], Distribution::Default).require_grad(); + + let tmp = x.clone() + y.clone(); + let tmp = tmp.matmul(x); + let tmp = tmp.exp(); + + let grads = tmp.backward(); + let y_grad = y.grad(&grads).unwrap(); + println!("{y_grad}"); +} +``` + +Of note, it is impossible to make the mistake of calling backward on a model that runs on a backend that does not support autodiff (for inference), as this method is only offered by an Autodiff backend. + +See the [Autodiff Backend README](./burn-autodiff/README.md) for more details. + +
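The decorator idea can be shown in isolation with plain traits. The sketch below is not Burn's trait hierarchy: `Backend`, `Cpu`, `Autodiff`, `AutodiffBackend`, and `value_and_grad` are simplified, hypothetical definitions that only model one operation (squaring a scalar) to show how the type system restricts gradient calls to wrapped backends.

```rust
use std::marker::PhantomData;

/// Hypothetical minimal backend interface (a stand-in, not Burn's trait).
pub trait Backend {
    fn name() -> String;
    fn square(x: f32) -> f32;
}

/// A plain backend with no differentiation support.
pub struct Cpu;

impl Backend for Cpu {
    fn name() -> String {
        "cpu".to_string()
    }
    fn square(x: f32) -> f32 {
        x * x
    }
}

/// Decorator: it has no kernels of its own and must wrap another backend.
pub struct Autodiff<B: Backend>(PhantomData<B>);

impl<B: Backend> Backend for Autodiff<B> {
    fn name() -> String {
        format!("autodiff<{}>", B::name())
    }
    fn square(x: f32) -> f32 {
        B::square(x) // the forward pass is delegated to the inner backend
    }
}

/// Only decorated backends expose a derivative.
pub trait AutodiffBackend: Backend {
    fn square_grad(x: f32) -> f32;
}

impl<B: Backend> AutodiffBackend for Autodiff<B> {
    fn square_grad(x: f32) -> f32 {
        2.0 * x
    }
}

/// Code that needs gradients says so in its bounds, so instantiating it with
/// plain `Cpu` is rejected at compile time.
pub fn value_and_grad<B: AutodiffBackend>(x: f32) -> (f32, f32) {
    (B::square(x), B::square_grad(x))
}
```

With these definitions, `value_and_grad::<Autodiff<Cpu>>(3.0)` compiles, while `value_and_grad::<Cpu>(3.0)` fails with a trait-bound error, which is the compile-time guarantee described above.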
+ +
+ +Fusion: Backend decorator that brings kernel fusion to backends that support it 💥 + +
+ +This backend decorator enhances a backend with kernel fusion, provided that the inner backend supports it. +Note that you can compose this backend with other backend decorators such as Autodiff. +For now, only the WGPU backend has support for fused kernels. + +```rust +use burn::backend::{Autodiff, Fusion, Wgpu}; +use burn::tensor::{Distribution, Tensor}; + +fn main() { + type Backend = Autodiff<Fusion<Wgpu>>; + + let x: Tensor<Backend, 2> = Tensor::random([32, 32], Distribution::Default); + let y: Tensor<Backend, 2> = Tensor::random([32, 32], Distribution::Default).require_grad(); + + let tmp = x.clone() + y.clone(); + let tmp = tmp.matmul(x); + let tmp = tmp.exp(); + + let grads = tmp.backward(); + let y_grad = y.grad(&grads).unwrap(); + println!("{y_grad}"); +} + +``` + +Of note, we plan to implement automatic gradient checkpointing based on compute bound and memory bound operations, which will work gracefully with the fusion backend to make your code run even faster during training, see [this issue](https://github.com/burn-rs/burn/issues/936). + +See the [Fusion Backend README](./burn-fusion/README.md) for more details. + +
+ +
+ +## Getting Started + +
+ + +Just heard of Burn? You are at the right place! Just continue reading this section and we hope you can get on board really quickly. + +
+ +
+ +The Burn Book 🔥 + +
+ +To begin working effectively with Burn, it is crucial to understand its key components and philosophy. +This is why we highly recommend new users to read the first sections of [The Burn Book 🔥](https://burn.dev/book/). +It provides detailed examples and explanations covering every facet of the framework, including building blocks like tensors, modules, and optimizers, all the way to advanced usage, like coding your own GPU kernels. + +> The project is constantly evolving, and we try as much as possible to keep the book up to date with new additions. +> However, we might miss some details sometimes, so if you see something weird, let us know! +> We also gladly accept Pull Requests 😄 + +
+ +
+ +Examples 🙏 + +
+ +Let's start with a code snippet that shows how intuitive the framework is to use! +In the following, we declare a neural network module with some parameters along with its forward pass. ```rust use burn::nn; @@ -57,10 +428,10 @@ use burn::tensor::backend::Backend; #[derive(Module, Debug)] pub struct PositionWiseFeedForward<B: Backend> { - linear_inner: Linear<B>, - linear_outer: Linear<B>, - dropout: Dropout, - gelu: GELU, + linear_inner: nn::Linear<B>, + linear_outer: nn::Linear<B>, + dropout: nn::Dropout, + gelu: nn::GELU, } impl<B: Backend> PositionWiseFeedForward<B> { @@ -74,89 +445,79 @@ impl<B: Backend> PositionWiseFeedForward<B> { } ``` -For more practical insights, you can clone the repository and experiment with the following examples: +We have a fairly large number of [examples](./examples) in the repository that show how to use the framework in different scenarios. +For more practical insights, you can clone the repository and run any of them directly on your computer! -- [MNIST](https://github.com/burn-rs/burn/tree/main/examples/mnist) train a model on CPU/GPU using - different backends. -- [MNIST Inference Web](https://github.com/burn-rs/burn/tree/main/examples/mnist-inference-web) run - trained model in the browser for inference. -- [Text Classification](https://github.com/burn-rs/burn/tree/main/examples/text-classification) - train a transformer encoder from scratch on GPU. -- [Text Generation](https://github.com/burn-rs/burn/tree/main/examples/text-generation) train an - autoregressive transformer from scratch on GPU. +
-## Supported Platforms +
+ +Pre-trained Models 🤖 + +
-### [Burn-ndarray][1] Backend +We keep an updated and curated list of models and examples built with Burn, see the [burn-rs/models repository](https://github.com/burn-rs/models) for more details. -| Option | CPU | GPU | Linux | MacOS | Windows | Android | iOS | WASM | -| :--------- | :-: | :-: | :---: | :---: | :-----: | :-----: | :-: | :--: | -| Pure Rust | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | -| Accelerate | Yes | No | No | Yes | No | No | Yes | No | -| Netlib | Yes | No | Yes | Yes | Yes | No | No | No | -| Openblas | Yes | No | Yes | Yes | Yes | Yes | Yes | No | +Don't see the model you want? Don't hesitate to open an issue, and we may prioritize it. +Built a model using Burn and want to share it? +You can also open a Pull Request and add your model under the community section! -### [Burn-tch][2] Backend +
-| Option | CPU | GPU | Linux | MacOS | Windows | Android | iOS | WASM | -| :----- | :-: | :-: | :---: | :---: | :-----: | :-----: | :-: | :--: | -| CPU | Yes | No | Yes | Yes | Yes | Yes | Yes | No | -| CUDA | No | Yes | Yes | No | Yes | No | No | No | -| MPS | No | Yes | No | Yes | No | No | No | No | -| Vulkan | Yes | Yes | Yes | Yes | Yes | Yes | No | No | +
+ +Why use Rust for Deep Learning? 🦀 + +
-### [Burn-wgpu][3] Backend +Deep Learning is a special form of software where you need very high-level abstractions as well as extremely fast execution time. +Rust is the perfect candidate for that use case since it provides zero-cost abstractions to easily create neural network modules, and fine-grained control over memory to optimize every detail. -| Option | CPU | GPU | Linux | MacOS | Windows | Android | iOS | WASM | -| :-------- | :-: | :-: | :---: | :---: | :-----: | :-----: | :-: | :--: | -| Metal | No | Yes | No | Yes | No | No | Yes | No | -| Vulkan | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | -| OpenGL | No | Yes | Yes | Yes | Yes | Yes | Yes | No | -| WebGpu | No | Yes | No | No | No | No | No | Yes | -| Dx11/Dx12 | No | Yes | No | No | Yes | No | No | No | +It's important that a framework be easy to use at a high level so that its users can concentrate on innovating in the AI field. +However, since running models relies so heavily on computations, performance can't be neglected. -[1]: https://github.com/burn-rs/burn/tree/main/burn-ndarray -[2]: https://github.com/burn-rs/burn/tree/main/burn-tch -[3]: https://github.com/burn-rs/burn/tree/main/burn-wgpu +To this day, the mainstream solution to this problem has been to offer APIs in Python, but rely on bindings to low-level languages such as C/C++. +This reduces portability, increases complexity and creates friction between researchers and engineers. +We feel like Rust's approach to abstractions makes it versatile enough to tackle this two-language dichotomy. -## Support for `no_std` +Rust also comes with the Cargo package manager, which makes it incredibly easy to build, test, and deploy from any environment, which is usually a pain in Python. -Burn, including its `burn-ndarray` backend, can work in a `no_std` environment, provided `alloc` is -available for the inference mode. 
To accomplish this, simply turn off the default features in `burn` -and `burn-ndarray` (which is the minimum requirement for running the inference mode). You can find a -reference example in -[burn-no-std-tests](https://github.com/burn-rs/burn/tree/main/burn-no-std-tests). +Although Rust has the reputation of being a difficult language at first, we strongly believe it leads to more reliable, bug-free solutions built faster (after some practice 😅)! -The `burn-core` and `burn-tensor` crates also support `no_std` with `alloc`. These crates can be -directly added as dependencies if necessary, as they are reexported by the `burn` crate. +
-Please be aware that when using the `no_std` mode, a random seed will be generated at build time if -one hasn't been set using the `Backend::seed` method. Also, the -[spin::mutex::Mutex](https://docs.rs/spin/latest/spin/mutex/struct.Mutex.html) is used instead of -[std::sync::Mutex](https://doc.rust-lang.org/std/sync/struct.Mutex.html) in this mode. +
-## Contributing +## Community -Before contributing, please take a moment to review our -[code of conduct](https://github.com/burn-rs/burn/tree/main/CODE-OF-CONDUCT.md). It's also highly -recommended to read our -[architecture document](https://github.com/burn-rs/burn/tree/main/ARCHITECTURE.md), which explains -our architectural decisions. Please see more details in our [contributing guide](/CONTRIBUTING.md). +
+ -## Disclaimer +If you are excited about the project, don't hesitate to join our [Discord](https://discord.gg/PbjzCPfs)! +We try to be as welcoming as possible to everybody from any background. +You can ask your questions and share what you built with the community! -Burn is currently in active development, and there will be breaking changes. While any resulting -issues are likely to be easy to fix, there are no guarantees at this stage. +
-## Sponsors +
-Thanks to all current sponsors 🙏. +**Contributing** -smallstepman -premAI-io +Before contributing, please take a moment to review our +[code of conduct](https://github.com/burn-rs/burn/tree/main/CODE-OF-CONDUCT.md). +It's also highly recommended to read our +[architecture document](https://github.com/burn-rs/burn/tree/main/ARCHITECTURE.md), which explains some of our architectural decisions. +Refer to our [contributing guide](/CONTRIBUTING.md) for more details. + +## Status + +Burn is currently in active development, and there will be breaking changes. +While any resulting issues are likely to be easy to fix, there are no guarantees at this stage. ## License Burn is distributed under the terms of both the MIT license and the Apache License (Version 2.0). See [LICENSE-APACHE](./LICENSE-APACHE) and [LICENSE-MIT](./LICENSE-MIT) for details. Opening a pull request is assumed to signal agreement with these licensing terms. + +
diff --git a/_typos.toml b/_typos.toml index 6c3ee82a6c..0d002ff695 100644 --- a/_typos.toml +++ b/_typos.toml @@ -1,5 +1,5 @@ [default] -extend-ignore-identifiers-re = ["ratatui", "NdArray*", "ND"] +extend-ignore-identifiers-re = ["ratatui", "Ratatui", "NdArray*", "ND"] [files] extend-exclude = [ diff --git a/assets/backend-chip.png b/assets/backend-chip.png new file mode 100644 index 0000000000..c0282aead3 Binary files /dev/null and b/assets/backend-chip.png differ diff --git a/assets/burn-train-tui.png b/assets/burn-train-tui.png new file mode 100644 index 0000000000..817e21b008 Binary files /dev/null and b/assets/burn-train-tui.png differ diff --git a/assets/ember-blazingly-fast.png b/assets/ember-blazingly-fast.png new file mode 100644 index 0000000000..1b3f751d5c Binary files /dev/null and b/assets/ember-blazingly-fast.png differ diff --git a/assets/ember-community.png b/assets/ember-community.png new file mode 100644 index 0000000000..9be467c502 Binary files /dev/null and b/assets/ember-community.png differ diff --git a/assets/ember-walking.png b/assets/ember-walking.png new file mode 100644 index 0000000000..5ca8b8b7ab Binary files /dev/null and b/assets/ember-walking.png differ diff --git a/assets/ember-wall.png b/assets/ember-wall.png new file mode 100644 index 0000000000..8f8f8a8457 Binary files /dev/null and b/assets/ember-wall.png differ diff --git a/assets/logo-burn-neutral.webp b/assets/logo-burn-neutral.webp new file mode 100644 index 0000000000..806e91cf67 Binary files /dev/null and b/assets/logo-burn-neutral.webp differ diff --git a/burn-ndarray/README.md b/burn-ndarray/README.md index b15ac7e5f5..61e8f75d07 100644 --- a/burn-ndarray/README.md +++ b/burn-ndarray/README.md @@ -19,3 +19,12 @@ The following flags support various BLAS options: Note, under the `no_std` mode, a random seed is generated during the build time if the seed is not initialized by by `Backend::seed` method. 
+ +### Platform Support + +| Option | CPU | GPU | Linux | MacOS | Windows | Android | iOS | WASM | +| :--------- | :-: | :-: | :---: | :---: | :-----: | :-----: | :-: | :--: | +| Pure Rust | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | +| Accelerate | Yes | No | No | Yes | No | No | Yes | No | +| Netlib | Yes | No | Yes | Yes | Yes | No | No | No | +| Openblas | Yes | No | Yes | Yes | Yes | Yes | Yes | No | diff --git a/burn-tch/README.md b/burn-tch/README.md index 22d72fe9a3..0def647faf 100644 --- a/burn-tch/README.md +++ b/burn-tch/README.md @@ -43,3 +43,12 @@ mod tch_cpu { } } ``` + +### Platform Support + +| Option | CPU | GPU | Linux | MacOS | Windows | Android | iOS | WASM | +| :----- | :-: | :-: | :---: | :---: | :-----: | :-----: | :-: | :--: | +| CPU | Yes | No | Yes | Yes | Yes | Yes | Yes | No | +| CUDA | No | Yes | Yes | No | Yes | No | No | No | +| MPS | No | Yes | No | Yes | No | No | No | No | +| Vulkan | Yes | Yes | Yes | Yes | Yes | Yes | No | No | diff --git a/burn-wgpu/README.md b/burn-wgpu/README.md index 7cfeb09f50..66d71fd7e1 100644 --- a/burn-wgpu/README.md +++ b/burn-wgpu/README.md @@ -6,7 +6,7 @@ [![license](https://shields.io/badge/license-MIT%2FApache--2.0-blue)](https://github.com/burn-rs/burn-wgpu/blob/master/README.md) This crate provides a WGPU backend for [Burn](https://github.com/burn-rs/burn) using the -[wgpu](https://github.com/gfx-rs/wgpu). +[wgpu](https://github.com/gfx-rs/wgpu). The backend supports Vulkan, Metal, DirectX11/12, OpenGL, WebGPU. @@ -29,3 +29,13 @@ mod wgpu { ## Configuration You can set `BURN_WGPU_MAX_TASKS` to a positive integer that determines how many computing tasks are submitted in batches to the graphics API. 
+ +## Platform Support + +| Option | CPU | GPU | Linux | MacOS | Windows | Android | iOS | WASM | +| :-------- | :-: | :-: | :---: | :---: | :-----: | :-----: | :-: | :--: | +| Metal | No | Yes | No | Yes | No | No | Yes | No | +| Vulkan | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | +| OpenGL | No | Yes | Yes | Yes | Yes | Yes | Yes | No | +| WebGpu | No | Yes | No | No | No | No | No | Yes | +| Dx11/Dx12 | No | Yes | No | No | Yes | No | No | No |