Wav2letter is a convolutional speech recognition neural network. This implementation was created by Arm, pruned to 50% sparsity, fine-tuned, and quantized using the TensorFlow Model Optimization Toolkit.
Apache-2.0
The class labels associated with this model can be downloaded by running the script get_class_labels.sh.
Code to recreate this model can be found here.
| Platform | Optimized |
|----------|-----------|
| Cortex-A | ✔️ |
| Cortex-M | ✔️ |
| Mali GPU | ✔️ |
| Ethos U | ✔️ |

- ✔️ - Will run on this platform.
- ✖️ - Will not run on this platform.
Dataset: LibriSpeech

| Optimization | Value |
|--------------|-------|
| Quantization | INT8 |
| Sparsity | 50% |

| Input Node Name | Shape | Description |
|-----------------|-------|-------------|
| input_4 | (1, 296, 39) | Speech converted to MFCCs and quantized to INT8 |

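
The input description says the MFCC features must be quantized to INT8 before inference. A minimal sketch of the usual affine quantization step follows, in pure Python so that it runs without dependencies. The `scale` and `zero_point` values below are hypothetical placeholders; the real values are an assumption to be checked against the model file itself (for example, the `quantization` entry returned by the TensorFlow Lite interpreter's `get_input_details()`).

```python
def quantize_to_int8(values, scale, zero_point):
    """Affine-quantize floats to INT8: q = round(x / scale) + zero_point, clamped."""
    quantized = []
    for x in values:
        # Note: Python's round() rounds exact ties to the nearest even integer,
        # which may differ from the tie-breaking used by other toolchains.
        q = round(x / scale) + zero_point
        quantized.append(max(-128, min(127, q)))
    return quantized


# Hypothetical quantization parameters, for illustration only.
scale, zero_point = 0.5, -10

# A few made-up MFCC coefficients; a real input has 296 frames of 39 values.
mfcc_values = [0.0, 1.0, -3.2, 100.0]
print(quantize_to_int8(mfcc_values, scale, zero_point))
```

Values that fall outside the representable range saturate at -128 or 127 rather than wrapping around.
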
| Output Node Name | Shape | Description |
|------------------|-------|-------------|
| Identity | (1, 1, 148, 29) | A tensor of (batch, time, class probabilities) that represents the probability of each class at each timestep. Should be passed to a decoder, e.g. ctc_beam_search_decoder. |
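
To illustrate what the decoding step does with the per-timestep class scores, here is a minimal sketch of greedy (best-path) CTC decoding. This is a simpler stand-in for the beam search decoder named above, not a replacement for it. The four-symbol alphabet and the position of the blank symbol are hypothetical; the real label order and blank index are assumptions to be confirmed against the downloaded class labels.

```python
def greedy_ctc_decode(frames, labels, blank_index):
    """Best-path CTC decoding: argmax per timestep, collapse repeats, drop blanks."""
    decoded = []
    previous = None
    for scores in frames:
        # Index of the highest-scoring class at this timestep.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a symbol only when it differs from the previous timestep
        # and is not the blank symbol.
        if best != previous and best != blank_index:
            decoded.append(labels[best])
        previous = best
    return "".join(decoded)


# Hypothetical toy alphabet with the blank symbol last; the model has 29 classes.
labels = ["a", "b", "c", "_"]
blank = 3

frames = [
    [0.9, 0.0, 0.0, 0.1],  # a
    [0.8, 0.1, 0.0, 0.1],  # a again (repeat, collapsed)
    [0.1, 0.0, 0.0, 0.9],  # blank
    [0.7, 0.1, 0.1, 0.1],  # a (new symbol, separated by a blank)
    [0.1, 0.8, 0.0, 0.1],  # b
]
print(greedy_ctc_decode(frames, labels, blank))  # prints "aab"
```

For this model, `frames` would be the 148 rows of 29 scores taken from the output tensor. Because the function only compares scores within a timestep, it can work on the raw INT8 output as well as on dequantized values, provided the output scale is positive.
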