Merge pull request #22 from ARM-software/wav2letter_pruned
Added Wav2letter Pruned INT8
tom-arm authored May 18, 2021
2 parents 35040a9 + 1a92aa0 commit ed37a3b
Showing 8 changed files with 181 additions and 0 deletions.
9 changes: 9 additions & 0 deletions README.md
@@ -259,6 +259,15 @@
<th width="100">Mali GPU</th>
<th width="100">Ethos U</th>
</tr>
<tr>
<td><a href="models/speech_recognition/wav2letter/tflite_pruned_int8">Wav2letter Pruned INT8</a></td>
<td align="center">INT8</td>
<td align="center">TensorFlow Lite</td>
<td align="center">:heavy_check_mark:</td>
<td align="center">:heavy_check_mark:</td>
<td align="center">:heavy_check_mark:</td>
<td align="center">:heavy_check_mark:</td>
</tr>
<tr>
<td><a href="models/speech_recognition/wav2letter/tflite_int8">Wav2letter INT8</a></td>
<td align="center">INT8</td>
73 changes: 73 additions & 0 deletions models/speech_recognition/wav2letter/tflite_pruned_int8/README.md
@@ -0,0 +1,73 @@
# Wav2letter Pruned INT8

## Description
Wav2letter is a convolutional speech recognition neural network. This implementation was created by Arm, pruned to 50% sparsity, fine-tuned and quantized using the TensorFlow Model Optimization Toolkit.

## License
[Apache-2.0](https://spdx.org/licenses/Apache-2.0.html)

## Related Materials
### Class Labels
The class labels associated with this model can be generated by running the script `get_class_labels.sh`, which writes them to `labelmappings.txt`.

## Network Information
| Network Information | Value |
|---------------------|----------------|
| Framework | TensorFlow Lite |
| SHA-1 Hash | e389797705f5f8a7973c3280954dd5cdf54284a1 |
| Size (Bytes) | 23815520 |
| Provenance | https://github.com/ARM-software/ML-zoo/tree/master/models/speech_recognition/wav2letter/tflite_pruned_int8 |
| Paper | https://arxiv.org/abs/1609.03193 |
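
The hash and size above can be used to check a downloaded copy of the model. The following is a minimal sketch, not part of this repository: the helper names `sha1_of_file` and `matches_table` are hypothetical, and the path passed in is assumed to point at a local copy of the `.tflite` file.

```python
import hashlib
import os

# Values copied from the Network Information table above.
EXPECTED_SHA1 = "e389797705f5f8a7973c3280954dd5cdf54284a1"
EXPECTED_SIZE = 23815520


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_table(path, expected_sha1=EXPECTED_SHA1, expected_size=EXPECTED_SIZE):
    """True if the file's size and SHA-1 both match the expected values."""
    return (os.path.getsize(path) == expected_size
            and sha1_of_file(path) == expected_sha1)
```

A size or hash mismatch on a freshly cloned file may simply mean that only the Git LFS pointer was checked out rather than the model itself.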

## Performance
| Platform | Optimized |
|----------|:---------:|
| Cortex-A |:heavy_check_mark: |
| Cortex-M |:heavy_check_mark: |
| Mali GPU |:heavy_check_mark: |
| Ethos U |:heavy_check_mark: |

### Key
* :heavy_check_mark: - Will run on this platform.
* :heavy_multiplication_x: - Will not run on this platform.

## Accuracy
Dataset: LibriSpeech

| Metric | Value |
|--------|-------|
| LER | 0.07981431 |
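
The evaluation procedure behind this figure is not described here. As an illustration only, the sketch below assumes LER means letter error rate, computed as the total character-level edit distance divided by the total number of reference characters. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def letter_error_rate(references, hypotheses):
    """Total character edits divided by total reference characters."""
    edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return edits / total
```

Under this assumed definition, a value of about 0.08 would correspond to roughly 8 character edits per 100 reference characters.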

## Optimizations
| Optimization | Value |
|--------------|---------|
| Quantization | INT8 |
| Sparsity | 50% |
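
How the 50% figure was measured is not stated here. One common reading is the fraction of weights that are exactly zero, which the hypothetical helper below computes for a flat sequence of weight values (extracting the weights from the model is outside the scope of this sketch).

```python
def sparsity(weights):
    """Fraction of entries that are exactly zero in a flat sequence of weights."""
    weights = list(weights)
    if not weights:
        raise ValueError("weights must not be empty")
    zeros = sum(1 for w in weights if w == 0)
    return zeros / len(weights)
```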

## Network Inputs
<table>
<tr>
<th width="200">Input Node Name</th>
<th width="100">Shape</th>
<th width="300">Description</th>
</tr>
<tr>
<td>input_2_int8</td>
<td>(1, 296, 39)</td>
<td>Speech converted to MFCCs and quantized to INT8</td>
</tr>
</table>
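
The quantization step mentioned in the description can be sketched as the usual affine mapping `q = round(x / scale) + zero_point`, clamped to the INT8 range. The `scale` and `zero_point` arguments below are placeholders: the real values belong to the model's input tensor (in the TensorFlow Lite Python interpreter they are reported under the `quantization` key of `get_input_details()`), and `quantize_int8` is a hypothetical helper name.

```python
def quantize_int8(values, scale, zero_point):
    """Map floats to INT8 using q = round(x / scale) + zero_point, clamped."""
    quantized = []
    for x in values:
        q = int(round(x / scale)) + zero_point
        # Clamp to the representable INT8 range.
        quantized.append(max(-128, min(127, q)))
    return quantized
```

Applied to every MFCC coefficient, this would yield the values for the `(1, 296, 39)` input tensor.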

## Network Outputs
<table>
<tr>
<th width="200">Output Node Name</th>
<th width="100">Shape</th>
<th width="300">Description</th>
</tr>
<tr>
<td>Identity_int8</td>
<td>(1, 1, 148, 29)</td>
<td>A tensor of (batch, time, class probabilities) that represents the probability of each class at each timestep. Should be passed to a decoder e.g. ctc_beam_search_decoder.</td>
</tr>
</table>
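
As a simpler stand-in for a beam search decoder, the sketch below performs greedy (best-path) CTC decoding: take the highest-scoring class at each timestep, merge consecutive repeats, and drop blanks. It uses the 29-character alphabet from this commit's label script and assumes that the final character, `@`, is the CTC blank, which is not stated in the source. The function name is hypothetical. Because dequantization is monotonic, the argmax can be taken directly on the INT8 values.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz' @"  # same string as the label script
BLANK_INDEX = len(ALPHABET) - 1  # assumption: '@' is the CTC blank


def greedy_ctc_decode(frames, alphabet=ALPHABET, blank_index=BLANK_INDEX):
    """Best-path decode: argmax per frame, merge repeats, drop blanks."""
    decoded = []
    previous = None
    for scores in frames:
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != previous and best != blank_index:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)
```

For an output of shape `(1, 1, 148, 29)`, `frames` would be the 148 rows of 29 scores obtained by indexing away the two leading dimensions.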
@@ -0,0 +1,45 @@
benchmark:
  LibriSpeech:
    LER: 0.07981431
description: Wav2letter is a convolutional speech recognition neural network. This
  implementation was created by Arm, pruned to 50% sparsity, fine-tuned and quantized
  using the TensorFlow Model Optimization Toolkit.
license:
- Apache-2.0
network:
  file_size_bytes: 23815520
  filename: wav2letter_pruned_int8.tflite
  framework: TensorFlow Lite
  hash:
    algorithm: sha1
    value: e389797705f5f8a7973c3280954dd5cdf54284a1
  provenance: https://github.com/ARM-software/ML-zoo/tree/master/models/speech_recognition/wav2letter/tflite_pruned_int8
network_parameters:
  input_nodes:
  - description: Speech converted to MFCCs and quantized to INT8
    example_input:
      path: models/speech_recognition/wav2letter/tflite_pruned_int8/testing_input/input_2_int8
    name: input_2_int8
    shape:
    - 1
    - 296
    - 39
    type: int8
  output_nodes:
  - description: A tensor of (batch, time, class probabilities) that represents the
      probability of each class at each timestep. Should be passed to a decoder e.g.
      ctc_beam_search_decoder.
    name: Identity_int8
    shape:
    - 1
    - 1
    - 148
    - 29
    test_output_path: models/speech_recognition/wav2letter/tflite_pruned_int8/testing_output/Identity_int8
operators:
  TensorFlow Lite:
  - CONV_2D
  - RESHAPE
  - LEAKY_RELU
  - SOFTMAX
paper: https://arxiv.org/abs/1609.03193
@@ -0,0 +1,19 @@
#!/usr/bin/env bash
# Copyright (C) 2021 Arm Limited or its affiliates. All rights reserved.
#
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the License); you may
# not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an AS IS BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Write labelmappings.txt with one class label per line.
python scripts/create_labels.py
@@ -0,0 +1,26 @@
# Copyright (C) 2021 Arm Limited or its affiliates. All rights reserved.
#
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the License); you may
# not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an AS IS BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

def get_label_dict():
    alphabet = "abcdefghijklmnopqrstuvwxyz' @"
    return list(alphabet)

if __name__ == "__main__":
    labels = get_label_dict()

    with open("labelmappings.txt", "w") as f:
        for label in labels:
            f.write('{}\n'.format(label))
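
A consumer of `labelmappings.txt` needs some care because one of the labels is a single space. The sketch below shows one way to read the file back, stripping only the trailing newline so the space label survives. `load_labels` is a hypothetical helper, not part of this commit.

```python
def load_labels(path="labelmappings.txt"):
    """Read one label per line, keeping the space label intact."""
    with open(path) as f:
        # Strip only the trailing newline; str.strip() would erase the " " label.
        return [line.rstrip("\n") for line in f]
```

With the alphabet written by the script above, this returns 29 labels whose order matches the class indices of the model output.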
Git LFS file not shown
Git LFS file not shown
Git LFS file not shown
