automatic speech recognition huggingface pipeline

neeland/asr

Authors: [email protected]

🎙 ASR

Automatic Speech Recognition

Code to run a Hugging Face (🤗) pipeline on Azure machines, transcribing .wav files (converted from the client's .amr files)

TODO: items appear throughout the code & documentation

Client data challenges

  • Strong South African accents → need for fine-tuning
  • Code-switching between English & African languages
  • Some audio is completely inaudible
  • Some audio is entirely in a different language → need for classification here

Solution chosen

Contents

Each folder has a Markdown (.md) file describing the files in that folder

asr
|
├── dev       : development
│ 
├── mount     : mounted input files
│ 
├── prod      : production process
│ 
└── README.md : >> you are here <<

Folder README links

Describes relevant folder's files

▶️ Inference instructions

To transcribe a new batch of client data...

1️⃣ Connect to client SFTP, download necessary data locally, convert .amr to .wav using:

  • Cyberduck, a stand-alone app for SFTP connections, to download the files locally

    • TODO: automate client SFTP → blob process, triggering conversion & inference when new data appears
  • prod/inference/env/setup_amr2wav.sh to set up environment

  • prod/inference/amr2wav.py to convert .amr → .wav

    git clone https://github.com/elucidate-ai/asr
    cd asr/prod/inference
    bash env/setup_inference2csv.sh
    python inference2csv.py
    

Then upload to Azure blob storage
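
The .amr → .wav conversion in step 1 could look roughly like the sketch below. It is not the contents of prod/inference/amr2wav.py. It assumes an ffmpeg build on the PATH that can decode AMR, and a 16 kHz mono target, which many Hugging Face speech models expect. The function names and the sample rate are illustrative assumptions.

```python
import shutil
import subprocess
from pathlib import Path


def wav_path_for(amr_path: Path, out_dir: Path) -> Path:
    """Map e.g. calls/a.amr to <out_dir>/a.wav."""
    return out_dir / (amr_path.stem + ".wav")


def ffmpeg_command(amr_path: Path, wav_path: Path, sample_rate: int = 16000) -> list[str]:
    """Build the ffmpeg call: mono output, resampled, overwriting any existing file."""
    return [
        "ffmpeg", "-y",           # -y: overwrite output without prompting
        "-i", str(amr_path),      # input file
        "-ac", "1",               # one audio channel (mono)
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed 16 kHz)
        str(wav_path),
    ]


def convert_folder(in_dir: Path, out_dir: Path) -> list[Path]:
    """Convert every .amr file in in_dir and return the .wav paths written."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for amr in sorted(in_dir.glob("*.amr")):
        wav = wav_path_for(amr, out_dir)
        subprocess.run(ffmpeg_command(amr, wav), check=True)
        converted.append(wav)
    return converted
```

Calling `convert_folder(Path("downloads"), Path("wav"))` would then convert a local download folder. Both folder names are placeholders.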

2️⃣ Create appropriate Azure GPU machine for inference

  • Azure ML Portal used to create machines (Compute > + New)

    Note on GPU needed

    Only the following Azure ML GPU configurations will work for such large models:

    1 x NVIDIA Tesla P100

    1 x NVIDIA Tesla V100
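
Once the machine is up, a quick check like the sketch below can confirm that it has one of the GPUs listed above. It assumes `nvidia-smi` is installed on the machine. The helper names are illustrative and not part of this repo.

```python
import shutil
import subprocess

# GPU models named in the note above
SUPPORTED_GPUS = ("P100", "V100")


def is_supported_gpu(name: str) -> bool:
    """True if the reported GPU name mentions one of the supported models."""
    return any(model in name for model in SUPPORTED_GPUS)


def detected_gpu_names() -> list[str]:
    """Return GPU names reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    names = detected_gpu_names()
    if not names:
        print("No NVIDIA GPU detected")
    for name in names:
        status = "supported" if is_supported_gpu(name) else "not in the supported list"
        print(f"{name}: {status}")
```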

3️⃣ Connect to the terminal of the machine you just created & clone the repo

4️⃣ Mount Azure storage blob using mount/mount_blob.py

5️⃣ Run production
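
As a rough picture of what a production run does, the sketch below transcribes a folder of .wav files and writes the results to a CSV. It is not the repo's inference2csv.py. The model name is left as a parameter because this README does not say which model is used, and the CSV column names are assumptions. `load_hf_transcriber` needs `transformers`, a backend such as PyTorch, and ffmpeg for reading audio files by path.

```python
import csv
from pathlib import Path
from typing import Callable


def load_hf_transcriber(model_name: str) -> Callable[[str], str]:
    """Wrap a Hugging Face ASR pipeline as a path -> text function."""
    # Imported lazily so the CSV helper below works without transformers installed.
    from transformers import pipeline

    # device=0 selects the first CUDA GPU
    asr = pipeline("automatic-speech-recognition", model=model_name, device=0)
    return lambda wav_path: asr(wav_path)["text"]


def transcribe_to_csv(wav_dir: Path, csv_path: Path,
                      transcribe: Callable[[str], str]) -> int:
    """Transcribe every .wav in wav_dir and return the number of files written."""
    rows = 0
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "transcript"])  # assumed column names
        for wav in sorted(wav_dir.glob("*.wav")):
            writer.writerow([wav.name, transcribe(str(wav))])
            rows += 1
    return rows
```

Passing the transcriber in as a function keeps the CSV logic testable with a stub and makes it easy to swap in a fine-tuned model later.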

Useful links
