doc: Update the doc of multinet
sun-xiangyu committed Mar 6, 2023
1 parent 9a36067 commit a5a0473
Showing 5 changed files with 102 additions and 13 deletions.
30 changes: 24 additions & 6 deletions docs/en/speech_command_recognition/README.rst
@@ -42,19 +42,19 @@ Please see the flow diagram for commands recognition below:

.. _command-requirements:

Requirements of Speech Commands
Format of Speech Commands
-------------------------------

Currently, MultiNet supports up to **200** commands. There are some limitations when designing speech commands:
Different MultiNet versions support different formats:

- Chinese

Use Pinyin for Chinese speech commands, and add a space in between. For example, the Chinese speech command for turning on the air conditioner is "da kai kong tiao"; the Chinese speech command for turning on the green light is "da kai lv se deng".
MultiNet5 and MultiNet6 use Pinyin for Chinese speech commands. Please use :project_file:`tool/multinet_pinyin.py` to convert Chinese text to Pinyin.

- English

Use phonetic symbols for English speech commands, and add a space in between. For example, the English speech command for turning on the light is "TkN nN jc LiT". Users can use the tool provided by us to do the conversion. To find this tool, go to :project_file:`tool/multinet_g2p.py` .

MultiNet5 uses phonemes for English speech commands. For simplicity, each phoneme is denoted by a single character. Please use :project_file:`tool/multinet_g2p.py` to do the conversion.
MultiNet6 uses graphemes for English speech commands, so no conversion is needed.

Suggestions on Customizing Speech Commands
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -73,7 +73,25 @@ When customizing speech command words, please pay attention to the following sug
Speech Commands Customization Methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Multinet supports flexible methods to customize speech commands. Users can do it either online or offline and can also add/delete/modify speech commands dynamically.
To customize speech commands for MultiNet6:

- For English, words are used as units. Edit the text file :project_file:`model/multinet_model/fst/commands_en.txt` using the following format:

::

# command_id command_sentence
1 TELL ME A JOKE
2 MAKE A COFFEE

- For Chinese, Pinyin syllables are used as units. Edit the text file :project_file:`model/multinet_model/fst/commands_cn.txt` using the following format. The script :project_file:`tool/multinet_pinyin.py` helps to convert Chinese text to Pinyin.

::

# command_id command_sentence
1 da kai kong tiao
2 guan bi kong tiao
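The file format above (a numeric command id, a space, then the command sentence) can be read back with a few lines of Python. The following is only a minimal sketch, not part of the repository: the function name `parse_commands` is hypothetical, and treating lines that start with `#` as comments and allowing several sentences to share one id are assumptions rather than documented behavior.

```python
from typing import Dict, List


def parse_commands(text: str) -> Dict[int, List[str]]:
    """Parse 'command_id command_sentence' lines into {id: [sentences]}."""
    commands: Dict[int, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' lines (such as the header) are ignored.
        if not line or line.startswith("#"):
            continue
        # The first whitespace-separated token is the id; the rest is the sentence.
        parts = line.split(maxsplit=1)
        if len(parts) != 2 or not parts[0].isdigit():
            raise ValueError(f"malformed command line: {raw!r}")
        # Collapse repeated spaces so units are separated by exactly one space.
        sentence = " ".join(parts[1].split())
        commands.setdefault(int(parts[0]), []).append(sentence)
    return commands


sample = """\
# command_id command_sentence
1 da kai kong tiao
2 guan bi kong tiao
"""
print(parse_commands(sample))
```

A check like this can catch a missing id or a stray tab before the file is flashed with the model.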

MultiNet5 supports flexible methods to customize speech commands. Users can do it either online or offline and can also add/delete/modify speech commands dynamically.

.. only:: latex

33 changes: 28 additions & 5 deletions docs/zh_CN/speech_command_recognition/README.rst
@@ -42,18 +42,19 @@ The input of MultiNet is audio processed by the audio front-end (AFE) algorithm (format

.. _command-requirements:

Speech Command Design Requirements
Speech Command Format Requirements
----------------------------------

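Currently, MultiNet supports up to **200** speech commands. Speech commands must follow a specific format, as described below:
Different versions of MultiNet use different speech command formats. Speech commands must follow a specific format, as described below: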

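- Chinese

Chinese speech commands must be written in Hanyu Pinyin, with a space between the Pinyin of each character. For example, “打开空调” (turn on the air conditioner) should be written as “da kai kong tiao”, and “打开绿色灯” (turn on the green light) should be written as “da kai lv se deng”.
MultiNet5 and MultiNet6 use Hanyu Pinyin as the basic recognition unit, with a space between the Pinyin of each character. For example, “打开空调” (turn on the air conditioner) should be written as “da kai kong tiao”. Use the following tool to convert Chinese characters to Pinyin: :project_file:`tool/multinet_pinyin.py`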

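- English

English speech commands must be written in specific phonetic symbols, with the symbols of each word separated by spaces. For example, “turn on the light” should be written as “TkN nN jc LiT”. You can use the tool we provide for the conversion; see :project_file:`tool/multinet_g2p.py` .
MultiNet5: uses phonemes as the basic recognition unit. For simplicity, each phoneme is mapped to a single letter. For example, “turn on the light” should be written as “TkN nN jc LiT”. Use the tool we provide for the conversion; see :project_file:`tool/multinet_g2p.py` .
MultiNet6: uses subwords as the recognition unit, so users can enter the desired phrase directly. For example, “turn on the light” is simply written as “turn on the light”.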


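Customization Requirements
@@ -83,7 +84,29 @@ MultiNet supports multiple flexible methods for setting speech commands, which can be done online or off
Setting Speech Commands Offline
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^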

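MultiNet supports two methods for setting speech commands offline:
Methods for setting speech commands offline with MultiNet6:

- For Chinese, edit :project_file:`model/multinet_model/fst/commands_cn.txt`

The format is as follows: the leading number is the command id, followed by the Pinyin of the command. The two are separated by a space, and the Pinyin syllables are also separated by spaces.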

::

# command_id command_sentence
1 da kai kong tiao
2 guan bi kong tiao

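- For English, edit :project_file:`model/multinet_model/fst/commands_en.txt`

The format is as follows: the leading number is the command id, followed by the English phrase of the command. The two are separated by a space, and the words are also separated by spaces.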

::

# command_id command_sentence
1 TELL ME A JOKE
2 MAKE A COFFEE

MultiNet5 supports two methods for setting speech commands offline:

- Via ``menuconfig``

2 changes: 1 addition & 1 deletion tool/README.md
@@ -10,7 +10,7 @@ For English, words are used as units. Please prepare a list of commands written
2 MAKE A COFFEE
```

For Chinese, pinyin are used as units. Please prepare a list of commands written in a text file `commands_cn.txt` of the following format:
For Chinese, Pinyin syllables are used as units. [multinet_pinyin.py](./multinet_pinyin.py) helps to convert Chinese text to Pinyin. Please prepare a list of commands written in a text file `commands_cn.txt` of the following format:
```
# command_id command_sentence
1 da kai kong tiao
47 changes: 47 additions & 0 deletions tool/multinet_pinyin.py
@@ -0,0 +1,47 @@
from pypinyin import lazy_pinyin
from pypinyin import load_phrases_dict
from pypinyin import load_single_dict
from pypinyin_dict.phrase_pinyin_data import large_pinyin
import argparse


phrases_dict = {
'开户行': [['ka1i'], ['hu4'], ['hang2']],
'发卡行': [['fa4'], ['ka3'], ['hang2']],
'放款行': [['fa4ng'], ['kua3n'], ['hang2']],
'茧行': [['jia3n'], ['hang2']],
'行号': [['hang2'], ['ha4o']],
'各地': [['ge4'], ['di4']],
'借还款': [['jie4'], ['hua2n'], ['kua3n']],
'时间为': [['shi2'], ['jia1n'], ['we2i']],
'为准': [['we2i'], ['zhu3n']],
'色差': [['se4'], ['cha1']],
'嗲': [['dia3']],
'呗': [['bei5']],
'不': [['bu4']],
'咗': [['zuo5']],
'嘞': [['lei5']],
'掺和': [['chan1'], ['huo5']]
}

def init_pinyin():
    # Load the large phrase dictionary first, then apply the local overrides.
    large_pinyin.load()
    load_phrases_dict(phrases_dict)
    load_single_dict({22320: u'de,di4'})  # 地
    load_single_dict({35843: u'tiao2,diao4'})  # 调


def get_pinyin(text):
    # lazy_pinyin returns a list of toneless pinyin strings; join them with spaces.
    label = lazy_pinyin(text)
    label = " ".join(label)

    print("in:", text)
    print("out:", label)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="Chinese Speech Commands")
    # --text is required: without it there is nothing to convert.
    parser.add_argument("--text", "-t", type=str, required=True, help="input text")
    args = parser.parse_args()

    init_pinyin()
    get_pinyin(args.text)
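The script above prints the Pinyin for a single phrase. To produce a whole `commands_cn.txt`, the same conversion could be applied to a list of phrases. The sketch below is an illustration under stated assumptions, not code from this commit: `build_command_lines` is a hypothetical helper, ids are assumed to be assigned sequentially, and the converter is passed in as a callable so the example runs without `pypinyin` installed.

```python
from typing import Callable, Iterable, List


def build_command_lines(phrases: Iterable[str],
                        to_pinyin: Callable[[str], str],
                        start_id: int = 1) -> List[str]:
    """Return the lines of a commands file, one 'id pinyin' entry per phrase."""
    # Header line matching the format used by the documentation.
    lines = ["# command_id command_sentence"]
    # Assumption: ids are assigned sequentially, in input order.
    for command_id, phrase in enumerate(phrases, start=start_id):
        lines.append(f"{command_id} {to_pinyin(phrase)}")
    return lines


# Stand-in converter (hypothetical lookup table) used only for demonstration.
demo_table = {"打开空调": "da kai kong tiao", "关闭空调": "guan bi kong tiao"}
for line in build_command_lines(demo_table, demo_table.__getitem__):
    print(line)
```

With `pypinyin` available, the stand-in could be replaced by `lambda s: " ".join(lazy_pinyin(s))` after calling `init_pinyin()`, which mirrors what `get_pinyin` does before printing.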
3 changes: 2 additions & 1 deletion tool/requirements
@@ -1,2 +1,3 @@
g2p-en
sentencepiece==0.1.97
pypinyin
pypinyin_dict
