As in most voice assistants, Naomi's intents serve multiple purposes. First, the speech-to-text system uses them to prepare a dictionary of words to recognize. Next, they are converted into a language model that helps the speech-to-text system guess what it is most likely hearing, given the likelihoods of different arrangements of words. Finally, the text-to-intent system uses them to figure out which intent to trigger.
When developing the format for creating grammars for Naomi speechhandler plugins, I created a structured format in which the grammar is split into keywords and phrases, with keywords providing a list of options within a phrase. This was similar to the way grammars are constructed for the intent parsing systems I have looked at, and it was a simple way to move Naomi from simply spotting keywords to reacting to more complex utterances, but it has a few big problems:
in order to generate a new grammar for a plugin you have to edit the plugin
it is not particularly robust, and we would like to have additional keyword types such as numbers and dates to help developers
it is not standard, and a developer learning to generate a grammar for Naomi is not learning skills that will translate to other projects
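To make the keywords-and-phrases idea concrete, here is a rough sketch of how such a structure can be expanded into the phrase list and word dictionary described above. The intent contents and the `expand` and `vocabulary` helpers are made up for illustration and are not taken from Naomi's code.

```python
import itertools
import re

# Hypothetical intent in a keywords-plus-templates layout (illustrative names only).
INTENT = {
    "keywords": {
        "ForecastKeyword": ["weather", "forecast"],
        "DayKeyword": ["today", "tomorrow"],
    },
    "templates": [
        "what is the {ForecastKeyword} {DayKeyword}",
        "will it rain {DayKeyword}",
    ],
}

SLOT = re.compile(r"\{(\w+)\}")


def expand(intent):
    """Return every phrase the templates can produce from the keyword options."""
    phrases = []
    for template in intent["templates"]:
        slots = SLOT.findall(template)
        options = [intent["keywords"][name] for name in slots]
        # product() yields one empty tuple when a template has no slots,
        # so slot-free templates pass through unchanged.
        for combo in itertools.product(*options):
            phrase = template
            for name, word in zip(slots, combo):
                phrase = phrase.replace("{" + name + "}", word, 1)
            phrases.append(phrase)
    return phrases


def vocabulary(phrases):
    """Collect the distinct words a speech-to-text dictionary would need."""
    return sorted({word for phrase in phrases for word in phrase.split()})


if __name__ == "__main__":
    for phrase in expand(INTENT):
        print(phrase)
    print(vocabulary(expand(INTENT)))
```

Because the keywords and templates live in a Python structure inside the plugin, changing the grammar means editing the plugin, which is the first problem listed above.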
There are a few grammar formats out there: JSGF, Nuance, ANTLR, SRGS, etc. SRGS is a W3C specification, but I see very little support for it. JSGF has also been published by the W3C (as a Note), has been around a long time, and there is a pyJSGF library on PyPI which could be helpful. DeepSpeech/Coqui can use JSGF files directly, so I propose that we use the JSGF grammar format for building Naomi intents, unless someone has a reason to prefer a different format.
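As a rough idea of what the migration could look like, the sketch below turns a keywords-plus-templates intent into JSGF text using only the standard library. The `to_jsgf` helper and the sample intent are hypothetical, and the mapping (one public rule for the templates, one private rule per keyword) is just one possible choice, not an agreed design.

```python
import re


def to_jsgf(name, keywords, templates):
    """Render a keywords-plus-templates intent as a JSGF grammar string."""
    lines = ["#JSGF V1.0;", f"grammar {name};"]
    # {Keyword} placeholders become JSGF rule references like <Keyword>.
    alternatives = [re.sub(r"\{(\w+)\}", r"<\1>", t) for t in templates]
    body = " | ".join(f"({alt})" for alt in alternatives)
    lines.append(f"public <{name}> = {body};")
    # Each keyword list becomes a private rule of alternatives.
    for keyword, options in keywords.items():
        lines.append(f"<{keyword}> = " + " | ".join(options) + ";")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_jsgf(
        "weather",
        {"ForecastKeyword": ["weather", "forecast"],
         "DayKeyword": ["today", "tomorrow"]},
        ["what is the {ForecastKeyword} {DayKeyword}",
         "will it rain {DayKeyword}"],
    ))
```

The resulting text could live in a standalone grammar file next to the plugin, so the grammar can be changed without touching the plugin code.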
aaronchantrill changed the title from "Use a standard grammar format" to "Use JSGF for writing intents" on Jun 26, 2022
I think that SRGS is going to end up being a better choice. It's a bear to write and not very intuitive, but it is standard, and I get the feeling that we will be seeing more of it in the future. It also supports named slots, which JSGF does not, making it more appropriate than JSGF for writing intent templates. It can also provide lists of different ways of saying things so that Naomi can generate semi-random responses, which is one of the things I also wanted to use JSGF for. This is also a place where we could have an impact, since there does not currently appear to be a Python package for parsing SRGS files.
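For a feel of the XML form of SRGS, here is a minimal sketch that writes a grammar of response variants with Python's standard `xml.etree.ElementTree`, reads the alternatives back, and picks one at random. The `to_srgs` and `alternatives` helpers and the greeting rule are invented for this example; it covers only `rule`, `one-of` and `item`, leaves out slot filling through `tag` elements, and is nowhere near a full SRGS parser.

```python
import random
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"


def to_srgs(root_rule, options_by_rule, lang="en-US"):
    """Build a small SRGS XML grammar: one <rule> of <one-of> items per entry."""
    grammar = ET.Element("grammar", {
        "version": "1.0",
        "xmlns": SRGS_NS,
        "xml:lang": lang,
        "root": root_rule,
        "mode": "voice",
    })
    for rule_id, options in options_by_rule.items():
        rule = ET.SubElement(grammar, "rule", {"id": rule_id})
        one_of = ET.SubElement(rule, "one-of")
        for option in options:
            ET.SubElement(one_of, "item").text = option
    return ET.tostring(grammar, encoding="unicode")


def alternatives(srgs_xml, rule_id):
    """Return the <item> texts of the named rule, or an empty list."""
    ns = {"g": SRGS_NS}
    root = ET.fromstring(srgs_xml)
    for rule in root.findall("g:rule", ns):
        if rule.get("id") == rule_id:
            return [item.text for item in rule.findall("g:one-of/g:item", ns)]
    return []


if __name__ == "__main__":
    xml_text = to_srgs("greeting", {"greeting": ["hello", "hi there", "good day"]})
    print(xml_text)
    # Semi-random response: choose one of the listed ways of saying it.
    print(random.choice(alternatives(xml_text, "greeting")))
```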