My current thinking is that in order to support testing from audio input, we need to look at the end-state of a conversation, not the intermediate state of NLU results.
This raises the question of what we can really test from audio. While I believe it is still useful to test intents, it is challenging to test entities, and the only real way to validate that the correct entities were detected is to look at an end-state (such as a fully resolved / disambiguated entity from text).
It may be that we want to recommend an alternative: testing from labeled, ASR-generated text rather than from the audio itself.
Are you suggesting we remove support for audio completely? Or keep it for testing intents but stop supporting entities?
I like the idea of labeling the ASR-generated text instead, because there is still value in the comparison NLU.DevOps provides. However, this would be measuring something slightly different. What if the expected text is:
"call eric on his cell" -> intent : Call_Person
And what we get from ASR is, for some reason, completely different, like:
"Caloric on his cell" -> how would we label this intent? Would it still be "Call_Person", or "None"?
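To make the labeling question concrete, here is a rough sketch (not anything NLU.DevOps does today) of how a test case could carry both the expected text and the ASR text, so that a badly transcribed utterance can be flagged for a human to decide on instead of being silently scored against "Call_Person". The `LabeledUtterance` type, the function names, and the 0.3 threshold are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LabeledUtterance:
    """Hypothetical test case: what was said, what ASR heard, and the intent label."""
    reference_text: str
    asr_text: str
    expected_intent: str


def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance, ignoring case."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the ASR output
                current[j - 1] + 1,      # extra word in the ASR output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_error_rate(utterance: LabeledUtterance) -> float:
    """Edit distance divided by the number of words in the reference text."""
    reference_length = len(utterance.reference_text.split())
    if reference_length == 0:
        return 0.0 if not utterance.asr_text.split() else 1.0
    distance = word_edit_distance(utterance.reference_text, utterance.asr_text)
    return distance / reference_length


def needs_review(utterance: LabeledUtterance, threshold: float = 0.3) -> bool:
    """Flag utterances whose ASR text drifted too far from the expected text.

    The threshold is an arbitrary assumption, not a recommended value.
    """
    return word_error_rate(utterance) > threshold


example = LabeledUtterance(
    reference_text="call eric on his cell",
    asr_text="Caloric on his cell",
    expected_intent="Call_Person",
)
print(word_error_rate(example), needs_review(example))
```

This does not answer whether the label should be "Call_Person" or "None"; it only separates the cases where the ASR text is close enough for the original label to stand from the cases where someone has to make that call.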
The main challenge here is described in #241