diff --git a/README.md b/README.md
index da57a7b..5dc8903 100644
--- a/README.md
+++ b/README.md
@@ -38,6 +38,7 @@ Detailed information on the CLI tool sub-commands and arguments can be found in
 - [Tearing down an NLU model](docs/Clean.md)
 - [Analyzing NLU model results](docs/Compare.md)
 - [Generic utterances model](docs/GenericUtterances.md)
+- [Extending the generic utterance model](docs/UtteranceExtensions.md)
 - [LUIS model configuration](docs/LuisModelConfiguration.md)
 - [LUIS endpoint configuration](docs/LuisEndpointConfiguration.md)
 - [Lex bot configuration](docs/LexModelConfiguration.md)
diff --git a/_config.yml b/_config.yml
index 9beea28..07f4685 100644
--- a/_config.yml
+++ b/_config.yml
@@ -7,6 +7,7 @@ relative_links:
 include:
   - README.md
   - docs/GenericUtterances.md
+  - docs/UtteranceExtensions.md
   - docs/Train.md
   - docs/Test.md
   - docs/Clean.md
diff --git a/docs/UtteranceExtensions.md b/docs/UtteranceExtensions.md
new file mode 100644
index 0000000..03721c6
--- /dev/null
+++ b/docs/UtteranceExtensions.md
@@ -0,0 +1,67 @@
+# Extending the generic utterance model
+
+The [generic utterance model](GenericUtterances.md) only covers the text, intent and entities for a given NLU scenario. For some NLU providers, we may want to include additional context, such as a confidence score for the intent prediction or a timestamp for when the request was made. This document covers some of the common extensions to the generic utterance model used across the LUIS, Lex, and Dialogflow NLU providers.
+
+## Returning confidence scores for text, intents, and entities
+
+When an NLU provider in NLU.DevOps returns a prediction result, the value will be serialized as-is, meaning any additional properties included in the result will be serialized as well. Currently, LUIS and Dialogflow return text transcription and intent confidence scores in `textScore` and `score` properties, respectively. E.g.:
+```json
+{
+  "text": "play a rock song",
+  "intent": "PlayMusic",
+  "entities": [
+    {
+      "matchText": "rock",
+      "entityType": "genre",
+      "score": 0.80
+    }
+  ],
+  "score": 0.99,
+  "textScore": 0.95
+}
+```
+In this case, the intent confidence score was `0.99` and the text transcription confidence score was `0.95`. This is useful context when debugging false predictions, as a low confidence score may indicate that the model could be improved with more training examples. The recognized `genre` entity also includes a confidence score of `0.80`, although it should be noted that only the LUIS provider currently returns confidence score for entity types trained from examples.
+
+## Labeled utterance timestamps
+
+When analyzing results for a set of NLU predictions, it is often important context to understand when the test was run. For example, for Dialogflow `date` and `time` entities, the service only returns a date time string, and no indication of what token(s) triggered that entity to be recognized. For example, the result from a query like `"Call a taxi in 15 minutes"` may look like the following:
+```json
+{
+  "text": "call a taxi in 15 minutes",
+  "intent": "ScheduleTaxi",
+  "entities": [
+    {
+      "entityType": "time",
+      "entityValue": "2020-01-01T00:15:00-04:00"
+    }
+  ],
+  "timestamp": "2020-01-01T00:00:00-04:00"
+}
+```
+Without the context provided by the `timestamp` property, we wouldn't be able to make any assertion about the correctness of the `entityValue` property for time. Currently, LUIS, Lex, and Dialogflow return a timestamp for each prediction result.
+
+## Utterance Extension Properties
+
+### `score`
+
+The confidence score for the intent in the NLU prediction.
+
+### `textScore`
+
+The confidence score for the text transcription, in case the test was run from speech.
+
+### `timestamp`
+
+The timestamp for when the NLU prediction was made.
+
+### `utteranceId`
+
+Used for aggregation of [`compare`](Compare.md) command results.
+
+Each NLU prediction may produce multiple classification results (e.g., true positive intent and true negative entities). When an `utteranceId` is provided on a given utterance model, it will be included as metadata for each classification result produced when running the [`compare`](Compare.md) command.
+
+## Entity Extension Properties
+
+### `score`
+
+The confidence score for the entity in the NLU prediction.
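
As a rough illustration of how the extension properties documented in this diff might be consumed, the sketch below loads the two JSON samples from the new doc, lists which confidence scores fall below a cutoff, and measures how far a `time` entity value lies from the prediction `timestamp`. The helper names `low_confidence_fields` and `time_entity_offsets` and the `0.9` threshold are hypothetical choices made for this example; they are not part of NLU.DevOps.

```python
import json
from datetime import datetime, timedelta

# The two sample results from docs/UtteranceExtensions.md, copied verbatim.
MUSIC_RESULT = json.loads("""
{
  "text": "play a rock song",
  "intent": "PlayMusic",
  "entities": [
    {"matchText": "rock", "entityType": "genre", "score": 0.80}
  ],
  "score": 0.99,
  "textScore": 0.95
}
""")

TAXI_RESULT = json.loads("""
{
  "text": "call a taxi in 15 minutes",
  "intent": "ScheduleTaxi",
  "entities": [
    {"entityType": "time", "entityValue": "2020-01-01T00:15:00-04:00"}
  ],
  "timestamp": "2020-01-01T00:00:00-04:00"
}
""")


def low_confidence_fields(utterance, threshold=0.9):
    """Name every score property that is present and below `threshold`.

    Hypothetical helper; the threshold is an arbitrary value for illustration.
    """
    flagged = []
    # Utterance-level extensions: intent score and text transcription score.
    for key in ("score", "textScore"):
        value = utterance.get(key)
        if value is not None and value < threshold:
            flagged.append(key)
    # Entity-level extension: per-entity score, when the provider returns one.
    for entity in utterance.get("entities", []):
        value = entity.get("score")
        if value is not None and value < threshold:
            flagged.append(f"entities[{entity.get('entityType')}].score")
    return flagged


def time_entity_offsets(utterance):
    """Return the offset of each `time` entity value from the `timestamp`.

    Hypothetical helper; assumes both values are ISO 8601 strings as in the sample.
    """
    base = datetime.fromisoformat(utterance["timestamp"])
    return [
        datetime.fromisoformat(entity["entityValue"]) - base
        for entity in utterance.get("entities", [])
        if entity.get("entityType") == "time"
    ]


if __name__ == "__main__":
    print(low_confidence_fields(MUSIC_RESULT))
    print(time_entity_offsets(TAXI_RESULT))
```

With the `timestamp` available, the taxi sample can be checked against the expected 15 minute offset; without it, the `entityValue` alone could not be judged correct or incorrect.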