Skip to content

Commit

Permalink
Adds documentation for utterance extensions
Browse files Browse the repository at this point in the history
Includes documentation on confidence scores and timestamps that are now included in results from LUIS, Lex, and Dialogflow.
  • Loading branch information
rozele committed Oct 11, 2019
1 parent 564e136 commit 2be6d93
Show file tree
Hide file tree
Showing 3 changed files with 69 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ Detailed information on the CLI tool sub-commands and arguments can be found in
- [Tearing down an NLU model](docs/Clean.md)
- [Analyzing NLU model results](docs/Compare.md)
- [Generic utterances model](docs/GenericUtterances.md)
- [Extending the generic utterance model](docs/UtteranceExtensions.md)
- [LUIS model configuration](docs/LuisModelConfiguration.md)
- [LUIS endpoint configuration](docs/LuisEndpointConfiguration.md)
- [Lex bot configuration](docs/LexModelConfiguration.md)
Expand Down
1 change: 1 addition & 0 deletions _config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ relative_links:
include:
- README.md
- docs/GenericUtterances.md
- docs/UtteranceExtensions.md
- docs/Train.md
- docs/Test.md
- docs/Clean.md
Expand Down
67 changes: 67 additions & 0 deletions docs/UtteranceExtensions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
# Extending the generic utterance model

The [generic utterance model](GenericUtterances.md) only covers the text, intent and entities for a given NLU scenario. For some NLU providers, we may want to include additional context, such as a confidence score for the intent prediction or a timestamp for when the request was made. This document covers some of the common extensions to the generic utterance model used across the LUIS, Lex, and Dialogflow NLU providers.

## Returning confidence scores for text, intents, and entities

When an NLU provider in NLU.DevOps returns a prediction result, the value will be serialized as-is, meaning any additional properties included in the result will be serialized as well. Currently, LUIS and Dialogflow return text transcription and intent confidence scores in `textScore` and `score` properties, respectively. E.g.:
```json
{
"text": "play a rock song",
"intent": "PlayMusic",
"entities": [
{
"matchText": "rock",
"entityType": "genre",
"score": 0.80
}
],
"score": 0.99,
"textScore": 0.95
}
```
In this case, the intent confidence score was `0.99` and the text transcription confidence score was `0.95`. This is useful context when debugging false predictions, as a low confidence score may indicate that the model could be improved with more training examples. The recognized `genre` entity also includes a confidence score of `0.80`, although it should be noted that only the LUIS provider currently returns confidence score for entity types trained from examples.

## Labeled utterance timestamps

When analyzing results for a set of NLU predictions, it is often important context to understand when the test was run. For example, for Dialogflow `date` and `time` entities, the service only returns a date time string, and no indication of what token(s) triggered that entity to be recognized. For example, the result from a query like `"Call a taxi in 15 minutes"` may look like the following:
```json
{
"text": "call a taxi in 15 minutes",
"intent": "ScheduleTaxi",
"entities": [
{
"entityType": "time",
"entityValue": "2020-01-01T00:15:00-04:00"
}
],
"timestamp": "2020-01-01T00:00:00-04:00"
}
```
Without the context provided by the `timestamp` property, we wouldn't be able to make any assertion about the correctness of the `entityValue` property for time. Currently, LUIS, Lex, and Dialogflow return a timestamp for each prediction result.

## Utterance Extension Properties

### `score`

The confidence score for the intent in the NLU prediction.

### `textScore`

The confidence score for the text transcription, in case the test was run from speech.

### `timestamp`

The timestamp for when the NLU prediction was made.

### `utteranceId`

Used for aggregation of [`compare`](Compare.md) command results.

Each NLU prediction may produce multiple classification results (e.g., true positive intent and true n egative entities). When an `utteranceId` is provided on a given utterance model, it will be included as metadata for each classification result produced when running the [`compare`](Compare.md) command.

## Entity Extension Properties

### `score`

The confidence score for the entity in the NLU prediction.

0 comments on commit 2be6d93

Please sign in to comment.