-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add documentation related to citation and sensitivity label (#5299)
* Add citation doc * Add design decisions * Update graph * Update graph * Change to "EncryptedContent" * Verbiage * Add more terminology * Verbiage * Verbiage * Verbiage * Fix quotes * Add screenshot * Verbiage * Move sections * Apply suggestions from code review Co-authored-by: Eugene <[email protected]> * Typo --------- Co-authored-by: Eugene <[email protected]>
- Loading branch information
Showing
3 changed files
with
206 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,206 @@ | ||
# Citation | ||
|
||
This document describes how citation works in Web Chat. | ||
|
||
## Background | ||
|
||
Adding citation in a message helps customers better understand the response. | ||
|
||
Web Chat added citation support since [4.16.0](../CHANGELOG.md#4160---2023-11-16). | ||
|
||
## Terminology | ||
|
||
### Markdown reference style link | ||
|
||
A type of link in Markdown that is placed below the paragraph. Reduce repetitiveness when multiple words shared the same link and improve readability for plain text users. | ||
|
||
CommonMark-compliant Markdown engines will inline the link into the paragraph and should not render it differently than normal inline links. | ||
|
||
### Sensitivity label | ||
|
||
Optional metadata indicating how the subject should be handled, often used to specify the confidentiality level or intended audience scope. | ||
|
||
### Activity graph | ||
|
||
Activity graph is the collective of all Schema.org things (a.k.a. objects) inside `entities` field. These things must have `type` field under https://schema.org domain and conform to JSON-LD specification. | ||
|
||
The activity graph should have a [Message thing](#message-thing). | ||
|
||
### Message thing | ||
|
||
A [Message thing](https://schema.org/Message) represent the message activity itself and act as the root of the [activity graph](#activity-graph). It must have the following fields: | ||
|
||
- `@context` of `"https://schema.org"` | ||
- `@id` of `""` (an empty string means self in JSON-LD fashion) | ||
- `@type` of `"Message"` | ||
- `type` of `"https://schema.org/Message"` | ||
|
||
### Non-URL citation | ||
|
||
Citation which does not have a link to point the user to. But part or whole of the cited content is attached in the activity. | ||
|
||
## Bot implementation | ||
|
||
This section outlines the key components and steps for implementing citations in bot responses. By following these guidelines, developers can ensure their bots generate properly formatted citations that can be rendered effectively in Web Chat and other supported channels. | ||
|
||
Bot developers should implement citations as outlined in this section to ensure proper functionality. | ||
|
||
### Activity text | ||
|
||
`activity.text` must be in Markdown format and links for citation must format as [reference style link](#markdown-reference-style-link). | ||
|
||
#### Sample payload | ||
|
||
Notes: | ||
|
||
- The third citation is a non-URL citation, its link `cite:1` is currently ignored | ||
|
||
``` | ||
Sure, you should override the default proxy settings[1][2], when your proxy server requires authentication[3]. | ||
[1]: https://support.microsoft.com/en-us/windows/use-a-proxy-server-in-windows-03096c53-0554-4ffe-b6ab-8b1deee8dae1 "Use a proxy server in Windows" | ||
[2]: https://learn.microsoft.com/en-us/troubleshoot/windows-server/networking/configure-proxy-server-settings "Configure proxy server settings - Windows Server" | ||
[3]: cite:1 "Introduction Configuring proxy settings is a fundamental aspect..." | ||
``` | ||
|
||
#### Sample output | ||
|
||
![Sample citation output](../media/citation-example.png) | ||
|
||
### Message object | ||
|
||
![Direct Line activity graph](../media/direct-line-activity-graph.png) | ||
|
||
Please refer to the graph for details of each fields. Notably: | ||
|
||
- Only compact from is supported (i.e. nested objects), other forms and object references are not supported unless stated otherwise | ||
- Subclasses are not supported. If the object is expected to be `Message`, it must not be `EmailMessage` (subclass) | ||
|
||
#### Sample payload | ||
|
||
```json | ||
{ | ||
"@context": "https://schema.org", | ||
"@id": "", | ||
"@type": "Message", | ||
"citation": [ | ||
{ | ||
"@id": "https://support.microsoft.com/en-us/windows/use-a-proxy-server-in-windows-03096c53-0554-4ffe-b6ab-8b1deee8dae1", | ||
"@type": "Claim", | ||
"appearance": { | ||
"@type": "DigitalDocument", | ||
"encodingFormat": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", | ||
"url": "https://support.microsoft.com/en-us/windows/use-a-proxy-server-in-windows-03096c53-0554-4ffe-b6ab-8b1deee8dae1", | ||
"usageInfo": { | ||
"@type": "CreativeWork", | ||
"additionalType": "https://copilotstudio.microsoft.com/sensitivity-labels/v1", | ||
"name": "General" | ||
} | ||
}, | ||
"claimInterpreter": { | ||
"@type": "Project", | ||
"slogan": "Surfaced with Azure OpenAI", | ||
"url": "https://www.microsoft.com/en-us/ai/responsible-ai" | ||
}, | ||
"position": "1" | ||
}, | ||
{ | ||
"@id": "https://learn.microsoft.com/en-us/troubleshoot/windows-server/networking/configure-proxy-server-settings", | ||
"@type": "Claim", | ||
"appearance": { | ||
"@type": "DigitalDocument", | ||
"dateModified": "2024-03-01T15:56:27.000-0800", | ||
"editor": "William Wong", | ||
"encodingFormat": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", | ||
"url": "https://learn.microsoft.com/en-us/troubleshoot/windows-server/networking/configure-proxy-server-settings" | ||
}, | ||
"claimInterpreter": { | ||
"@type": "Project", | ||
"slogan": "Surfaced with Azure OpenAI", | ||
"url": "https://www.microsoft.com/en-us/ai/responsible-ai" | ||
}, | ||
"position": "2" | ||
}, | ||
{ | ||
"@id": "_:c1", | ||
"@type": "Claim", | ||
"appearance": { | ||
"@type": "DigitalDocument", | ||
"abstract": "Aute Lorem id laboris Lorem do dolor mollit. Officia dolore dolor do culpa nostrud velit officia...", | ||
"encodingFormat": "text/markdown", | ||
"name": "Sample Citation From File", | ||
"text": "Aute Lorem id laboris Lorem do dolor mollit. Officia dolore dolor do culpa nostrud velit officia magna ut aute pariatur excepteur ut cupidatat.", | ||
"usageInfo": { | ||
"@id": "_:s1", | ||
"@type": "CreativeWork", | ||
"additionalType": "https://copilotstudio.microsoft.com/sensitivity-labels/v1", | ||
"description": "Data is classified as Confidential but is NOT PROTECTED to allow access by approved NDA business partners. If a higher level of protection is needed please change the sensitivity level of the cited content.", | ||
"keywords": ["EncryptedContent"], | ||
"name": "Confidential\\Any User (No Protection)", | ||
"pattern": { | ||
"@type": "DefinedTermSet", | ||
"inDefinedTermSet": "https://www.w3.org/TR/css-color-4/", | ||
"name": "color", | ||
"termCode": "orange" | ||
} | ||
} | ||
}, | ||
"claimInterpreter": { | ||
"@type": "Project", | ||
"slogan": "Surfaced with Azure OpenAI", | ||
"url": "https://www.microsoft.com/en-us/ai/responsible-ai" | ||
}, | ||
"position": "3" | ||
} | ||
], | ||
"keywords": ["AIGeneratedContent"], | ||
"type": "https://schema.org/Message", | ||
"usageInfo": { | ||
"@id": "_:s1" | ||
} | ||
} | ||
``` | ||
|
||
## Design decisions | ||
|
||
The following design decisions were made to optimize the citation implementation for our needs. These choices reflect our approach to balancing technical requirements with user experience considerations. | ||
|
||
### Markdown reference style link | ||
|
||
This type of link allows multiple words/badges to reference to the same link, which is common when citing sources. It also helps readability when users are on text channels (email/SMS). | ||
|
||
According to CommonMark, Markdown engines should not render the reference style links separately. Channels which does not support citation should continue rendering links normally. | ||
|
||
Web Chat appended an additional UI component below the Markdown component. It visualizes the reference style link in a footnote fashion and render sensitivity labels and other UIs. | ||
|
||
### Markdown pure identifiers for badges | ||
|
||
Pure identifiers is a type of link identifiers which does not have a specific word linked. Instead, it use the identifier itself as a link. | ||
|
||
We are rendering pure identifiers as citation badges. | ||
|
||
Example of text without pure identifiers: `Here is the [link][1] about copilot.` would have the word "link" referencing the link indicated by `1` in the reference style link. It would render as "Here is the link about copilot." with the word "link" usually underlined or colored. | ||
|
||
Example of text With pure identifiers, `Here is the link[1] about copilot.` would have the identifier "1" added to the text and we would use this as a badge. The message would render as "Here is the link 1 about copilot." | ||
|
||
In Web Chat, the identifier "1" would be styled as a badge and may have a "open in new tab" icon and additional text alternative for accessibility. | ||
|
||
Note: some Markdown engine treats `link[1][1]` and `link[1]` synonymically. | ||
|
||
### Using `position` instead of `@id` to match links from Markdown to Claim | ||
|
||
We use `position` instead of `@id` to match the link definition in Markdown to the Claim thing in the activity graph. This simplifies the matching process and maintains an ordinal relationship between citations in the text and their corresponding Claims. | ||
|
||
### Source of truths | ||
|
||
If there are deviations of information in Markdown and Message thing, the Message thing should take precedence over the Markdown, given the receiver understood the Message thing. | ||
|
||
### `usageInfo` on the `Message` thing should be a blank node | ||
|
||
In JSON-LD, blank node means a node that does not have any contents but `@id` and is intended for referencing other nodes in the system. | ||
|
||
`Message.usageInfo` is representing sensitivity label for the whole message. The sensitivity label it is referring to is usually the sensitivity label which has highest priority or importance. | ||
|
||
## Further reading | ||
|
||
- [Microsoft Teams: Bot messages with AI-generated content](https://learn.microsoft.com/en-us/microsoftteams/platform/bots/how-to/bot-messages-ai-generated-content?tabs=after%2Cbotmessage#citations) |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.