Commit
1 parent c0137cf, commit b5d8c70
Showing 9 changed files with 527 additions and 9 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,294 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_static.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# How to Set Up a Kili LLM Static Project" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In this tutorial you'll learn how to create a Kili project with a custom interface for comparing LLM outputs, and how to import conversations into it.\n", | ||
"\n", | ||
"\n", | ||
"Here are the steps we will follow:\n", | ||
"\n", | ||
"1. Creating a Kili project with a custom interface\n", | ||
"2. Importing three conversations into the project" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Creating a Kili Project with a Custom Interface" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We will create a Kili project with a custom interface that includes several jobs for comparing LLM outputs.\n", | ||
"\n", | ||
"### Defining Three Levels of Annotation Jobs\n", | ||
"\n", | ||
"To streamline the annotation process, we define three distinct levels of annotation jobs:\n", | ||
"\n", | ||
"- **Completion:** This job enables annotators to evaluate individual responses generated by LLMs. Each response is annotated separately.\n", | ||
"\n", | ||
"- **Round:** This job allows annotators to assess a single round of conversation, grouping all the LLM responses within that round under a single annotation.\n", | ||
"\n", | ||
"- **Conversation:** This job facilitates annotation at the conversation level, where the entire exchange can be evaluated as a whole.\n", | ||
"\n", | ||
"In this example, we use a JSON interface with classification jobs at all three levels, plus a comparison job for picking the best answer and a conversation-level transcription job for free-text comments:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"interface = {\n", | ||
" \"jobs\": {\n", | ||
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL\": {\n", | ||
" \"content\": {\n", | ||
" \"categories\": {\n", | ||
" \"TOO_SHORT\": {\"children\": [], \"name\": \"Too short\", \"id\": \"category1\"},\n", | ||
" \"JUST_RIGHT\": {\"children\": [], \"name\": \"Just right\", \"id\": \"category2\"},\n", | ||
" \"TOO_VERBOSE\": {\"children\": [], \"name\": \"Too verbose\", \"id\": \"category3\"},\n", | ||
" },\n", | ||
" \"input\": \"radio\",\n", | ||
" },\n", | ||
" \"instruction\": \"Verbosity\",\n", | ||
" \"level\": \"completion\",\n", | ||
" \"mlTask\": \"CLASSIFICATION\",\n", | ||
" \"required\": 0,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_1\": {\n", | ||
" \"content\": {\n", | ||
" \"categories\": {\n", | ||
" \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category4\"},\n", | ||
" \"MINOR_ISSUES\": {\"children\": [], \"name\": \"Minor issue(s)\", \"id\": \"category5\"},\n", | ||
" \"MAJOR_ISSUES\": {\"children\": [], \"name\": \"Major issue(s)\", \"id\": \"category6\"},\n", | ||
" },\n", | ||
" \"input\": \"radio\",\n", | ||
" },\n", | ||
" \"instruction\": \"Instructions Following\",\n", | ||
" \"level\": \"completion\",\n", | ||
" \"mlTask\": \"CLASSIFICATION\",\n", | ||
" \"required\": 0,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_2\": {\n", | ||
" \"content\": {\n", | ||
" \"categories\": {\n", | ||
" \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category7\"},\n", | ||
" \"MINOR_INACCURACY\": {\n", | ||
" \"children\": [],\n", | ||
" \"name\": \"Minor inaccuracy\",\n", | ||
" \"id\": \"category8\",\n", | ||
" },\n", | ||
" \"MAJOR_INACCURACY\": {\n", | ||
" \"children\": [],\n", | ||
" \"name\": \"Major inaccuracy\",\n", | ||
" \"id\": \"category9\",\n", | ||
" },\n", | ||
" },\n", | ||
" \"input\": \"radio\",\n", | ||
" },\n", | ||
" \"instruction\": \"Truthfulness\",\n", | ||
" \"level\": \"completion\",\n", | ||
" \"mlTask\": \"CLASSIFICATION\",\n", | ||
" \"required\": 0,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_3\": {\n", | ||
" \"content\": {\n", | ||
" \"categories\": {\n", | ||
" \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category10\"},\n", | ||
" \"MINOR_SAFETY_CONCERN\": {\n", | ||
" \"children\": [],\n", | ||
" \"name\": \"Minor safety concern\",\n", | ||
" \"id\": \"category11\",\n", | ||
" },\n", | ||
" \"MAJOR_SAFETY_CONCERN\": {\n", | ||
" \"children\": [],\n", | ||
" \"name\": \"Major safety concern\",\n", | ||
" \"id\": \"category12\",\n", | ||
" },\n", | ||
" },\n", | ||
" \"input\": \"radio\",\n", | ||
" },\n", | ||
" \"instruction\": \"Harmlessness/Safety\",\n", | ||
" \"level\": \"completion\",\n", | ||
" \"mlTask\": \"CLASSIFICATION\",\n", | ||
" \"required\": 0,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" \"COMPARISON_JOB\": {\n", | ||
" \"content\": {\n", | ||
" \"options\": {\n", | ||
" \"IS_MUCH_BETTER\": {\"children\": [], \"name\": \"Is much better\", \"id\": \"option13\"},\n", | ||
" \"IS_BETTER\": {\"children\": [], \"name\": \"Is better\", \"id\": \"option14\"},\n", | ||
" \"IS_SLIGHTLY_BETTER\": {\n", | ||
" \"children\": [],\n", | ||
" \"name\": \"Is slightly better\",\n", | ||
" \"id\": \"option15\",\n", | ||
" },\n", | ||
" \"TIE\": {\"children\": [], \"name\": \"Tie\", \"mutual\": True, \"id\": \"option16\"},\n", | ||
" },\n", | ||
" \"input\": \"radio\",\n", | ||
" },\n", | ||
" \"instruction\": \"Pick the best answer\",\n", | ||
" \"mlTask\": \"COMPARISON\",\n", | ||
" \"required\": 1,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" \"CLASSIFICATION_JOB_AT_ROUND_LEVEL\": {\n", | ||
" \"content\": {\n", | ||
" \"categories\": {\n", | ||
" \"BOTH_ARE_GOOD\": {\"children\": [], \"name\": \"Both are good\", \"id\": \"category17\"},\n", | ||
" \"BOTH_ARE_BAD\": {\"children\": [], \"name\": \"Both are bad\", \"id\": \"category18\"},\n", | ||
" },\n", | ||
" \"input\": \"radio\",\n", | ||
" },\n", | ||
" \"instruction\": \"Overall quality\",\n", | ||
" \"level\": \"round\",\n", | ||
" \"mlTask\": \"CLASSIFICATION\",\n", | ||
" \"required\": 0,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" \"CLASSIFICATION_JOB_AT_CONVERSATION_LEVEL\": {\n", | ||
" \"content\": {\n", | ||
" \"categories\": {\n", | ||
" \"GLOBAL_GOOD\": {\"children\": [], \"name\": \"Globally good\", \"id\": \"category19\"},\n", | ||
" \"GLOBAL_BAD\": {\"children\": [], \"name\": \"Globally bad\", \"id\": \"category20\"},\n", | ||
" },\n", | ||
" \"input\": \"radio\",\n", | ||
" },\n", | ||
" \"instruction\": \"Global\",\n", | ||
" \"level\": \"conversation\",\n", | ||
" \"mlTask\": \"CLASSIFICATION\",\n", | ||
" \"required\": 0,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" \"TRANSCRIPTION_JOB_AT_CONVERSATION_LEVEL\": {\n", | ||
" \"content\": {\"input\": \"textField\"},\n", | ||
" \"instruction\": \"Additional comments...\",\n", | ||
" \"level\": \"conversation\",\n", | ||
" \"mlTask\": \"TRANSCRIPTION\",\n", | ||
" \"required\": 0,\n", | ||
" \"isChild\": False,\n", | ||
" \"isNew\": False,\n", | ||
" },\n", | ||
" }\n", | ||
"}" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now we create the project with the `create_project` method, setting `input_type` to `LLM_STATIC` and passing the interface defined above as `json_interface`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
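"from kili.client import Kili\n", | ||
"\n", | ||
"# With no api_key argument, the client reads the key from the KILI_API_KEY environment variable.\n", | ||
"kili = Kili(\n", | ||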
" # api_endpoint=\"https://cloud.kili-technology.com/api/label/v2/graphql\",\n", | ||
")\n", | ||
"project = kili.create_project(\n", | ||
" title=\"[Kili SDK Notebook]: LLM Static\",\n", | ||
" description=\"Project Description\",\n", | ||
" input_type=\"LLM_STATIC\",\n", | ||
" json_interface=interface,\n", | ||
")\n", | ||
"project_id = project[\"id\"]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Importing Conversations" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We will import three conversations into the project. The conversations are stored in a JSON file, which we download and then import using the `import_conversations` method.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
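"import requests\n", | ||
"\n", | ||
"# Download the demo conversations (a JSON file in the Kili LLM format).\n", | ||
"response = requests.get(\n", | ||
"    \"https://storage.googleapis.com/label-public-staging/demo-projects/LLM_static/llm-conversations.json\",\n", | ||
"    timeout=30,\n", | ||
")\n", | ||
"response.raise_for_status()  # stop here if the download failed\n", | ||
"conversations = response.json()\n", | ||
"kili.llm.import_conversations(project_id, conversations=conversations)" | ||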
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"You can now see the imported conversations in the UI:\n", | ||
"\n", | ||
"![Imported LLM conversations](./img/llm_conversations.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In this tutorial, we've:\n", | ||
"\n", | ||
"- **Created a Kili project** with a custom interface for LLM output comparison.\n", | ||
"- **Imported conversations** using the Kili LLM format.\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |
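
The `interface` dictionary in the notebook above is long, and a copy-paste slip in a job or category is easy to miss. As a minimal sketch (not part of the Kili SDK), the helpers below group jobs by their `level` field and list any category or option `id` that appears more than once. The function names `jobs_by_level` and `duplicate_choice_ids` and the trimmed `sample_interface` are hypothetical, introduced only for illustration; whether Kili itself requires unique ids is not stated in the notebook.

```python
from collections import Counter, defaultdict


def jobs_by_level(interface):
    """Group job names by their "level" field; jobs without one go under None."""
    grouped = defaultdict(list)
    for job_name, job in interface["jobs"].items():
        grouped[job.get("level")].append(job_name)
    return dict(grouped)


def duplicate_choice_ids(interface):
    """Return category/option ids that appear more than once across all jobs."""
    counts = Counter()
    for job in interface["jobs"].values():
        content = job.get("content", {})
        # Classification jobs store choices under "categories";
        # the comparison job stores them under "options".
        for group in ("categories", "options"):
            for choice in content.get(group, {}).values():
                counts[choice["id"]] += 1
    return sorted(choice_id for choice_id, n in counts.items() if n > 1)


# Hypothetical, trimmed-down stand-in for the notebook's `interface` (two jobs only).
sample_interface = {
    "jobs": {
        "CLASSIFICATION_JOB_AT_ROUND_LEVEL": {
            "content": {
                "categories": {
                    "BOTH_ARE_GOOD": {"children": [], "name": "Both are good", "id": "category17"},
                    "BOTH_ARE_BAD": {"children": [], "name": "Both are bad", "id": "category18"},
                },
                "input": "radio",
            },
            "level": "round",
            "mlTask": "CLASSIFICATION",
        },
        "COMPARISON_JOB": {
            "content": {
                "options": {
                    "TIE": {"children": [], "name": "Tie", "mutual": True, "id": "option16"},
                },
                "input": "radio",
            },
            "mlTask": "COMPARISON",
        },
    }
}

print(jobs_by_level(sample_interface))
print(duplicate_choice_ids(sample_interface))
```

Running the same two functions on the full `interface` before calling `create_project` gives a quick overview of which jobs sit at the completion, round, and conversation levels. The comparison job has no `level` key in the notebook, so it is grouped under `None`.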