feat(LAB-3307): add tutorial
paulruelle committed Jan 17, 2025
1 parent c0137cf commit b5d8c70
Showing 9 changed files with 527 additions and 9 deletions.
@@ -1,5 +1,5 @@
<!-- FILE AUTO GENERATED BY docs/utils.py DO NOT EDIT DIRECTLY -->
<a href="https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_project_setup.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
<a href="https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_dynamic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How to Set Up a Kili Project with an LLM Model and Create a Conversation

@@ -71,7 +71,7 @@ kili = Kili(
# api_endpoint="https://cloud.kili-technology.com/api/label/v2/graphql",
)
project = kili.create_project(
title="[Kili SDK Notebook]: LLM Project",
title="[Kili SDK Notebook]: LLM Dynamic",
description="Project Description",
input_type="LLM_INSTR_FOLLOWING",
json_interface=interface,
219 changes: 219 additions & 0 deletions docs/sdk/tutorials/llm_static.md


4 changes: 3 additions & 1 deletion docs/tutorials.md
@@ -73,7 +73,9 @@ Webhooks are really similar to plugins, except they are self-hosted, and require

## LLM

[This tutorial](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/llm_project_setup/) will show you how to set up a Kili project that uses a Large Language Model (LLM), create and associate the LLM model with the project, and initiate a conversation using the Kili Python SDK.
[This tutorial](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/llm_static/) explains how to import conversations into a Kili project to annotate responses generated by a Large Language Model (LLM).

[This tutorial](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/llm_dynamic/) guides you through setting up a Kili project with an integrated LLM. You'll learn how to create and link the LLM model to the project and initiate a conversation using the Kili SDK.
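The dynamic-LLM setup this tutorial describes reduces to a single `create_project` call. A hedged sketch of the arguments, mirroring the call visible elsewhere in this commit (the `json_interface` here is a placeholder, not a working annotation interface):

```python
# Hedged sketch: kwargs for creating a dynamic-LLM project, mirroring the
# create_project call visible elsewhere in this commit. The interface below
# is a placeholder; a real project defines annotation jobs inside "jobs".
create_project_kwargs = {
    "title": "[Kili SDK Notebook]: LLM Dynamic",
    "description": "Project Description",
    "input_type": "LLM_INSTR_FOLLOWING",
    "json_interface": {"jobs": {}},
}

# With a Kili API key configured, the project would then be created with:
# from kili.client import Kili
# project = Kili().create_project(**create_project_kwargs)
```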


## Integrations
4 changes: 3 additions & 1 deletion mkdocs.yml
@@ -60,7 +60,9 @@ nav:
- Exporting Project Data:
- Exporting a Project: sdk/tutorials/export_a_kili_project.md
- Parsing Labels: sdk/tutorials/label_parsing.md
- LLM Projects: sdk/tutorials/llm_project_setup.md
- LLM Projects:
- Importing Conversations: sdk/tutorials/llm_static.md
- Model Configuration: sdk/tutorials/llm_dynamic.md
- Setting Up Plugins:
- Developing Plugins: sdk/tutorials/plugins_development.md
- Plugin Example - Programmatic QA: sdk/tutorials/plugins_example.md
Binary file added recipes/img/llm_conversations.png
4 changes: 2 additions & 2 deletions recipes/llm_project_setup.ipynb → recipes/llm_dynamic.ipynb
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_project_setup.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
"<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_dynamic.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
@@ -110,7 +110,7 @@
" # api_endpoint=\"https://cloud.kili-technology.com/api/label/v2/graphql\",\n",
")\n",
"project = kili.create_project(\n",
" title=\"[Kili SDK Notebook]: LLM Project\",\n",
" title=\"[Kili SDK Notebook]: LLM Dynamic\",\n",
" description=\"Project Description\",\n",
" input_type=\"LLM_INSTR_FOLLOWING\",\n",
" json_interface=interface,\n",
294 changes: 294 additions & 0 deletions recipes/llm_static.ipynb
@@ -0,0 +1,294 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_static.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to Set Up a Kili LLM Static Project"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this tutorial, you'll learn how to create a Kili project with a custom interface for comparing LLM outputs and import conversations into it.\n",
"\n",
"\n",
"Here are the steps we will follow:\n",
"\n",
"1. Creating a Kili project with a custom interface\n",
"2. Importing three conversations into the project"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating a Kili Project with a Custom Interface"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will create a Kili project with a custom interface that includes several jobs for comparing LLM outputs.\n",
"\n",
"### Defining Three Levels of Annotation Jobs\n",
"\n",
"To streamline the annotation process, we define three distinct levels of annotation jobs:\n",
"\n",
"- **Completion:** This job enables annotators to evaluate individual responses generated by LLMs. Each response is annotated separately.\n",
"\n",
"- **Round:** This job allows annotators to assess a single round of conversation, grouping all the LLM responses within that round under a single annotation.\n",
"\n",
"- **Conversation:** This job facilitates annotation at the conversation level, where the entire exchange can be evaluated as a whole.\n",
"\n",
"In this example, we use a JSON interface that incorporates classifications at all these levels, enabling comprehensive annotation:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"interface = {\n",
" \"jobs\": {\n",
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL\": {\n",
" \"content\": {\n",
" \"categories\": {\n",
" \"TOO_SHORT\": {\"children\": [], \"name\": \"Too short\", \"id\": \"category1\"},\n",
" \"JUST_RIGHT\": {\"children\": [], \"name\": \"Just right\", \"id\": \"category2\"},\n",
" \"TOO_VERBOSE\": {\"children\": [], \"name\": \"Too verbose\", \"id\": \"category3\"},\n",
" },\n",
" \"input\": \"radio\",\n",
" },\n",
" \"instruction\": \"Verbosity\",\n",
" \"level\": \"completion\",\n",
" \"mlTask\": \"CLASSIFICATION\",\n",
" \"required\": 0,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_1\": {\n",
" \"content\": {\n",
" \"categories\": {\n",
" \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category4\"},\n",
" \"MINOR_ISSUES\": {\"children\": [], \"name\": \"Minor issue(s)\", \"id\": \"category5\"},\n",
" \"MAJOR_ISSUES\": {\"children\": [], \"name\": \"Major issue(s)\", \"id\": \"category6\"},\n",
" },\n",
" \"input\": \"radio\",\n",
" },\n",
" \"instruction\": \"Instructions Following\",\n",
" \"level\": \"completion\",\n",
" \"mlTask\": \"CLASSIFICATION\",\n",
" \"required\": 0,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_2\": {\n",
" \"content\": {\n",
" \"categories\": {\n",
" \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category7\"},\n",
" \"MINOR_INACCURACY\": {\n",
" \"children\": [],\n",
" \"name\": \"Minor inaccuracy\",\n",
" \"id\": \"category8\",\n",
" },\n",
" \"MAJOR_INACCURACY\": {\n",
" \"children\": [],\n",
" \"name\": \"Major inaccuracy\",\n",
" \"id\": \"category9\",\n",
" },\n",
" },\n",
" \"input\": \"radio\",\n",
" },\n",
" \"instruction\": \"Truthfulness\",\n",
" \"level\": \"completion\",\n",
" \"mlTask\": \"CLASSIFICATION\",\n",
" \"required\": 0,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_3\": {\n",
" \"content\": {\n",
" \"categories\": {\n",
" \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category10\"},\n",
" \"MINOR_SAFETY_CONCERN\": {\n",
" \"children\": [],\n",
" \"name\": \"Minor safety concern\",\n",
" \"id\": \"category11\",\n",
" },\n",
" \"MAJOR_SAFETY_CONCERN\": {\n",
" \"children\": [],\n",
" \"name\": \"Major safety concern\",\n",
" \"id\": \"category12\",\n",
" },\n",
" },\n",
" \"input\": \"radio\",\n",
" },\n",
" \"instruction\": \"Harmlessness/Safety\",\n",
" \"level\": \"completion\",\n",
" \"mlTask\": \"CLASSIFICATION\",\n",
" \"required\": 0,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" \"COMPARISON_JOB\": {\n",
" \"content\": {\n",
" \"options\": {\n",
" \"IS_MUCH_BETTER\": {\"children\": [], \"name\": \"Is much better\", \"id\": \"option13\"},\n",
" \"IS_BETTER\": {\"children\": [], \"name\": \"Is better\", \"id\": \"option14\"},\n",
" \"IS_SLIGHTLY_BETTER\": {\n",
" \"children\": [],\n",
" \"name\": \"Is slightly better\",\n",
" \"id\": \"option15\",\n",
" },\n",
" \"TIE\": {\"children\": [], \"name\": \"Tie\", \"mutual\": True, \"id\": \"option16\"},\n",
" },\n",
" \"input\": \"radio\",\n",
" },\n",
" \"instruction\": \"Pick the best answer\",\n",
" \"mlTask\": \"COMPARISON\",\n",
" \"required\": 1,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" \"CLASSIFICATION_JOB_AT_ROUND_LEVEL\": {\n",
" \"content\": {\n",
" \"categories\": {\n",
" \"BOTH_ARE_GOOD\": {\"children\": [], \"name\": \"Both are good\", \"id\": \"category17\"},\n",
" \"BOTH_ARE_BAD\": {\"children\": [], \"name\": \"Both are bad\", \"id\": \"category18\"},\n",
" },\n",
" \"input\": \"radio\",\n",
" },\n",
" \"instruction\": \"Overall quality\",\n",
" \"level\": \"round\",\n",
" \"mlTask\": \"CLASSIFICATION\",\n",
" \"required\": 0,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" \"CLASSIFICATION_JOB_AT_CONVERSATION_LEVEL\": {\n",
" \"content\": {\n",
" \"categories\": {\n",
" \"GLOBAL_GOOD\": {\"children\": [], \"name\": \"Globally good\", \"id\": \"category19\"},\n",
" \"BOTH_ARE_BAD\": {\"children\": [], \"name\": \"Globally bad\", \"id\": \"category20\"},\n",
" },\n",
" \"input\": \"radio\",\n",
" },\n",
" \"instruction\": \"Global\",\n",
" \"level\": \"conversation\",\n",
" \"mlTask\": \"CLASSIFICATION\",\n",
" \"required\": 0,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" \"TRANSCRIPTION_JOB_AT_CONVERSATION_LEVEL\": {\n",
" \"content\": {\"input\": \"textField\"},\n",
" \"instruction\": \"Additional comments...\",\n",
" \"level\": \"conversation\",\n",
" \"mlTask\": \"TRANSCRIPTION\",\n",
" \"required\": 0,\n",
" \"isChild\": False,\n",
" \"isNew\": False,\n",
" },\n",
" }\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we create the project using the `create_project` method, with type `LLM_STATIC`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from kili.client import Kili\n",
"\n",
"kili = Kili(\n",
" # api_endpoint=\"https://cloud.kili-technology.com/api/label/v2/graphql\",\n",
")\n",
"project = kili.create_project(\n",
" title=\"[Kili SDK Notebook]: LLM Static\",\n",
" description=\"Project Description\",\n",
" input_type=\"LLM_STATIC\",\n",
" json_interface=interface,\n",
")\n",
"project_id = project[\"id\"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Importing Conversations"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will import three conversations into the project. The conversations are stored in a JSON file, which we will load and import using the `import_conversations` method.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"conversations = requests.get(\n",
" \"https://storage.googleapis.com/label-public-staging/demo-projects/LLM_static/llm-conversations.json\"\n",
").json()\n",
"kili.llm.import_conversations(project_id, conversations=conversations)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can now see the imported conversations in the UI:\n",
"\n",
"![Model Integration](./img/llm_conversations.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this tutorial, we've:\n",
"\n",
"- **Created a Kili project** with a custom interface for LLM output comparison.\n",
"- **Imported conversations** using the Kili LLM format.\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
4 changes: 2 additions & 2 deletions src/kili/llm/services/export/export_llm_static_or_dynamic.py
@@ -1,4 +1,4 @@
from typing import Dict, List, cast
from typing import Dict, List, Optional, cast

from kili.adapters.kili_api_gateway.kili_api_gateway import KiliAPIGateway
from kili.domain.llm import ChatItem, ChatItemRole, Conversation, ConversationLabel
@@ -36,7 +36,7 @@ class JobLevel:
DEFAULT_JOB_LEVEL = JobLevel.ROUND


def get_model_name(model_id: str, project_models: List[Dict]) -> str:
def get_model_name(model_id: Optional[str], project_models: List[Dict]) -> Optional[str]:
try:
return next(
model["configuration"]["model"] for model in project_models if model["id"] == model_id
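The hunk above shows the signature change but truncates the function body. A minimal sketch of what the updated function plausibly looks like, assuming it falls back to `None` when no project model matches (an assumption consistent with the new `Optional[str]` return type; the actual fallback branch is not shown in this diff):

```python
from typing import Dict, List, Optional


def get_model_name(model_id: Optional[str], project_models: List[Dict]) -> Optional[str]:
    """Return the configured model name for `model_id`, or None if absent."""
    try:
        # First project model whose id matches; raises StopIteration if none does.
        return next(
            model["configuration"]["model"]
            for model in project_models
            if model["id"] == model_id
        )
    except StopIteration:  # assumed fallback; this branch is truncated in the diff
        return None
```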
3 changes: 2 additions & 1 deletion tests/e2e/test_notebooks.py
@@ -45,7 +45,8 @@ def process_notebook(notebook_filename: str) -> None:
"recipes/importing_video_assets.ipynb",
"recipes/inference_labels.ipynb",
"recipes/label_parsing.ipynb",
"recipes/llm_project_setup.ipynb",
"recipes/llm_static.ipynb",
"recipes/llm_dynamic.ipynb",
"recipes/medical_imaging.ipynb",
# "recipes/ner_pre_annotations_openai.ipynb",
"recipes/ocr_pre_annotations.ipynb",
