feat(LAB-3307): add tutorial

kili-technology · Jan 17, 2025 · b5d8c70
1 parent c0137cf
commit b5d8c70
Showing 9 changed files with 527 additions and 9 deletions.
diff --git a/docs/sdk/tutorials/llm_project_setup.md b/docs/sdk/tutorials/llm_dynamic.md
@@ -1,5 +1,5 @@
 <!-- FILE AUTO GENERATED BY docs/utils.py DO NOT EDIT DIRECTLY -->
-<a href="https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_project_setup.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+<a href="https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_dynamic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
 
 # How to Set Up a Kili Project with a LLM Model and Create a Conversation
 
@@ -71,7 +71,7 @@ kili = Kili(
     # api_endpoint="https://cloud.kili-technology.com/api/label/v2/graphql",
 )
 project = kili.create_project(
-    title="[Kili SDK Notebook]: LLM Project",
+    title="[Kili SDK Notebook]: LLM Dynamic",
     description="Project Description",
     input_type="LLM_INSTR_FOLLOWING",
     json_interface=interface,

diff --git a/docs/sdk/tutorials/llm_static.md b/docs/sdk/tutorials/llm_static.md
diff --git a/docs/tutorials.md b/docs/tutorials.md
@@ -73,7 +73,9 @@ Webhooks are really similar to plugins, except they are self-hosted, and require
 
 ## LLM
 
-[This tutorial](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/llm_project_setup/) will show you how to set up a Kili project that uses a Large Language Model (LLM), create and associate the LLM model with the project, and initiate a conversation using the Kili Python SDK.
+[This tutorial](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/llm_static/) explains how to import conversations into a Kili project to annotate responses generated by a Large Language Model (LLM).
+
+[This tutorial](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/llm_dynamic/) guides you through setting up a Kili project with an integrated LLM. You'll learn how to create and link the LLM model to the project and initiate a conversation using the Kili SDK.
 
 
 ## Integrations

diff --git a/mkdocs.yml b/mkdocs.yml
@@ -60,7 +60,9 @@ nav:
       - Exporting Project Data:
           - Exporting a Project: sdk/tutorials/export_a_kili_project.md
           - Parsing Labels: sdk/tutorials/label_parsing.md
-      - LLM Projects: sdk/tutorials/llm_project_setup.md
+      - LLM Projects:
+          - Importing Conversations: sdk/tutorials/llm_static.md
+          - Model Configuration: sdk/tutorials/llm_dynamic.md
       - Setting Up Plugins:
           - Developing Plugins: sdk/tutorials/plugins_development.md
           - Plugin Example - Programmatic QA: sdk/tutorials/plugins_example.md

diff --git a/recipes/img/llm_conversations.png b/recipes/img/llm_conversations.png
diff --git a/recipes/llm_project_setup.ipynb b/recipes/llm_dynamic.ipynb
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_project_setup.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+    "<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_dynamic.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
    ]
   },
   {
@@ -110,7 +110,7 @@
     "    # api_endpoint=\"https://cloud.kili-technology.com/api/label/v2/graphql\",\n",
     ")\n",
     "project = kili.create_project(\n",
-    "    title=\"[Kili SDK Notebook]: LLM Project\",\n",
+    "    title=\"[Kili SDK Notebook]: LLM Dynamic\",\n",
     "    description=\"Project Description\",\n",
     "    input_type=\"LLM_INSTR_FOLLOWING\",\n",
     "    json_interface=interface,\n",

diff --git a/recipes/llm_static.ipynb b/recipes/llm_static.ipynb
@@ -0,0 +1,294 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/llm_static.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# How to Set Up a Kili LLM Static Project"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this tutorial, you'll learn how to create a Kili project with a custom interface for comparing LLM outputs and import conversations into it.\n",
+    "\n",
+    "\n",
+    "Here are the steps we will follow:\n",
+    "\n",
+    "1. Creating a Kili project with a custom interface\n",
+    "2. Importing three conversations into the project"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Creating a Kili Project with a Custom Interface"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will create a Kili project with a custom interface that includes several jobs for comparing LLM outputs.\n",
+    "\n",
+    "### Defining Three Levels of Annotation Jobs\n",
+    "\n",
+    "To streamline the annotation process, we define three distinct levels of annotation jobs:\n",
+    "\n",
+    "- **Completion:** This job enables annotators to evaluate individual responses generated by LLMs. Each response is annotated separately.\n",
+    "\n",
+    "- **Round:** This job allows annotators to assess a single round of conversation, grouping all the LLM responses within that round under a single annotation.\n",
+    "\n",
+    "- **Conversation:** This job facilitates annotation at the conversation level, where the entire exchange can be evaluated as a whole.\n",
+    "\n",
+    "In this example, we use a JSON interface that incorporates classifications at all these levels, enabling comprehensive annotation:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "interface = {\n",
+    "    \"jobs\": {\n",
+    "        \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL\": {\n",
+    "            \"content\": {\n",
+    "                \"categories\": {\n",
+    "                    \"TOO_SHORT\": {\"children\": [], \"name\": \"Too short\", \"id\": \"category1\"},\n",
+    "                    \"JUST_RIGHT\": {\"children\": [], \"name\": \"Just right\", \"id\": \"category2\"},\n",
+    "                    \"TOO_VERBOSE\": {\"children\": [], \"name\": \"Too verbose\", \"id\": \"category3\"},\n",
+    "                },\n",
+    "                \"input\": \"radio\",\n",
+    "            },\n",
+    "            \"instruction\": \"Verbosity\",\n",
+    "            \"level\": \"completion\",\n",
+    "            \"mlTask\": \"CLASSIFICATION\",\n",
+    "            \"required\": 0,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "        \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_1\": {\n",
+    "            \"content\": {\n",
+    "                \"categories\": {\n",
+    "                    \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category4\"},\n",
+    "                    \"MINOR_ISSUES\": {\"children\": [], \"name\": \"Minor issue(s)\", \"id\": \"category5\"},\n",
+    "                    \"MAJOR_ISSUES\": {\"children\": [], \"name\": \"Major issue(s)\", \"id\": \"category6\"},\n",
+    "                },\n",
+    "                \"input\": \"radio\",\n",
+    "            },\n",
+    "            \"instruction\": \"Instructions Following\",\n",
+    "            \"level\": \"completion\",\n",
+    "            \"mlTask\": \"CLASSIFICATION\",\n",
+    "            \"required\": 0,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "        \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_2\": {\n",
+    "            \"content\": {\n",
+    "                \"categories\": {\n",
+    "                    \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category7\"},\n",
+    "                    \"MINOR_INACCURACY\": {\n",
+    "                        \"children\": [],\n",
+    "                        \"name\": \"Minor inaccuracy\",\n",
+    "                        \"id\": \"category8\",\n",
+    "                    },\n",
+    "                    \"MAJOR_INACCURACY\": {\n",
+    "                        \"children\": [],\n",
+    "                        \"name\": \"Major inaccuracy\",\n",
+    "                        \"id\": \"category9\",\n",
+    "                    },\n",
+    "                },\n",
+    "                \"input\": \"radio\",\n",
+    "            },\n",
+    "            \"instruction\": \"Truthfulness\",\n",
+    "            \"level\": \"completion\",\n",
+    "            \"mlTask\": \"CLASSIFICATION\",\n",
+    "            \"required\": 0,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "        \"CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_3\": {\n",
+    "            \"content\": {\n",
+    "                \"categories\": {\n",
+    "                    \"NO_ISSUES\": {\"children\": [], \"name\": \"No issues\", \"id\": \"category10\"},\n",
+    "                    \"MINOR_SAFETY_CONCERN\": {\n",
+    "                        \"children\": [],\n",
+    "                        \"name\": \"Minor safety concern\",\n",
+    "                        \"id\": \"category11\",\n",
+    "                    },\n",
+    "                    \"MAJOR_SAFETY_CONCERN\": {\n",
+    "                        \"children\": [],\n",
+    "                        \"name\": \"Major safety concern\",\n",
+    "                        \"id\": \"category12\",\n",
+    "                    },\n",
+    "                },\n",
+    "                \"input\": \"radio\",\n",
+    "            },\n",
+    "            \"instruction\": \"Harmlessness/Safety\",\n",
+    "            \"level\": \"completion\",\n",
+    "            \"mlTask\": \"CLASSIFICATION\",\n",
+    "            \"required\": 0,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "        \"COMPARISON_JOB\": {\n",
+    "            \"content\": {\n",
+    "                \"options\": {\n",
+    "                    \"IS_MUCH_BETTER\": {\"children\": [], \"name\": \"Is much better\", \"id\": \"option13\"},\n",
+    "                    \"IS_BETTER\": {\"children\": [], \"name\": \"Is better\", \"id\": \"option14\"},\n",
+    "                    \"IS_SLIGHTLY_BETTER\": {\n",
+    "                        \"children\": [],\n",
+    "                        \"name\": \"Is slightly better\",\n",
+    "                        \"id\": \"option15\",\n",
+    "                    },\n",
+    "                    \"TIE\": {\"children\": [], \"name\": \"Tie\", \"mutual\": True, \"id\": \"option16\"},\n",
+    "                },\n",
+    "                \"input\": \"radio\",\n",
+    "            },\n",
+    "            \"instruction\": \"Pick the best answer\",\n",
+    "            \"mlTask\": \"COMPARISON\",\n",
+    "            \"required\": 1,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "        \"CLASSIFICATION_JOB_AT_ROUND_LEVEL\": {\n",
+    "            \"content\": {\n",
+    "                \"categories\": {\n",
+    "                    \"BOTH_ARE_GOOD\": {\"children\": [], \"name\": \"Both are good\", \"id\": \"category17\"},\n",
+    "                    \"BOTH_ARE_BAD\": {\"children\": [], \"name\": \"Both are bad\", \"id\": \"category18\"},\n",
+    "                },\n",
+    "                \"input\": \"radio\",\n",
+    "            },\n",
+    "            \"instruction\": \"Overall quality\",\n",
+    "            \"level\": \"round\",\n",
+    "            \"mlTask\": \"CLASSIFICATION\",\n",
+    "            \"required\": 0,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "        \"CLASSIFICATION_JOB_AT_CONVERSATION_LEVEL\": {\n",
+    "            \"content\": {\n",
+    "                \"categories\": {\n",
+    "                    \"GLOBAL_GOOD\": {\"children\": [], \"name\": \"Globally good\", \"id\": \"category19\"},\n",
+    "                    \"GLOBAL_BAD\": {\"children\": [], \"name\": \"Globally bad\", \"id\": \"category20\"},\n",
+    "                },\n",
+    "                \"input\": \"radio\",\n",
+    "            },\n",
+    "            \"instruction\": \"Global\",\n",
+    "            \"level\": \"conversation\",\n",
+    "            \"mlTask\": \"CLASSIFICATION\",\n",
+    "            \"required\": 0,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "        \"TRANSCRIPTION_JOB_AT_CONVERSATION_LEVEL\": {\n",
+    "            \"content\": {\"input\": \"textField\"},\n",
+    "            \"instruction\": \"Additional comments...\",\n",
+    "            \"level\": \"conversation\",\n",
+    "            \"mlTask\": \"TRANSCRIPTION\",\n",
+    "            \"required\": 0,\n",
+    "            \"isChild\": False,\n",
+    "            \"isNew\": False,\n",
+    "        },\n",
+    "    }\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, we create the project using the `create_project` method, with the input type `LLM_STATIC`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from kili.client import Kili\n",
+    "\n",
+    "kili = Kili(\n",
+    "    # api_endpoint=\"https://cloud.kili-technology.com/api/label/v2/graphql\",\n",
+    ")\n",
+    "project = kili.create_project(\n",
+    "    title=\"[Kili SDK Notebook]: LLM Static\",\n",
+    "    description=\"Project Description\",\n",
+    "    input_type=\"LLM_STATIC\",\n",
+    "    json_interface=interface,\n",
+    ")\n",
+    "project_id = project[\"id\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Importing Conversations"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will import three conversations into the project. The conversations are stored in a JSON file, which we will load and import using the `import_conversations` method.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import requests\n",
+    "\n",
+    "conversations = requests.get(\n",
+    "    \"https://storage.googleapis.com/label-public-staging/demo-projects/LLM_static/llm-conversations.json\"\n",
+    ").json()\n",
+    "kili.llm.import_conversations(project_id, conversations=conversations)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can now see the imported conversations in the UI:\n",
+    "\n",
+    "![LLM conversations](./img/llm_conversations.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this tutorial, we've:\n",
+    "\n",
+    "- **Created a Kili project** with a custom interface for LLM output comparison.\n",
+    "- **Imported conversations** using the Kili LLM format.\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
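Note on the interface added in recipes/llm_static.ipynb: every job declares a `level` (`completion`, `round`, or `conversation`) except `COMPARISON_JOB`, which declares none. The sketch below groups job names by level to make that layout easy to inspect. The helper `jobs_by_level` is hypothetical and not part of the Kili SDK, and treating a job without a `level` as round-level is an assumption based on `DEFAULT_JOB_LEVEL = JobLevel.ROUND` in the export module touched by this commit.

```python
from collections import defaultdict
from typing import Dict, List

# Assumption: jobs without an explicit "level" are treated as round-level,
# mirroring DEFAULT_JOB_LEVEL = JobLevel.ROUND in the export module.
DEFAULT_LEVEL = "round"


def jobs_by_level(interface: Dict) -> Dict[str, List[str]]:
    """Group the job names of a JSON interface by their annotation level."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for job_name, job in interface.get("jobs", {}).items():
        grouped[job.get("level", DEFAULT_LEVEL)].append(job_name)
    return dict(grouped)


# Trimmed-down stand-in for the notebook's interface (illustrative only).
sample_interface = {
    "jobs": {
        "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL": {"level": "completion"},
        "COMPARISON_JOB": {"mlTask": "COMPARISON"},
        "CLASSIFICATION_JOB_AT_ROUND_LEVEL": {"level": "round"},
        "CLASSIFICATION_JOB_AT_CONVERSATION_LEVEL": {"level": "conversation"},
    }
}

levels = jobs_by_level(sample_interface)
print(levels)
```

Passing the notebook's full `interface` dict instead of `sample_interface` should list the four completion-level classification jobs together, which is a quick way to check that each job sits at the intended level before creating the project.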
diff --git a/src/kili/llm/services/export/export_llm_static_or_dynamic.py b/src/kili/llm/services/export/export_llm_static_or_dynamic.py
@@ -1,4 +1,4 @@
-from typing import Dict, List, cast
+from typing import Dict, List, Optional, cast
 
 from kili.adapters.kili_api_gateway.kili_api_gateway import KiliAPIGateway
 from kili.domain.llm import ChatItem, ChatItemRole, Conversation, ConversationLabel
@@ -36,7 +36,7 @@ class JobLevel:
 DEFAULT_JOB_LEVEL = JobLevel.ROUND
 
 
-def get_model_name(model_id: str, project_models: List[Dict]) -> str:
+def get_model_name(model_id: Optional[str], project_models: List[Dict]) -> Optional[str]:
     try:
         return next(
             model["configuration"]["model"] for model in project_models if model["id"] == model_id
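Note: the hunk above is cut off before the `except` branch, so the fallback behavior is not visible here. The following is a minimal sketch of how the widened signature might behave, assuming the function returns `None` when no model matches. That would be consistent with the new `Optional[str]` return type, but it is not confirmed by this diff. The name `get_model_name_sketch` and the sample data are hypothetical.

```python
from typing import Dict, List, Optional


def get_model_name_sketch(
    model_id: Optional[str], project_models: List[Dict]
) -> Optional[str]:
    """Return the configured model name for model_id, or None if absent.

    Assumption: a missing or unknown model_id yields None; the real
    fallback lives in the part of the function not shown in the diff.
    """
    return next(
        (m["configuration"]["model"] for m in project_models if m["id"] == model_id),
        None,
    )


# Hypothetical project models, shaped like the lookups in the diff.
project_models = [
    {"id": "model-1", "configuration": {"model": "model-a"}},
    {"id": "model-2", "configuration": {"model": "model-b"}},
]

print(get_model_name_sketch("model-2", project_models))
print(get_model_name_sketch(None, project_models))
```

Using `next` with a default avoids the `try`/`except StopIteration` pattern; this is a stylistic alternative for illustration, not a description of the committed code.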

diff --git a/tests/e2e/test_notebooks.py b/tests/e2e/test_notebooks.py
@@ -45,7 +45,8 @@ def process_notebook(notebook_filename: str) -> None:
         "recipes/importing_video_assets.ipynb",
         "recipes/inference_labels.ipynb",
         "recipes/label_parsing.ipynb",
-        "recipes/llm_project_setup.ipynb",
+        "recipes/llm_static.ipynb",
+        "recipes/llm_dynamic.ipynb",
         "recipes/medical_imaging.ipynb",
         # "recipes/ner_pre_annotations_openai.ipynb",
         "recipes/ocr_pre_annotations.ipynb",