
Commit

Updated changelog (#652)
jverre authored Nov 17, 2024
1 parent e1949cd commit 55313db
Showing 7 changed files with 939 additions and 40 deletions.
33 changes: 31 additions & 2 deletions apps/opik-documentation/documentation/docs/changelog.md
@@ -9,10 +9,39 @@ sidebar_label: Changelog

**Opik Dashboard**:

- Group experiments by datasets in the Experiment page.
- Introduce experiment summary charts in the Experiment page:
- Added the option to sort the projects table by `Last updated`, `Created at` and `Name` columns.
- Updated the logic for displaying images: instead of relying on the format of the response, we now use regex rules to detect whether the trace or span input includes a base64 encoded image or URL.
- Improved performance of the Traces table by truncating trace inputs and outputs if they contain base64 encoded images.
- Fixed some issues with rendering trace inputs and outputs in YAML format.
- Added grouping and charts to the experiments page:
![experiment summary](/img/changelog/2024-11-11/experiment_summary.png)

**SDK**:

- **New integration**: Anthropic

```python
from anthropic import Anthropic, AsyncAnthropic
from opik.integrations.anthropic import track_anthropic

client = Anthropic()
client = track_anthropic(client, project_name="anthropic-example")

message = client.messages.create(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Tell a fact",
        }
    ],
    model="claude-3-opus-20240229",
)
print(message)
```

- Added a new `evaluate_experiment` method in the SDK that can be used to re-score an existing experiment; learn more in the [Update experiments](/evaluation/update_existing_experiment.md) guide.
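  A minimal sketch, assuming `evaluate_experiment` is given the name of an existing experiment and a list of scoring metrics (the experiment name and the `Hallucination` metric below are placeholders):

  ```python
  from opik.evaluation import evaluate_experiment
  from opik.evaluation.metrics import Hallucination

  # Re-score an existing experiment with an additional metric.
  # "My experiment" must be replaced with the name of an experiment
  # that has already been created.
  evaluate_experiment(
      experiment_name="My experiment",
      scoring_metrics=[Hallucination()],
  )
  ```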

## Week of 2024-11-04

**Opik Dashboard**:

Large diffs are not rendered by default.

Large diffs are not rendered by default.

@@ -179,6 +179,34 @@ You can use the `experiment_config` parameter to store information about your ev

## Advanced usage

### Linking prompts to experiments

The [Opik prompt library](/library/prompt_management.md) can be used to version your prompt templates.

When creating an Experiment, you can link the Experiment to a specific prompt version:

```python
import opik
from opik.evaluation import evaluate

# Create a prompt
prompt = opik.Prompt(
    name="My prompt",
    prompt="..."
)

# Run the evaluation and link it to the prompt version created above.
# dataset, evaluation_task and hallucination_metric are defined earlier
# in this guide.
evaluation = evaluate(
    experiment_name="My experiment",
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[hallucination_metric],
    prompt=prompt,
)
```

The experiment will now be linked to the prompt, allowing you to view all experiments that use a specific prompt:
![linked prompt](/img/evaluation/linked_prompt.png)

### Logging traces to a specific project

You can use the `project_name` parameter of the `evaluate` function to log evaluation traces to a specific project:
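A minimal sketch, assuming the `dataset`, `evaluation_task` and `hallucination_metric` objects defined earlier in this guide, with a placeholder project name:

```python
from opik.evaluation import evaluate

# Log the evaluation traces to a dedicated project instead of the default one.
evaluation = evaluate(
    experiment_name="My experiment",
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[hallucination_metric],
    project_name="My project",
)
```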
@@ -207,4 +235,5 @@ evaluation = evaluate(

### Disabling threading

Opik uses multiple background threads to evaluate datasets more efficiently. If this is causing issues, you can disable them by setting `task_threads` and `scoring_threads` to `1`, which makes Opik run all calculations in the main thread.
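
A minimal sketch, assuming both options are passed directly to `evaluate` and reusing the `dataset`, `evaluation_task` and `hallucination_metric` objects from the earlier examples:

```python
from opik.evaluation import evaluate

# Run the evaluation task and the scoring in the main thread only.
evaluation = evaluate(
    experiment_name="My experiment",
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[hallucination_metric],
    task_threads=1,
    scoring_threads=1,
)
```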
@@ -61,3 +61,32 @@ prompt.format(text="Hello, world!")

If you are not using the SDK, you can download a prompt by using the [REST API](/reference/rest_api/retrieve-prompt-version.api.mdx).

### Linking prompts to Experiments

[Experiments](/evaluation/evaluate_your_llm.md) allow you to evaluate the performance of your LLM application on a set of examples. When evaluating
different prompts, it can be useful to link the evaluation to a specific prompt version. This can be achieved by passing the `prompt` parameter when
creating an Experiment:

```python
import opik
from opik.evaluation import evaluate

opik.configure()
client = opik.Opik()

# Create a prompt
prompt = opik.Prompt(name="My prompt", prompt="...")

# Run the evaluation and link it to the prompt version created above.
# dataset, evaluation_task and hallucination_metric are assumed to be
# defined as in the evaluation guide.
evaluation = evaluate(
    experiment_name="My experiment",
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[hallucination_metric],
    prompt=prompt,
)
```

The experiment will now be linked to the prompt, allowing you to view all experiments that use a specific prompt:

![linked prompt](/img/evaluation/linked_prompt.png)
@@ -13,7 +13,7 @@ Opik supports multimodal traces allowing you to track not just the text input an

## Log a trace with an image using OpenAI SDK

As long as your trace input or output follows the OpenAI format, images will automatically be detected and rendered in the Opik UI.
Images logged to a trace, whether as base64 encoded images or as URLs, are displayed in the trace sidebar.

We recommend that you use the [`track_openai`](/python-sdk-reference/integrations/openai/track_openai.html) wrapper to ensure the OpenAI API call is traced correctly:
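
A minimal sketch of such a call, assuming an OpenAI API key is configured; the model name and image URL are placeholders:

```python
from openai import OpenAI
from opik.integrations.openai import track_openai

# Wrap the OpenAI client so the call, including the image input, is logged to Opik.
client = track_openai(OpenAI())

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image.png"},
                },
            ],
        }
    ],
)
print(response.choices[0])
```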

@@ -48,20 +48,11 @@ print(response.choices[0])

## Manually logging images

If you are not using the OpenAI SDK, you can still log images to the platform. The UI will automatically detect the image and display it if the input field has a `message` attribute that follows the OpenAI format:
If you are not using the OpenAI SDK, you can still log images to the platform. The UI will automatically detect images based on regex rules, as long as they are logged as base64 encoded images or as URLs ending with `.png`, `.jpg`, `.jpeg`, `.gif`, `.bmp`, `.webp`:

```json
{
    "messages": [
        ...,
        {
            "type": "image_url",
            "image_url": {
                "url": "<url or base64 encoded image>"
            }
        }
    ],
    ...
    "image": "<url or base64 encoded image>"
}
```
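
For example, a minimal sketch using the low-level client; the trace name, image URL and output are placeholders, and the `image` field in the input follows the shape shown above:

```python
import opik

client = opik.Opik()

# Log a trace whose input contains an image URL; the UI detects it via the
# regex rules described above and renders it in the trace sidebar.
client.trace(
    name="image-trace",
    input={"image": "https://example.com/image.png"},
    output={"description": "A placeholder description"},
)

# Ensure the trace is sent before the script exits.
client.flush()
```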

