Support for Gemini 2 Flash Thinking (#35)
- plus documentation enhancements
krasserm authored Jan 28, 2025
1 parent 5d0e6b4 commit f37add8
Showing 16 changed files with 235 additions and 57 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -94,7 +94,7 @@ https://github.com/user-attachments/assets/83cec179-54dc-456c-b647-ea98ec99600b

## Evaluation

We [evaluated](evaluation) `freeact` using five state-of-the-art models:
We [evaluated](evaluation) `freeact` with these models:

- Claude 3.5 Sonnet (`claude-3-5-sonnet-20241022`)
- Claude 3.5 Haiku (`claude-3-5-haiku-20241022`)
@@ -114,4 +114,4 @@ Interestingly, these results were achieved using zero-shot prompting in `freeact`

## Supported models

In addition to the models we [evaluated](#evaluation), `freeact` also supports the [integration](https://gradion-ai.github.io/freeact/integration/) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.
In addition to all [supported models](https://gradion-ai.github.io/freeact/models/), `freeact` also supports the [integration](https://gradion-ai.github.io/freeact/integration/) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.
37 changes: 27 additions & 10 deletions docs/cli.md
@@ -1,12 +1,14 @@
# Command-line interface
# Command line interface

`freeact` provides a minimalistic command-line interface (CLI) for running agents. It is currently intended for demonstration purposes only. [Install `freeact`](installation.md) and run the following command to see all available options:
`freeact` provides a minimalistic command line interface (CLI) for running agents. It is currently intended for demonstration purposes only. [Install `freeact`](installation.md) and run the following command to see all available options:

```bash
python -m freeact.cli --help
```

or check [quickstart](quickstart.md) and [tutorials](tutorials/index.md) for usage examples.
!!! Tip

    Check [quickstart](quickstart.md), [tutorials](tutorials/index.md) or [supported models](models.md) for usage examples.

## Multiline input

@@ -21,18 +21,23 @@ To submit a multiline message, simply press `Enter`.

## Environment variables

The CLI reads environment variables from a `.env` file in the current directory and passes them to the [execution environment](environment.md#execution-environment). API keys required for an agent's code action model must be either defined in the `.env` file, passed as command-line arguments, or directly set as variables in the shell.
Environment variables may be required for two purposes:

1. Running skill modules in the [execution environment](environment.md#execution-environment), including:
- [Predefined skills](https://gradion-ai.github.io/freeact-skills/)
- [Custom skills](tutorials/skills.md)
2. Running code action models by `freeact` agents

There are three ways to provide these environment variables:

- For skill modules (purpose 1): Variables must be defined in a `.env` file in your current working directory
- For code action models (purpose 2): Variables can be provided through:
- A `.env` file in the current working directory
- Command-line arguments
- Shell environment variables

This is shown by example in the following two subsections.
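The `.env` handling above can be illustrated with a minimal sketch. The CLI itself uses python-dotenv; the parser below is a simplified stand-in that only handles `KEY=value` lines and `#` comments:

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse a minimal `.env` file: `KEY=value` lines and `#` comments.

    Simplified sketch; the freeact CLI relies on python-dotenv, which
    additionally supports quoting, interpolation, and `export` prefixes.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


if __name__ == "__main__":
    content = "# Required for Claude 3.5 Sonnet\nANTHROPIC_API_KEY=sk-ant-example\n"
    print(parse_dotenv(content))
```

Variables parsed this way are then made available to the execution environment and, where applicable, to the code action model.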

### Example 1

The [quickstart](quickstart.md) example requires `ANTHROPIC_API_KEY` and `GOOGLE_API_KEY` to be defined in a `.env` file in the current directory. The `ANTHROPIC_API_KEY` is needed for the `claude-3-5-sonnet-20241022` code action model, while the `GOOGLE_API_KEY` is required for the `freeact_skills.search.google.stream.api` skill in the execution environment. Given a `.env` file with the following content:

```env title=".env"
# Required for Claude 3.5 Sonnet
ANTHROPIC_API_KEY=your-anthropic-api-key
ANTHROPIC_API_KEY=...
# Required for generative Google Search via Gemini 2
GOOGLE_API_KEY=your-google-api-key
GOOGLE_API_KEY=...
```

the following command will launch an agent with `claude-3-5-sonnet-20241022` as code action model configured with a generative Google search skill implemented by module `freeact_skills.search.google.stream.api`:
@@ -49,7 +66,7 @@ The API key can alternatively be passed as command-line argument:
```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--api-key=your-anthropic-api-key \
--api-key=$ANTHROPIC_API_KEY \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
@@ -61,10 +78,10 @@ To use models from other providers, such as [accounts/fireworks/models/deepseek-
```env title=".env"
# Required for DeepSeek V3 hosted by Fireworks
DEEPSEEK_BASE_URL=https://api.fireworks.ai/inference/v1
DEEPSEEK_API_KEY=your-deepseek-api-key
DEEPSEEK_API_KEY=...
# Required for generative Google Search via Gemini 2
GOOGLE_API_KEY=your-google-api-key
GOOGLE_API_KEY=...
```

and launch the agent with
@@ -82,7 +99,7 @@ or pass the base URL and API key directly as command-line arguments:
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--base-url=https://api.fireworks.ai/inference/v1 \
--api-key=your-deepseek-api-key \
--api-key=$DEEPSEEK_API_KEY \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
2 changes: 1 addition & 1 deletion docs/evaluation.md
@@ -1,6 +1,6 @@
# Evaluation results

We [evaluated](https://github.com/gradion-ai/freeact/tree/main/evaluation) `freeact` using four state-of-the-art models:
We [evaluated](https://github.com/gradion-ai/freeact/tree/main/evaluation) `freeact` with these models:

- Claude 3.5 Sonnet (`claude-3-5-sonnet-20241022`)
- Claude 3.5 Haiku (`claude-3-5-haiku-20241022`)
10 changes: 5 additions & 5 deletions docs/index.md
@@ -19,15 +19,15 @@ The library's architecture emphasizes extensibility and transparency, avoiding t
## Next steps

- [Quickstart](quickstart.md) - Launch your first `freeact` agent and interact with it on the command line
- [Installation](installation.md) - Installation instructions and configuration of execution environments
- [Building blocks](blocks.md) - Learn about the essential components of a `freeact` agent system
- [Tutorials](tutorials/index.md) - Tutorials demonstrating the `freeact` building blocks
- [Tutorials](tutorials/index.md) - Tutorials demonstrating the usage of `freeact` building blocks
- [Command line](cli.md) - Guide to using `freeact` agents from the command line
- [Supported models](models.md) - Overview of models [evaluated](evaluation.md) with `freeact`

## Further reading

- [Command line interface](cli.md) - Guide to using `freeact` agents on the command line
- [Supported models](models.md) - Overview of models [evaluated](evaluation.md) with `freeact`
- [Model integration](integration.md) - Guidelines for integrating new models into `freeact`
- [Model integration](integration.md) - Guide for integrating new models into `freeact`
- [Execution environment](environment.md) - Overview of prebuilt and custom execution environments
- [Streaming protocol](streaming.md) - Specification for streaming model responses and execution results

## Status
10 changes: 3 additions & 7 deletions docs/integration.md
@@ -11,7 +11,7 @@ The low-level API is not further described here. For implementation examples, se

### High-level API

The high-level API supports usage of models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python). To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own. Then, you can either create an instance of `GenericModel` or subclass it.
The high-level API supports usage of models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python). To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own.

The following subsections demonstrate this using Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index) and locally with [ollama](https://ollama.com/).
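Provider compatibility here means the provider exposes an OpenAI-style `/chat/completions` endpoint. The following sketch assembles such a request by hand to show the shape involved; it is a generic illustration of the OpenAI API convention, not freeact-specific code:

```python
import json


def chat_completion_request(base_url: str, model: str, api_key: str, prompt: str) -> dict:
    """Assemble an OpenAI-compatible chat completion request.

    Any provider usable with the OpenAI Python SDK (Hugging Face
    Inference API, ollama, TGI, ...) accepts this request shape;
    the SDK builds and sends it for you.
    """
    return {
        "url": f"{base_url.rstrip('/')}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```

With `base_url=https://api-inference.huggingface.co/v1/` and `model=Qwen/Qwen2.5-Coder-32B-Instruct`, this is the request shape the examples below send through the SDK.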

@@ -23,10 +23,6 @@ Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct mo
--8<-- "freeact/model/qwen/prompt.py"
```

!!! Note

These prompt templates are still experimental. They work reasonably well for larger Qwen 2.5 Coder models, but need optimization for smaller ones.

!!! Tip

While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a good starting point for other models (as we did for DeepSeek V3, for example).
@@ -54,7 +50,7 @@ Here's a Python example that uses `QwenCoder` as code action model in a `freeact`
Run it with:

```bash
HF_TOKEN=<your-huggingface-token> python -m freeact.examples.qwen
HF_TOKEN=... python -m freeact.examples.qwen
```

Alternatively, use the `freeact` [CLI](cli.md) directly:
@@ -63,7 +59,7 @@ Alternatively, use the `freeact` [CLI](cli.md) directly:
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=<your-huggingface-token> \
--api-key=$HF_TOKEN \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
96 changes: 85 additions & 11 deletions docs/models.md
@@ -1,19 +1,93 @@
# Supported models

The following models have been [evaluated](evaluation.md) with `freeact`:
For the following models, `freeact` provides model-specific prompt templates.

- Claude 3.5 Sonnet (20241022)
- Claude 3.5 Haiku (20241022)
- Gemini 2.0 Flash (experimental)
- Qwen 2.5 Coder 32B Instruct
- DeepSeek V3
| Model                       | Release    | [Evaluation](evaluation.md) | Prompt       |
|-----------------------------|------------|-----------------------------|--------------|
| Claude 3.5 Sonnet           | 2024-10-22 | ✓                           | optimized    |
| Claude 3.5 Haiku            | 2024-10-22 | ✓                           | optimized    |
| Gemini 2.0 Flash            | 2024-12-11 | ✓                           | experimental |
| Gemini 2.0 Flash Thinking   | 2025-01-21 |                             | experimental |
| Qwen 2.5 Coder 32B Instruct |            | ✓                           | experimental |
| DeepSeek V3                 |            | ✓                           | experimental |

For these models, `freeact` provides model-specific prompt templates.
!!! Info

!!! Note
    `freeact` additionally supports the [integration](integration.md) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.

In addition to the models we evaluated, `freeact` also supports the [integration](integration.md) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.
## Command line

!!! Tip
This section demonstrates how you can launch `freeact` agents with these models from the [command line](cli.md). All agents use the [predefined](https://gradion-ai.github.io/freeact-skills/) `freeact_skills.search.google.stream.api` skill module for generative Google search. The required [Gemini](https://aistudio.google.com/app/apikey) API key for that skill must be defined in a `.env` file in the current working directory:

For best performance, we recommend Claude 3.5 Sonnet, with DeepSeek V3 as a close second. Support for Gemini 2.0 Flash, Qwen 2.5 Coder, and DeepSeek V3 remains experimental as we continue to optimize their prompt templates.
```env title=".env"
# Required for `freeact_skills.search.google.stream.api`
GOOGLE_API_KEY=...
```

API keys and base URLs for code action models are provided as `--api-key` and `--base-url` arguments, respectively. Code actions are executed in a Docker container created from the [prebuilt](environment.md#prebuilt-docker-images) `ghcr.io/gradion-ai/ipybox:basic` image, passed as `--ipybox-tag` argument.

!!! Info

    The [CLI documentation](cli.md) covers in more detail how environment variables can be passed to `freeact` agent systems.

### Claude 3.5 Sonnet

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$ANTHROPIC_API_KEY
```

### Claude 3.5 Haiku

```bash
python -m freeact.cli \
--model-name=claude-3-5-haiku-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$ANTHROPIC_API_KEY
```

### Gemini 2.0 Flash

```bash
python -m freeact.cli \
--model-name=gemini-2.0-flash-exp \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
  --skill-modules=freeact_skills.search.google.stream.api \
--api-key=$GOOGLE_API_KEY
```

### Gemini 2.0 Flash Thinking

```bash
python -m freeact.cli \
--model-name=gemini-2.0-flash-thinking-exp-01-21 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
  --skill-modules=freeact_skills.search.google.stream.api \
--api-key=$GOOGLE_API_KEY
```

### Qwen 2.5 Coder 32B Instruct

```bash
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=$HF_TOKEN
```

### DeepSeek V3

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api.fireworks.ai/inference/v1 \
--api-key=$FIREWORKS_API_KEY
```
2 changes: 1 addition & 1 deletion docs/tutorials/basics.md
@@ -81,7 +81,7 @@ To use Gemini instead of Claude, run:
--8<-- "freeact/examples/commands.txt:cli-basics-gemini"
```

See also [this example](../cli.md#example-2) for running DeepSeek V3 or [this example](../integration.md#model-usage) for Qwen 2.5 Coder.
See also [Supported models](../models.md) for other CLI examples.

### Example conversation

10 changes: 9 additions & 1 deletion freeact/cli/__main__.py
@@ -6,7 +6,15 @@
from dotenv import load_dotenv
from rich.console import Console

from freeact import Claude, CodeActAgent, CodeActModel, DeepSeek, Gemini, QwenCoder, execution_environment
from freeact import (
Claude,
CodeActAgent,
CodeActModel,
DeepSeek,
Gemini,
QwenCoder,
execution_environment,
)
from freeact.cli.utils import read_file, stream_conversation

app = typer.Typer()
