Add documentation for txtai agents, closes #809

neuml · Nov 18, 2024 · eb573e0
1 parent f582bdf
Showing 5 changed files with 260 additions and 0 deletions.
diff --git a/docs/agent/configuration.md b/docs/agent/configuration.md
@@ -0,0 +1,96 @@
+# Configuration
+
+An agent takes two main arguments: an LLM and a list of tools.
+
+The txtai agent framework is built with [Transformers Agents](https://huggingface.co/docs/transformers/en/agents), and additional options can be passed directly to the `Agent` constructor.
+
+```python
+from datetime import datetime
+
+from txtai import Agent
+
+wikipedia = {
+    "name": "wikipedia",
+    "description": "Searches a Wikipedia database",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-wikipedia"
+}
+
+arxiv = {
+    "name": "arxiv",
+    "description": "Searches a database of scientific papers",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-arxiv"
+}
+
+def today() -> str:
+    """
+    Gets the current date and time
+
+    Returns:
+        current date and time
+    """
+
+    return datetime.today().isoformat()
+
+agent = Agent(
+    tools=[today, wikipedia, arxiv, "websearch"],
+    llm="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
+)
+```
+
+## llm
+
+```yaml
+llm: string|llm instance
+```
+
+An LLM pipeline instance or an LLM path. See the [LLM pipeline](../../pipeline/text/llm) for more information.
+
+## tools
+
+```yaml
+tools: list
+```
+
+List of tools to supply to the agent. Supports the following configurations.
+
+### function
+
+A function tool takes the following dictionary configuration.
+
+| Field       | Description              |
+|:------------|:-------------------------|
+| name        | name of the tool         |
+| description | tool description         |
+| target      | target method / callable |
+
+A function or callable method can also be directly supplied in the `tools` list. In this case, the fields are inferred from the method documentation.
+
+### embeddings
+
+Embeddings indexes have built-in support. Provide the following dictionary configuration to add an embeddings index as a tool.
+
+| Field       | Description                                |
+|:------------|:-------------------------------------------|
+| name        | embeddings index name                      |
+| description | embeddings index description               |
+| `**kwargs`  | Parameters to pass to [embeddings.load](../../embeddings/methods/#txtai.embeddings.Embeddings.load) |
+
+
+### transformers
+
+A Transformers tool instance can be provided. Additionally, the following strings load tools directly from Transformers.
+
+| Tool        | Description                                                   |
+|:------------|:--------------------------------------------------------------|
+| websearch   | Runs a web search using the built-in Transformers Agents tool |
+
+
+## method
+
+```yaml
+method: reactjson|reactcode|code
+```
+
+Sets the agent method. Defaults to `reactjson`. [Read more on this here](https://huggingface.co/docs/transformers/en/agents#types-of-agents).
diff --git a/docs/agent/index.md b/docs/agent/index.md
@@ -0,0 +1,140 @@
+# Agent
+
+![agent](../images/agent.png)
+
+An agent automatically creates workflows to answer multi-faceted user requests. Agents iteratively prompt and/or interface with tools to
+step through a process and ultimately come to an answer for a request.
+
+Agents excel at complex tasks where multiple tools and/or methods are required. They incorporate a level of randomness similar to different
+people working on the same task. When the request is simple and/or there is a rule-based process, other methods such as RAG and Workflows
+should be explored.
+
+The following code snippet shows a basic agent.
+
+```python
+from datetime import datetime
+
+from txtai import Agent
+
+wikipedia = {
+    "name": "wikipedia",
+    "description": "Searches a Wikipedia database",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-wikipedia"
+}
+
+arxiv = {
+    "name": "arxiv",
+    "description": "Searches a database of scientific papers",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-arxiv"
+}
+
+def today() -> str:
+    """
+    Gets the current date and time
+
+    Returns:
+        current date and time
+    """
+
+    return datetime.today().isoformat()
+
+agent = Agent(
+    tools=[today, wikipedia, arxiv, "websearch"],
+    llm="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
+    max_iterations=10,
+)
+```
+
+The agent above has access to two embeddings databases (Wikipedia and arXiv) and the web. Given the user's input request, the agent decides the best tool to solve the task. In this case, a web search is executed.
+
+## Example
+
+The first example solves a problem that requires multiple data points.
+
+```python
+agent("Which city has the highest population, Boston or New York?")
+```
+
+The agent must look up the population of each city before it can answer the question.
+
+## Agentic RAG
+
+Standard Retrieval Augmented Generation (RAG) runs a single vector search query to obtain a context and builds a prompt with the context + input question. Agentic RAG is a more complex process that goes through multiple iterations. It can also utilize multiple databases to come to a final conclusion.
+
+The example below aggregates information from multiple sources and builds a report on a topic.
+
+```python
+researcher = """
+You're an expert researcher looking to write a paper on {topic}.
+Search for websites, scientific papers and Wikipedia related to the topic.
+Write a report with summaries and references (with hyperlinks).
+Write the text as Markdown.
+"""
+
+agent(researcher.format(topic="alien life"))
+```
+
+## Agent Teams
+
+Agents can also be tools. This enables the concept of building "Agent Teams" to solve problems. The previous example can be rewritten as a list of agents.
+
+```python
+from txtai import Agent, LLM
+
+llm = LLM("hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4")
+
+websearcher = Agent(
+    tools=["websearch"],
+    llm=llm,
+)
+
+wikiman = Agent(
+    tools=[{
+        "name": "wikipedia",
+        "description": "Searches a Wikipedia database",
+        "provider": "huggingface-hub",
+        "container": "neuml/txtai-wikipedia"
+    }],
+    llm=llm,
+)
+
+researcher = Agent(
+    tools=[{
+        "name": "arxiv",
+        "description": "Searches a database of scientific papers",
+        "provider": "huggingface-hub",
+        "container": "neuml/txtai-arxiv"
+    }],
+    llm=llm,
+)
+
+agent = Agent(
+    tools=[{
+        "name": "websearcher",
+        "description": "I run web searches, there is no answer a web search can't solve!",
+        "target": websearcher
+    }, {
+        "name": "wikiman",
+        "description": "Wikipedia has all the answers, I search Wikipedia and answer questions",
+        "target": wikiman
+    }, {
+        "name": "researcher",
+        "description": "I'm a science guy. I search arXiv to get all my answers.",
+        "target": researcher
+    }],
+    llm=llm,
+    max_iterations=10
+)
+```
+
+This provides another level of intelligence to the process. Instead of just a single tool execution, each agent-tool combination has its own reasoning engine.
+
+```python
+agent("""
+Work with your team and build a comprehensive report on fundamental
+concepts about Signal Processing.
+Write the output in Markdown.
+""")
+```
diff --git a/docs/agent/methods.md b/docs/agent/methods.md
@@ -0,0 +1,4 @@
+# Methods
+
+## ::: txtai.agent.Agent.__init__
+## ::: txtai.agent.Agent.__call__
diff --git a/docs/api/configuration.md b/docs/api/configuration.md
@@ -38,6 +38,22 @@ Determines if the input embeddings index is writable (true) or read-only (false)
 ### cloud
 [Cloud storage settings](../../embeddings/configuration/cloud) can be set under a `cloud` top level configuration group.
 
+## Agent
+
+Agents are defined under a top level `agent` key. Each key under the `agent` key is the name of the agent. Constructor parameters can be passed under this key.
+
+The following example defines an agent.
+
+```yaml
+agent:
+    researcher:
+        tools:
+            - websearch
+
+llm:
+    path: hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
+```
+
 ## Pipeline
 
 Pipelines are loaded as top level configuration parameters. Pipeline names are automatically detected in the YAML configuration and created upon startup. All [pipelines](../../pipeline) are supported.

diff --git a/mkdocs.yml b/mkdocs.yml
@@ -71,6 +71,10 @@ nav:
         - Index Guide: embeddings/indexing.md
         - Methods: embeddings/methods.md
         - Query Guide: embeddings/query.md
+    - Agent:
+        - agent/index.md
+        - Configuration: agent/configuration.md
+        - Methods: agent/methods.md
     - Pipeline:
         - pipeline/index.md
         - Audio: