From eb573e0366dd22d8f25d988a2d33eb5046df0e57 Mon Sep 17 00:00:00 2001
From: davidmezzetti <561939+davidmezzetti@users.noreply.github.com>
Date: Mon, 18 Nov 2024 08:29:25 -0500
Subject: [PATCH] Add documentation for txtai agents, closes #809

---
 docs/agent/configuration.md |  96 +++++++++++++++++++++++++
 docs/agent/index.md         | 140 ++++++++++++++++++++++++++++++++++++
 docs/agent/methods.md       |   4 ++
 docs/api/configuration.md   |  16 +++++
 mkdocs.yml                  |   4 ++
 5 files changed, 260 insertions(+)
 create mode 100644 docs/agent/configuration.md
 create mode 100644 docs/agent/index.md
 create mode 100644 docs/agent/methods.md

diff --git a/docs/agent/configuration.md b/docs/agent/configuration.md
new file mode 100644
index 000000000..a420347b9
--- /dev/null
+++ b/docs/agent/configuration.md
@@ -0,0 +1,96 @@
+# Configuration
+
+An agent takes two main arguments: an LLM and a list of tools.
+
+The txtai agent framework is built with [Transformers Agents](https://huggingface.co/docs/transformers/en/agents). Additional options can be passed directly to the `Agent` constructor.
+
+```python
+from datetime import datetime
+
+from txtai import Agent
+
+wikipedia = {
+    "name": "wikipedia",
+    "description": "Searches a Wikipedia database",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-wikipedia"
+}
+
+arxiv = {
+    "name": "arxiv",
+    "description": "Searches a database of scientific papers",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-arxiv"
+}
+
+def today() -> str:
+    """
+    Gets the current date and time
+
+    Returns:
+        current date and time
+    """
+
+    return datetime.today().isoformat()
+
+agent = Agent(
+    tools=[today, wikipedia, arxiv, "websearch"],
+    llm="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
+)
+```
+
+## llm
+
+```yaml
+llm: string|llm instance
+```
+
+LLM pipeline instance or an LLM path. See the [LLM pipeline](../../pipeline/text/llm) for more information.
+
+## tools
+
+```yaml
+tools: list
+```
+
+List of tools to supply to the agent. Supports the following configurations.
+
+### function
+
+A function tool takes the following dictionary configuration.
+
+| Field       | Description              |
+|:------------|:-------------------------|
+| name        | name of the tool         |
+| description | tool description         |
+| target      | target method / callable |
+
+A function or callable method can also be supplied directly in the `tools` list. In this case, the fields are inferred from the method documentation.
+
+### embeddings
+
+Embeddings indexes have built-in support. Provide the following dictionary configuration to add an embeddings index as a tool.
+
+| Field       | Description                                 |
+|:------------|:--------------------------------------------|
+| name        | embeddings index name                       |
+| description | embeddings index description                |
+| **kwargs    | Parameters to pass to [embeddings.load](../../embeddings/methods/#txtai.embeddings.Embeddings.load) |
+
+### transformers
+
+A Transformers tool instance can be provided. Additionally, the following strings load tools directly from Transformers.
+
+| Tool        | Description                                                    |
+|:------------|:---------------------------------------------------------------|
+| websearch   | Runs a web search using the built-in Transformers Agents tool  |
+
+## method
+
+```yaml
+method: reactjson|reactcode|code
+```
+
+Sets the agent method. Defaults to `reactjson`. [Read more on this here](https://huggingface.co/docs/transformers/en/agents#types-of-agents).
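+
+As a minimal sketch, a function tool can also be registered with an explicit dictionary configuration and the agent method set in the constructor. This reuses the `today` function and LLM from the example at the top of this page; the `method` value is one of the options listed above.
+
+```python
+from txtai import Agent
+
+# today() is the function defined in the example at the top of this page.
+# Register it as an explicit dictionary-based tool and select the reactcode method.
+agent = Agent(
+    tools=[{
+        "name": "today",
+        "description": "Gets the current date and time",
+        "target": today
+    }],
+    llm="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
+    method="reactcode"
+)
+```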
diff --git a/docs/agent/index.md b/docs/agent/index.md
new file mode 100644
index 000000000..4e030e371
--- /dev/null
+++ b/docs/agent/index.md
@@ -0,0 +1,140 @@
+# Agent
+
+![agent](../images/agent.png)
+
+An agent automatically creates workflows to answer multi-faceted user requests. Agents iteratively prompt and/or interface with tools to
+step through a process and ultimately come to an answer for a request.
+
+Agents excel at complex tasks where multiple tools and/or methods are required. They incorporate a level of randomness similar to different
+people working on the same task. When the request is simple and/or there is a rule-based process, other methods such as RAG and Workflows
+should be explored.
+
+The following code snippet shows a basic agent.
+
+```python
+from datetime import datetime
+
+from txtai import Agent
+
+wikipedia = {
+    "name": "wikipedia",
+    "description": "Searches a Wikipedia database",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-wikipedia"
+}
+
+arxiv = {
+    "name": "arxiv",
+    "description": "Searches a database of scientific papers",
+    "provider": "huggingface-hub",
+    "container": "neuml/txtai-arxiv"
+}
+
+def today() -> str:
+    """
+    Gets the current date and time
+
+    Returns:
+        current date and time
+    """
+
+    return datetime.today().isoformat()
+
+agent = Agent(
+    tools=[today, wikipedia, arxiv, "websearch"],
+    llm="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
+    max_iterations=10,
+)
+```
+
+The agent above has access to two embeddings databases (Wikipedia and arXiv) and the web. Given the user's input request, the agent decides the best tool to solve the task. In this case, a web search is executed.
+
+## Example
+
+The first example solves a problem that requires multiple data points. See below.
+
+```python
+agent("Which city has the higher population, Boston or New York?")
+```
+
+This requires looking up the population of each city before answering the question.
+
+## Agentic RAG
+
+Standard Retrieval Augmented Generation (RAG) runs a single vector search query to obtain a context and builds a prompt with the context + input question. Agentic RAG is a more complex process that goes through multiple iterations. It can also utilize multiple databases to come to a final conclusion.
+
+The example below aggregates information from multiple sources and builds a report on a topic.
+
+```python
+researcher = """
+You're an expert researcher looking to write a paper on {topic}.
+Search for websites, scientific papers and Wikipedia related to the topic.
+Write a report with summaries and references (with hyperlinks).
+Write the text as Markdown.
+"""
+
+agent(researcher.format(topic="alien life"))
+```
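+
+For comparison, the standard single-pass RAG flow described above can be sketched with one embeddings search and one LLM prompt. This is a minimal sketch that reuses the Wikipedia index and LLM from the earlier examples; the prompt wording and the number of retrieved results are illustrative.
+
+```python
+from txtai import Embeddings, LLM
+
+# Load the same Wikipedia embeddings index used by the agent examples
+embeddings = Embeddings()
+embeddings.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")
+
+llm = LLM("hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4")
+
+question = "Tell me about alien life"
+
+# Single vector search query to build the context
+context = "\n".join(result["text"] for result in embeddings.search(question, 3))
+
+# Single prompt with the context + input question
+print(llm(f"Answer this question using only the context below.\n\nQuestion: {question}\n\nContext: {context}"))
+```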
+
+## Agent Teams
+
+Agents can also be tools. This enables the concept of building "Agent Teams" to solve problems. The previous example can be rewritten as a list of agents.
+
+```python
+from txtai import Agent, LLM
+
+llm = LLM("hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4")
+
+websearcher = Agent(
+    tools=["websearch"],
+    llm=llm,
+)
+
+wikiman = Agent(
+    tools=[{
+        "name": "wikipedia",
+        "description": "Searches a Wikipedia database",
+        "provider": "huggingface-hub",
+        "container": "neuml/txtai-wikipedia"
+    }],
+    llm=llm,
+)
+
+researcher = Agent(
+    tools=[{
+        "name": "arxiv",
+        "description": "Searches a database of scientific papers",
+        "provider": "huggingface-hub",
+        "container": "neuml/txtai-arxiv"
+    }],
+    llm=llm,
+)
+
+agent = Agent(
+    tools=[{
+        "name": "websearcher",
+        "description": "I run web searches, there is no answer a web search can't solve!",
+        "target": websearcher
+    }, {
+        "name": "wikiman",
+        "description": "Wikipedia has all the answers, I search Wikipedia and answer questions",
+        "target": wikiman
+    }, {
+        "name": "researcher",
+        "description": "I'm a science guy. I search arXiv to get all my answers.",
+        "target": researcher
+    }],
+    llm=llm,
+    max_iterations=10
+)
+```
+
+This adds another level of intelligence to the process. Instead of a single tool execution, each agent-tool combination has its own reasoning engine.
+
+```python
+agent("""
+Work with your team and build a comprehensive report on fundamental
+concepts about Signal Processing.
+Write the output in Markdown.
+""")
+```
diff --git a/docs/agent/methods.md b/docs/agent/methods.md
new file mode 100644
index 000000000..ea4267c40
--- /dev/null
+++ b/docs/agent/methods.md
@@ -0,0 +1,4 @@
+# Methods
+
+## ::: txtai.agent.Agent.__init__
+## ::: txtai.agent.Agent.__call__
diff --git a/docs/api/configuration.md b/docs/api/configuration.md
index c702cfbdb..4619f9679 100644
--- a/docs/api/configuration.md
+++ b/docs/api/configuration.md
@@ -38,6 +38,22 @@ Determines if the input embeddings index is writable (true) or read-only (false)
 
 ### cloud
 
 [Cloud storage settings](../../embeddings/configuration/cloud) can be set under a `cloud` top level configuration group.
 
+## Agent
+
+Agents are defined under a top level `agent` key. Each key under the `agent` key is the name of an agent. Constructor parameters can be passed under this key.
+
+The following example defines an agent.
+
+```yaml
+agent:
+  researcher:
+    tools:
+      - websearch
+
+llm:
+  path: hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
+```
+
 ## Pipeline
 
 Pipelines are loaded as top level configuration parameters. Pipeline names are automatically detected in the YAML configuration and created upon startup. All [pipelines](../../pipeline) are supported.
diff --git a/mkdocs.yml b/mkdocs.yml
index 2a3b960f1..b6805022e 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -71,6 +71,10 @@ nav:
         - Index Guide: embeddings/indexing.md
         - Methods: embeddings/methods.md
         - Query Guide: embeddings/query.md
+    - Agent:
+        - agent/index.md
+        - Configuration: agent/configuration.md
+        - Methods: agent/methods.md
     - Pipeline:
         - pipeline/index.md
         - Audio: