
feat: implement the use of tools #49

Merged — 40 commits merged May 15, 2024

Conversation

olimorris
Owner

@olimorris olimorris commented May 5, 2024

In Part 3 of his Agentic Design Patterns series, Andrew Ng brilliantly outlines the power of combining tools with LLMs: from simple tasks such as executing code that an LLM has generated, through to browsing the web for up-to-date information (RAG).

Below I outline the approach I've taken:

  • Using a system prompt, we share with the LLM the tools it has available and outline how they can be called
  • The LLM uses judgement to determine when and if it needs to call a tool
  • The LLM can call a tool via a ## tools heading and some XML:
<tool>
  <name>code_runner</name>
  <parameters>
    <inputs>
      <lang>ruby</lang>
      <code>
strings = ["Hello", "Hi there", "Greetings", "Salutations"]

strings.each do |s|
  puts s
end
      </code>
      <version>3.1.0</version>
    </inputs>
  </parameters>
</tool>
  • The plugin then parses the response via Tree-sitter and extracts the XML, handing it off to the appropriate tool.
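The real plugin does the parsing in Lua with Tree-sitter; as a rough illustration of the parse-and-dispatch step described above, here is a Python sketch (the tool registry and handler bodies are hypothetical, only the XML shape comes from this PR):

```python
import xml.etree.ElementTree as ET

# Hypothetical registry mapping tool names to handler functions.
TOOLS = {}

def register(name):
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@register("code_runner")
def code_runner(inputs):
    # In the real plugin this hands off to a sandboxed executor.
    return f"would run {inputs['lang']} code ({len(inputs['code'])} chars)"

def dispatch(xml_text):
    """Parse a <tool> block from the LLM response and call the matching tool."""
    root = ET.fromstring(xml_text)
    name = root.findtext("name")
    inputs = {child.tag: (child.text or "").strip()
              for child in root.find("parameters/inputs")}
    return TOOLS[name](inputs)

response = """<tool>
  <name>code_runner</name>
  <parameters>
    <inputs>
      <lang>ruby</lang>
      <code>puts "hi"</code>
    </inputs>
  </parameters>
</tool>"""
print(dispatch(response))
```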

Considerations

Giving an LLM the power to run commands on your machine may as well be accompanied by yelling "YOLO!". To mitigate this, the implementation allows any execution to take place in a remote environment such as a Docker container.
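As a sketch of what that sandboxing can look like, the snippet below builds a locked-down `docker run` invocation around the generated code. The image names, interpreter flags, and resource limits are assumptions for illustration; the plugin's actual tool configuration may differ:

```python
import subprocess

# Assumed image/interpreter mapping; the plugin's real config may differ.
IMAGES = {
    "python": ("python:3.12-slim", ["python", "-c"]),
    "ruby": ("ruby:3.1", ["ruby", "-e"]),
}

def docker_command(lang, code):
    """Build a `docker run` invocation that sandboxes the generated code."""
    image, interp = IMAGES[lang]
    return ["docker", "run", "--rm",
            "--network=none",   # no network access from inside the sandbox
            "--memory=256m",    # cap memory so runaway code can't eat the host
            image, *interp, code]

def run_in_docker(lang, code, timeout=30):
    """Execute the code and capture stdout/stderr to feed back to the LLM."""
    cmd = docker_command(lang, code)
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
```

Separating command construction from execution keeps the risky part (`subprocess.run`) behind a single, auditable choke point.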

At present, this implementation keeps tools out of regular chat buffers, with no ability to add them in.

@olimorris
Owner Author

@nuvic and @lazymaniac

As two awesome contributors, I thought this might be something you'd be interested in. It's not quite ready to test yet but the idea is to add the ability for the plugin to execute code remotely (in this case in a Docker container) and then put the output into the chat to verify if it's correct.

I'll be keen to get your ideas on features/functionality in the coming days.

@olimorris
Owner Author

olimorris commented May 7, 2024

2024-05-07.21_46_04.-.WezTerm.mp4

@nuvic and @lazymaniac - A video of the progress. Would love to get your initial thoughts.

Behind the scenes, the plugin parses the XML that the LLM has generated and uses it to initiate a code_runner tool. Just before that, however, it pulls down a Docker image and runs the LLM's code in a Docker container.

In the video, we ask the LLM to write some code in Python and Ruby and then share the outputs back with it.

@lazymaniac
Contributor

This looks really cool, but I think manually implementing tools and RAG might require significant work. A simple code executor is fine, but LLMs are often just stupid. Moving the RAG and code execution components to a Python app might be better in the long term, since the best frameworks are written in Python (AutoGen, LangChain, LlamaIndex, CrewAI).

RAG itself is quite complex if you're aiming for good results, especially when working with source code. The workflow is as follows:

  • load data (project code, web scraping)
  • preprocess the data (splitting and adding metadata)
  • enhance the data by extracting, for instance, keywords or summaries
  • feed it into an embedding model
  • store the output in a vector database
  • create a retriever with suitable post-processors like a reranker

This provides a strong foundation for RAG, but it's only the first step in retrieving relevant data that fits within the context window of LLMs.
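To make the shape of that pipeline concrete, here is a deliberately toy Python sketch: a bag-of-words counter stands in for the embedding model, and an in-memory list stands in for the vector database. Every name here is illustrative; a real setup would use one of the frameworks mentioned above:

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(docs, chunk_size=40):
    """Load + split documents, then embed and store each chunk."""
    chunks = []
    for doc in docs:
        words = doc.split()
        for i in range(0, len(words), chunk_size):
            chunks.append(" ".join(words[i:i + chunk_size]))
    return [(chunk, embed(chunk)) for chunk in chunks]  # "vector database"

def retrieve(index, query, k=2):
    """Rank stored chunks by similarity to the query; return the top k."""
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

The missing pieces (metadata, keyword/summary enrichment, a reranker) slot in around `build_index` and `retrieve` respectively, which is exactly why production pipelines get complex.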

Next is the agent flow, which can take various forms. For instance, if you want to focus on changes within a single file, it's simpler. However, if the task is larger, it might require breaking it down into smaller tasks and using more complex agent combinations, such as one agent orchestrating, another searching for documentation via web search, another managing the backlog, and so forth.

Tool usage is also quite tricky because LLMs are unreliable in this aspect. They tend to either add too much or omit critical parts of the tool schema. In such cases, self-correcting tools help: the LLM's output is validated against the schema and, if needed, the LLM is asked to correct it.

So, to sum up, I'm not sure what the ultimate goal is. If this is where you want to stop, then it's more than enough. If you want to be more adventurous and try something more complex, it might be good to explore other options. Of course, using the frameworks mentioned earlier is not mandatory—they are just abstractions over other tools and frameworks—and many different approaches are possible.

@olimorris
Owner Author

@lazymaniac - Thanks for such a beautifully crafted and insightful response.

I've explored LangChain and LlamaIndex in some detail and realise that this is a complex and rapidly evolving field, one which is definitely out of scope for this plugin. In this PR I wanted to see how easily I could build in the ability to get an LLM to run external commands and then share the outputs with it for self-reflection purposes. Initially, I envisage this only for basic code execution.

@olimorris
Owner Author

Tools.mp4

@olimorris olimorris merged commit b36d9dc into main May 15, 2024
2 checks passed
@olimorris olimorris deleted the feat/assistants-run-cmds branch May 15, 2024 22:13