Feat: General File Reader Tool #398

Open · wants to merge 4 commits into main

Conversation

tybalex
Contributor

@tybalex tybalex commented Feb 3, 2025

This PR adds a general File Reader tool by reusing the existing knowledge ingestion code.
The tool parses the content of an input workspace file, converts it to Markdown, and writes the result to a workspace file.

This tool also helps address issues like obot-platform/obot#1405, by:

  1. The agent uses the File Reader tool to convert a .pdf file to a .md file.
  2. The agent then uses the Summarizer tool to summarize the Markdown content.

@cjellick @thedadams I wonder whether gptscript could support a syntax that makes the File Reader tool always a prerequisite of the Summarizer tool.

@cjellick
Contributor

cjellick commented Feb 4, 2025

Hm. This isn't really a file reader tool. It's a "Convert to markdown" tool. I'm not sure that's what we want.

I had envisioned a tool that reads a file and sends the contents directly to the LLM. Do you not think that approach will work well?

What happens if the user just says "Read the court filings file" or "What's the highest grossing office in my spreadsheet?" Will this tool be called?

I'm also not sure that we want to create double the artifacts in the workspace.

@tybalex
Contributor Author

tybalex commented Feb 4, 2025

> Hm. This isn't really a file reader tool. It's a "Convert to markdown" tool. I'm not sure that's what we want.

Technically, this tool parses the text content of pdf/pptx/docx documents, so it is a reader. As for the output format, it doesn't have to be Markdown; parsing the content to plain text would be enough. I guess Markdown is just more readable?

> I had envisioned a tool that reads a file and sends the contents directly to the LLM. Do you not think that approach will work well?

We can support both, and make it send the content back to the LLM by default. When the file has too much text, it should probably write to a file and use a summarizer tool to handle the text.
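
A minimal sketch of that fallback, not the code in this PR. The character budget `MAX_INLINE_CHARS` and the helper `deliver` are hypothetical names, and the real tool might measure size in tokens rather than characters:

```python
from pathlib import Path

# Hypothetical budget; this thread does not specify a threshold.
MAX_INLINE_CHARS = 20_000


def deliver(text: str, spill_path: Path, limit: int = MAX_INLINE_CHARS) -> dict:
    """Return parsed text inline when it is small, otherwise write it to a file."""
    if len(text) <= limit:
        # Small enough: hand the content straight back to the LLM.
        return {"mode": "inline", "content": text}
    # Too large: persist it so a summarizer tool can process the file instead.
    spill_path.write_text(text, encoding="utf-8")
    return {"mode": "file", "path": str(spill_path), "chars": len(text)}
```

The caller would then invoke the summarizer only when `mode` is `"file"`, which avoids creating a second workspace artifact for small files.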

> What happens if the user just says "Read the court filings file" or "What's the highest grossing office in my spreadsheet?" Will this tool be called?

Structured data such as spreadsheets (Excel, CSV, JSON, etc.) will need to be handled separately.
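
One way to picture that split is a dispatch on file extension. This is only an illustration: the function `classify` and both extension sets are assumptions (in particular, mapping "excel" to `.xlsx`/`.xls`), not something this PR defines:

```python
from pathlib import Path

# Document formats named in this thread.
DOCUMENT_EXTS = {".pdf", ".pptx", ".docx"}
# Assumed extensions for "excel/csv/json"; the exact list is hypothetical.
STRUCTURED_EXTS = {".xlsx", ".xls", ".csv", ".json"}


def classify(filename: str) -> str:
    """Decide which handler a workspace file would go to, by extension."""
    ext = Path(filename).suffix.lower()
    if ext in DOCUMENT_EXTS:
        return "document"    # parse to text/Markdown
    if ext in STRUCTURED_EXTS:
        return "structured"  # needs a separate, data-aware handler
    return "unknown"
```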

@tybalex
Contributor Author

tybalex commented Feb 5, 2025

@cjellick I made an update: the tool now sends the content directly to the LLM by default. If the user specifies an output file, it writes the content to that file.
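
Roughly, the updated behavior could look like the sketch below. The function name `read_file_tool` and the `output_file` parameter are hypothetical, and this thread does not say what the tool returns to the LLM after writing a file, so the short confirmation message is an assumption:

```python
from pathlib import Path
from typing import Optional


def read_file_tool(parsed_text: str, output_file: Optional[str] = None) -> str:
    """Return what would be handed back to the LLM for already-parsed text."""
    if output_file is None:
        # Default: send the parsed content directly to the LLM.
        return parsed_text
    # An output file was requested: write the content there.
    Path(output_file).write_text(parsed_text, encoding="utf-8")
    # Assumed behavior: reply with a confirmation rather than the full text.
    return f"Wrote {len(parsed_text)} characters to {output_file}"
```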

3 participants