Output structure that's friendly for LLMs #2890

Open
D3MZ opened this issue Dec 11, 2024 · 1 comment

D3MZ commented Dec 11, 2024

The way the documentation is set up makes it difficult for an LLM to read, retrieve from (RAG), or train on.

Problem 1

I'm personally looking for a standard deviation function, but all of your aggregate functions are listed as links rather than documented on the page directly. Without prior knowledge of the function name, I have to scan through all the shorthand names and click through to the pages that I think are relevant.

Problem 1.5

I believe this structure is why your own chat LLM is "lazy" when answering questions like "list all the aggregate functions":

Based on the knowledge sources provided, I cannot provide an exhaustive list of all aggregate functions in ClickHouse... [more verbose explanation here]

Solution

A single document is OK:
Have a TOC with the same logical grouping as you have now.
Format every function as:

  1. Name
  2. Description
  3. Examples with output
    Ideally each example should be under 1000 tokens, so RAG pulls the entire example in as context.
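
To make the proposed layout concrete, here is a minimal Python sketch of one entry in the Name / Description / Examples format, plus a check against the 1000-token guideline. Everything in it is an assumption for illustration: `FunctionDoc`, `render`, `rough_token_count`, and `oversized_examples` are hypothetical names, not part of the ClickHouse docs tooling, and the whitespace-based token count is only a rough stand-in for a real tokenizer. The sample entry is placeholder content.

```python
from dataclasses import dataclass, field

TOKEN_BUDGET = 1000  # per-example limit suggested in this issue


@dataclass
class FunctionDoc:
    """One function entry: name, description, examples with output (hypothetical structure)."""
    name: str
    description: str
    examples: list[str] = field(default_factory=list)  # each example includes its output


def rough_token_count(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    # A real tokenizer will give a different (usually higher) count.
    return len(text.split())


def render(doc: FunctionDoc) -> str:
    """Render a single entry as plain text in the Name / Description / Examples order."""
    lines = [doc.name, "", doc.description, ""]
    for i, example in enumerate(doc.examples, start=1):
        lines += [f"Example {i}", example, ""]
    return "\n".join(lines).rstrip() + "\n"


def oversized_examples(doc: FunctionDoc, budget: int = TOKEN_BUDGET) -> list[int]:
    """Return the 1-based indexes of examples whose rough token count exceeds the budget."""
    return [
        i
        for i, example in enumerate(doc.examples, start=1)
        if rough_token_count(example) > budget
    ]


# Illustrative placeholder entry, not copied from the actual documentation.
entry = FunctionDoc(
    name="stddevPop",
    description="Population standard deviation of a numeric column.",
    examples=[
        "SELECT stddevPop(x) FROM (SELECT number AS x FROM numbers(5))\n"
        "Output: 1.4142135623730951"
    ],
)

print(render(entry))
print("Examples over budget:", oversized_examples(entry))
```

Keeping each example as a single self-contained string (query plus output) is what lets a retriever return it whole; the budget check would flag entries that are likely to be split across chunks.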
@gingerwizard
Contributor

#2888 will fix
