Hi, thanks a lot for open-sourcing this system and documenting it so well! I was wondering if anyone has experience and feedback to share on how well fine-tuning works for improving the generated SQL, compared to just doing RAG by providing relevant queries/schema/docs in the context of a non-fine-tuned model. I ran some experiments myself fine-tuning GPT-3.5 with ~50 queries, and even when asking verbatim questions that were in the fine-tuning set, the generated SQL was no better (sometimes even worse) than before fine-tuning. I was not doing RAG, just pure fine-tuning, to see how helpful the fine-tuning part is on its own. Also, fine-tuning with the OpenAI API usually limits you to GPT-3.5 instead of GPT-4, so fine-tuning felt like a bit of a waste.
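
For reference, here is a minimal sketch of the kind of pure fine-tuning run described above, using the OpenAI Python client. The file name, system prompt, and question/SQL pair are illustrative placeholders, not anything from a real project:

```python
# Minimal sketch: fine-tuning gpt-3.5-turbo on question/SQL pairs.
# The pairs, file name, and system prompt below are placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each training example is a full chat transcript: question in, SQL out.
pairs = [
    ("How many orders were placed last month?",
     "SELECT COUNT(*) FROM orders "
     "WHERE order_date >= date_trunc('month', now()) - interval '1 month' "
     "AND order_date < date_trunc('month', now());"),
    # ... roughly 50 such pairs in the experiment described above
]

with open("sql_finetune.jsonl", "w") as f:
    for question, sql in pairs:
        record = {"messages": [
            {"role": "system", "content": "Translate the user's question into SQL."},
            {"role": "user", "content": question},
            {"role": "assistant", "content": sql},
        ]}
        f.write(json.dumps(record) + "\n")

# Upload the training file and start the fine-tuning job.
training_file = client.files.create(file=open("sql_finetune.jsonl", "rb"),
                                    purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id,
                                     model="gpt-3.5-turbo")
print(job.id)
```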
Replies: 1 comment
Based on our benchmarking, while fine-tuning does reduce token usage and latency, it does not improve accuracy with GPT-3.5. We have early access to GPT-4 fine-tuning, and that shows better results, though the fine-tuning cost is quite high. You should always keep the RAG elements, and Dataherald allows you to deploy a fine-tuned model within the agent framework. In terms of training data, you need at least ~10 samples per table to see results.
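
To make the "keep the RAG elements" point concrete, here is a rough sketch of calling a fine-tuned model while still retrieving schema and verified queries into the prompt. The retrieval helpers and the fine-tuned model ID are hypothetical stand-ins, not Dataherald's actual API:

```python
# Sketch: keeping RAG context around a fine-tuned model.
# retrieve_schema / retrieve_golden_sql are hypothetical helpers; in
# practice they would be backed by a vector store over DDL and verified
# question/SQL pairs. The model ID below is a placeholder.
from openai import OpenAI

client = OpenAI()

def retrieve_schema(question: str) -> str:
    # Placeholder: return DDL for tables relevant to the question.
    return "CREATE TABLE orders (id INT, order_date DATE, total NUMERIC);"

def retrieve_golden_sql(question: str) -> str:
    # Placeholder: return similar verified question/SQL examples.
    return ("Q: How many orders were placed last month?\n"
            "SQL: SELECT COUNT(*) FROM orders WHERE ...;")

def generate_sql(question: str) -> str:
    context = (
        f"Schema:\n{retrieve_schema(question)}\n\n"
        f"Verified examples:\n{retrieve_golden_sql(question)}"
    )
    response = client.chat.completions.create(
        model="ft:gpt-3.5-turbo:my-org::abc123",  # placeholder fine-tuned model ID
        messages=[
            {"role": "system",
             "content": "Translate the user's question into SQL using the provided context."},
            {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```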