Hi, thanks a lot for open-sourcing this system and documenting it so well! I was wondering if anyone has experience and feedback to share on how well fine-tuning works for improving the generated SQL, compared to just doing RAG by providing relevant queries/schema/docs in the context of a non-fine-tuned model. I ran some experiments myself fine-tuning GPT-3.5 with ~50 queries, and even when asking verbatim questions that were in the fine-tuning set, the generated SQL was no better (sometimes even worse) than before fine-tuning. I was not doing RAG, just pure fine-tuning, to see how helpful the fine-tuning part is on its own. Also, fine-tuning with the OpenAI API usually limits you to GPT-3.5 instead of GPT-4, so fine-tuning felt like a bit of a waste.
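
For reference, here is a minimal sketch of the kind of pure fine-tuning run described above, using the OpenAI Python client. The file name, system prompt, and question/SQL pair are illustrative placeholders, not anything from a real project:

```python
# Minimal sketch: fine-tuning gpt-3.5-turbo on question/SQL pairs.
# The pairs, file name, and system prompt below are placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each training example is a full chat transcript: question in, SQL out.
pairs = [
    ("How many orders were placed last month?",
     "SELECT COUNT(*) FROM orders "
     "WHERE order_date >= date_trunc('month', now()) - interval '1 month' "
     "AND order_date < date_trunc('month', now());"),
    # ... roughly 50 such pairs in the experiment described above
]

with open("sql_finetune.jsonl", "w") as f:
    for question, sql in pairs:
        record = {"messages": [
            {"role": "system", "content": "Translate the user's question into SQL."},
            {"role": "user", "content": question},
            {"role": "assistant", "content": sql},
        ]}
        f.write(json.dumps(record) + "\n")

# Upload the training file and start the fine-tuning job.
training_file = client.files.create(file=open("sql_finetune.jsonl", "rb"),
                                    purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id,
                                     model="gpt-3.5-turbo")
print(job.id)
```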
Replies: 1 comment
Based on our benchmarking, while fine-tuning does reduce token usage and latency, it does not improve accuracy with GPT-3.5. We have early access to GPT-4 fine-tuning, and that shows better results, though the fine-tuning cost is quite high. You should always keep the RAG elements, and Dataherald allows you to deploy a fine-tuned model within the agent framework. In terms of training data, you need at least ~10 samples per table to see results.
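
To make the "keep the RAG elements" point concrete, here is a rough sketch of calling a fine-tuned model while still retrieving schema and verified queries into the prompt. The retrieval helpers and the fine-tuned model ID are hypothetical stand-ins, not Dataherald's actual API:

```python
# Sketch: keeping RAG context around a fine-tuned model.
# retrieve_schema / retrieve_golden_sql are hypothetical helpers; in
# practice they would be backed by a vector store over DDL and verified
# question/SQL pairs. The model ID below is a placeholder.
from openai import OpenAI

client = OpenAI()

def retrieve_schema(question: str) -> str:
    # Placeholder: return DDL for tables relevant to the question.
    return "CREATE TABLE orders (id INT, order_date DATE, total NUMERIC);"

def retrieve_golden_sql(question: str) -> str:
    # Placeholder: return similar verified question/SQL examples.
    return ("Q: How many orders were placed last month?\n"
            "SQL: SELECT COUNT(*) FROM orders WHERE ...;")

def generate_sql(question: str) -> str:
    context = (
        f"Schema:\n{retrieve_schema(question)}\n\n"
        f"Verified examples:\n{retrieve_golden_sql(question)}"
    )
    response = client.chat.completions.create(
        model="ft:gpt-3.5-turbo:my-org::abc123",  # placeholder fine-tuned model ID
        messages=[
            {"role": "system",
             "content": "Translate the user's question into SQL using the provided context."},
            {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```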