Add a more descriptive error for text/plain files that are not UTF-8 #221
Comments
We're trying not to add validation around the backend model, but we could perhaps document this better. @jba WDYT?
We should document that UTF-8 is required, and add tests to make sure that it is indeed broken for non-UTF-8. That way, if the backend learns non-UTF-8, we will know to remove the doc.
@handsomefox can you link an example of a text file encoded in 1252, or whatever you tried, that gave you this error?
@eliben I've retested all of these files and they've all returned the error. Comparing against your test, the only differences I can see are that I'm using ChatSession.SendMessage instead of Model.GenerateContent, some model parameters are changed, a system instruction is added, and the file comes before the text, but none of that seems like it should matter.
I've sent #228, adding one of the files @handsomefox uploaded; it indeed errors out as described. W.r.t. updating the documentation: the natural place for this seems to be https://pkg.go.dev/github.com/google/generative-ai-go/genai#UploadFileOptions, but it already documents the MIME type by pointing to a link that has since been redirected and no longer seems to contain the required information. I think we should find where the supported MIME types are described now.
Description of the bug:
I recently had to debug an issue with a .txt file that was not UTF-8 encoded. We had no conversion to UTF-8 in place, and this resulted in a confusing problem, where Gemini uploads just result in:
This tells you pretty much nothing, and I spent some time debugging it.
I think two things can fix this issue for future users:
`client.UploadFile` can better indicate that UTF-8 is required for text files, or validate it itself.
The validation in the first point is quite easy if you decode the content-type correctly, since you can just call `utf8.Valid(b)` on anything that begins with `text/`.
The second point I have no idea about.
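The validation described in the first point could look roughly like the sketch below. It is only an illustration, not how the genai package behaves: `checkTextIsUTF8` is a hypothetical helper name, and the assumption is that the caller already has the MIME type and the file bytes in hand before uploading.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
	"unicode/utf8"
)

// checkTextIsUTF8 is a hypothetical helper, not part of the genai package.
// It returns an error when mimeType names a text/* media type and data is
// not valid UTF-8. Non-text media types are not inspected.
func checkTextIsUTF8(mimeType string, data []byte) error {
	// ParseMediaType strips parameters such as "; charset=..." and
	// lowercases the media type.
	mediaType, _, err := mime.ParseMediaType(mimeType)
	if err != nil {
		return fmt.Errorf("parsing MIME type %q: %w", mimeType, err)
	}
	if strings.HasPrefix(mediaType, "text/") && !utf8.Valid(data) {
		return fmt.Errorf("%s content is not valid UTF-8", mediaType)
	}
	return nil
}

func main() {
	// 0xE9 is "é" in Windows-1252 and is not valid UTF-8 on its own.
	fmt.Println(checkTextIsUTF8("text/plain; charset=utf-8", []byte("caf\xe9")))
	fmt.Println(checkTextIsUTF8("text/plain", []byte("café")))
}
```

A check like this could run on the caller's side before the upload, so a badly encoded file is reported with a readable message instead of the opaque backend error.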
You could also provide a conversion method for generic cases like `UTF16(LE/BE)` and `Windows 1252/1251`, since they're quite common.
Actual vs expected behavior:
No response
Any other information you'd like to share?
No response
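For the conversion idea mentioned in the description, a minimal sketch for the UTF-16 case using only the standard library might look like this. `utf16ToUTF8` is a hypothetical name, and the sketch assumes the input starts with a byte order mark. Windows-1252 and Windows-1251 need lookup tables that the standard library does not ship; the `golang.org/x/text/encoding/charmap` package provides them and would be the more likely choice there.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf16"
)

// utf16ToUTF8 is a hypothetical helper. It decodes UTF-16 bytes that start
// with a byte order mark (BOM) and returns the text as a UTF-8 Go string.
func utf16ToUTF8(b []byte) (string, error) {
	if len(b) < 2 || len(b)%2 != 0 {
		return "", errors.New("input is not a whole number of UTF-16 code units")
	}
	var order binary.ByteOrder
	switch {
	case b[0] == 0xFF && b[1] == 0xFE:
		order = binary.LittleEndian
	case b[0] == 0xFE && b[1] == 0xFF:
		order = binary.BigEndian
	default:
		return "", errors.New("missing UTF-16 byte order mark")
	}
	// Skip the BOM and read the remaining bytes as 16-bit code units.
	units := make([]uint16, 0, len(b)/2-1)
	for i := 2; i < len(b); i += 2 {
		units = append(units, order.Uint16(b[i:]))
	}
	// utf16.Decode replaces unpaired surrogates with U+FFFD.
	return string(utf16.Decode(units)), nil
}

func main() {
	// "hi" encoded as UTF-16LE with a BOM.
	s, err := utf16ToUTF8([]byte{0xFF, 0xFE, 'h', 0x00, 'i', 0x00})
	fmt.Println(s, err)
}
```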