Add support for Gemini's grounding (web search) feature #240

Open
avergin opened this issue Jan 24, 2025 · 2 comments

Comments

@avergin
Contributor

avergin commented Jan 24, 2025

Google's Gemini has a grounding feature, where the model searches the web and "grounds" its response in the results. It is enabled by configuring the tools parameter in a specific way in the Gemini API. However, this requires a non-standard tools configuration that this library does not currently support.

Currently, tools require a %Function{} struct, and functions require a name. A quick way to fix the issue may be to add a custom or native field (of map type) to the function struct and do conditional validation and preparation based on whether this attribute is present. The value of this field could then be used to configure the tools in an open-ended way.
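To make the `native` field idea concrete, here is a minimal sketch. It is written in Python only to stay language-neutral (the library itself is Elixir), and the `Function` class, its `native` attribute, and `to_gemini_tool` are hypothetical names for illustration, not the library's API. The `function_declarations` wrapper for regular functions is my assumption about Gemini's REST format.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Function:
    """Hypothetical stand-in for the library's %Function{} struct."""
    name: Optional[str] = None
    description: str = ""
    # Assumed new field: a raw, vendor-specific tool config passed through as-is.
    native: Optional[dict[str, Any]] = None

    def validate(self) -> None:
        # A name is only required when no native config is supplied.
        if self.native is None and not self.name:
            raise ValueError("name is required unless `native` is set")


def to_gemini_tool(fn: Function) -> dict[str, Any]:
    """Prepare one entry for the request's "tools" list."""
    fn.validate()
    if fn.native is not None:
        # Open-ended: the map is sent to the API unchanged.
        return fn.native
    return {
        "function_declarations": [
            {"name": fn.name, "description": fn.description}
        ]
    }


# The grounding tool from the request shown in this issue, expressed as a native map.
search = Function(native={
    "google_search_retrieval": {
        "dynamic_retrieval_config": {"mode": "MODE_DYNAMIC", "dynamic_threshold": 1}
    }
})
print(to_gemini_tool(search))
```

The trade-off in this sketch is that a `native` map bypasses all validation, so a malformed config would only surface as an API error.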

Furthermore, when grounding is enabled, the model's response includes groundingMetadata with the web search details, which may also require some changes to %Message{}.
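As a rough illustration of what handling that metadata could involve, the sketch below pulls the search queries and sources out of a response body. Only the `groundingMetadata` key is confirmed by this issue; the nested field names (`webSearchQueries`, `groundingChunks`, `web.uri`, `web.title`) are my assumptions about the response shape, and `extract_grounding` is a hypothetical helper, again in Python rather than the library's Elixir.

```python
from typing import Any


def extract_grounding(response: dict[str, Any]) -> dict[str, Any]:
    """Collect web search details from a generateContent response body."""
    candidates = response.get("candidates") or [{}]
    meta = candidates[0].get("groundingMetadata") or {}
    sources = [
        {"title": chunk["web"].get("title"), "uri": chunk["web"].get("uri")}
        for chunk in meta.get("groundingChunks", [])
        if "web" in chunk  # skip chunks that are not web sources
    ]
    return {"queries": meta.get("webSearchQueries", []), "sources": sources}


# Hand-written sample in the assumed shape, not real API output.
sample = {
    "candidates": [{
        "content": {"parts": [{"text": "..."}]},
        "groundingMetadata": {
            "webSearchQueries": ["current Google stock price"],
            "groundingChunks": [
                {"web": {"uri": "https://example.com/quote", "title": "example.com"}}
            ],
        },
    }]
}
print(extract_grounding(sample))
```

A response without grounding yields empty lists, so the result could be attached to a message as optional metadata without special-casing ungrounded replies.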

I can implement this fix, but guidance on the implementation approach would be helpful first.


Here is how Gemini describes the grounding feature in its docs:

The Grounding with Google Search feature in the Gemini API and AI Studio can be used to improve the accuracy and recency of responses from the model. In addition to more factual responses, when Grounding with Google Search is enabled, the Gemini API returns grounding sources (in-line supporting links) and Google Search Suggestions along with the response content. The Search Suggestions point users to the search results corresponding to the grounded response.

And this is how it's called in the API:

echo '{
  "contents": [{"parts": [{"text": "What is the current Google stock price?"}]}],
  "tools": [{
    "google_search_retrieval": {
      "dynamic_retrieval_config": {
        "mode": "MODE_DYNAMIC",
        "dynamic_threshold": 1
      }
    }
  }]
}' > request.json

curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002:generateContent?key=$GOOGLE_API_KEY" \
-H "Content-Type: application/json" \
-d  @request.json > response.json

cat response.json

@brainlid
Owner

Hi @avergin!

Thanks for looking into this! After reviewing the docs and your message, it feels like this isn't a function. With the rename of "functions" to "tools", the AI vendors set themselves up to introduce new types of tools. This is a google_search_retrieval tool which has unique input requirements and unique results.

I lean towards creating a new struct type that can be passed in through the tools list. My hesitation is that this is the only example of this type of tool that I know of. And the tool wouldn't be supported on anything but Gemini.
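A minimal sketch of that direction, assuming a dedicated tool type that sits next to regular functions in the tools list. It is in Python only for illustration (the library is Elixir); `FunctionTool`, `GoogleSearchRetrieval`, `gemini_tools`, and `other_provider_tools` are hypothetical names, and the `function_declarations` key is my assumption about Gemini's format for regular functions.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class FunctionTool:
    """Hypothetical stand-in for a regular function tool."""
    name: str
    description: str = ""


@dataclass
class GoogleSearchRetrieval:
    """Hypothetical Gemini-only tool with its own inputs."""
    mode: str = "MODE_DYNAMIC"
    dynamic_threshold: float = 1


Tool = Union[FunctionTool, GoogleSearchRetrieval]


def gemini_tools(tools: list[Tool]) -> list[dict[str, Any]]:
    """Serialize a mixed tools list for the Gemini request body."""
    payload: list[dict[str, Any]] = []
    declarations = [
        {"name": t.name, "description": t.description}
        for t in tools if isinstance(t, FunctionTool)
    ]
    if declarations:
        payload.append({"function_declarations": declarations})
    for t in tools:
        if isinstance(t, GoogleSearchRetrieval):
            payload.append({"google_search_retrieval": {
                "dynamic_retrieval_config": {
                    "mode": t.mode,
                    "dynamic_threshold": t.dynamic_threshold,
                }
            }})
    return payload


def other_provider_tools(tools: list[Tool]) -> list[dict[str, Any]]:
    """Any non-Gemini model: fail loudly on the Gemini-only tool."""
    for t in tools:
        if isinstance(t, GoogleSearchRetrieval):
            raise ValueError("google_search_retrieval is only supported by Gemini")
    return [{"name": t.name, "description": t.description} for t in tools]


print(gemini_tools([FunctionTool("get_time"), GoogleSearchRetrieval()]))
```

Compared with an open-ended map on the function struct, a dedicated type keeps validation per tool and lets other chat models reject it explicitly, at the cost of one struct per vendor-specific tool.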

What are your thoughts on this perspective?

@avergin
Contributor Author

avergin commented Feb 6, 2025

This is quite a reasonable approach IMHO. I have drafted #250 to bring this functionality.

For example, prompting "What is the current Google stock price?" with Gemini models will return the current stock price through a web search.
