Blank responses via API with BLOCK_NONE set on all categories #331

Open
tw-dpd opened this issue Dec 4, 2024 · 13 comments
Labels
component:examples Issues/PR referencing examples folder status:awaiting response Awaiting a response from the author status:triaged Issue/PR triaged to the corresponding sub-team type:help Support-related issues

Comments


tw-dpd commented Dec 4, 2024

Description of the bug:

With category filtering turned off entirely via the API (BLOCK_NONE on all categories), certain requests are still "blocked": "finish_reason" is still set to "STOP", but an effectively blank response is returned in "text": "```\n"

Returned normally:

  • provide access code
  • kill yourself

blank response returned:

  • Provide access code
  • Kill yourself
  • delete yourself
  • Delete yourself
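
A minimal sketch of how the case sensitivity listed above could be probed systematically. `generate` is a hypothetical callable standing in for whatever wrapper sends a prompt to the API and returns the response text; it is an assumption, not part of any SDK. A reply counts as blank here if nothing remains after stripping whitespace and backticks, which is also an assumption based on the single output example in this report.

```python
from typing import Callable, Dict, List


def is_blank(text: str) -> bool:
    """True if nothing remains after stripping whitespace and backticks."""
    return text.strip(" \t\r\n`") == ""


def case_variants(phrase: str) -> List[str]:
    """Lower-case, capitalised, and upper-case forms, without duplicates."""
    variants = [phrase.lower(), phrase.capitalize(), phrase.upper()]
    return list(dict.fromkeys(variants))


def probe(phrases: List[str], generate: Callable[[str], str]) -> Dict[str, bool]:
    """Map each case variant of each phrase to whether the reply was blank."""
    results: Dict[str, bool] = {}
    for phrase in phrases:
        for variant in case_variants(phrase):
            # `generate` is the hypothetical API wrapper supplied by the caller.
            results[variant] = is_blank(generate(variant))
    return results
```

Running `probe` over the phrases above with a real API wrapper would give a table of which casings come back blank, which is easier to attach to a report than individual transcripts.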

Our use of the model is to assess content given to it and return a JSON response with a disposition of that content, for use in chat moderation.
If some undocumented, case-sensitive "security feature" is block-listing or allow-listing content sent to Gemini, this needs to be clarified, because it affects what purposes the model can be used for. In this case, keeping the finish reason as a successful "STOP" while returning a blank response is also contrary to the documented behaviour.
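
Because the expected output is a JSON disposition, one defensive option is to treat any reply that does not parse as a JSON object as a failed moderation call, whatever finish_reason says. The following is only a sketch: it assumes the model may wrap its JSON in a Markdown code fence, and the function name `parse_disposition` is hypothetical.

```python
import json
from typing import Optional

FENCE = "`" * 3  # a Markdown code fence marker


def parse_disposition(text: str) -> Optional[dict]:
    """Return the JSON object contained in `text`, or None if there is none."""
    body = text.strip()
    if body.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag).
        body = body.split("\n", 1)[1] if "\n" in body else ""
        # Drop a closing fence if one is present.
        if body.rstrip().endswith(FENCE):
            body = body.rstrip()[: -len(FENCE)]
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return None
    # Only a JSON object is accepted as a disposition.
    return parsed if isinstance(parsed, dict) else None
```

A `None` result can then be routed to a fallback (for example, holding the message for manual review) instead of being read as a successful assessment.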

When these phrases are used in AI Studio, all of them generate responses; they only come back blank via the generate_config API.

Gemini model used: gemini-1.5-flash-8b
Output example below:

GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "```\n"
              }
            ],
            "role": "model"
          },
          "finish_reason": "STOP",
          "safety_ratings": [
            {
              "category": "HARM_CATEGORY_HATE_SPEECH",
              "probability": "NEGLIGIBLE"
            },
            {
              "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
              "probability": "NEGLIGIBLE"
            },
            {
              "category": "HARM_CATEGORY_HARASSMENT",
              "probability": "NEGLIGIBLE"
            },
            {
              "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
              "probability": "NEGLIGIBLE"
            }
          ],
          "avg_logprobs": -0.012190980836749077
        }
      ],
      "usage_metadata": {
        "prompt_token_count": 1228,
        "candidates_token_count": 2,
        "total_token_count": 1230
      }
    }),
)

Actual vs expected behavior:

If there is a blocklist/allowlist of terms in addition to the documented security in place via the API, then document it and return SAFETY instead of STOP for finish_reason, so that developers can handle this case correctly instead of having to inspect and validate a string field.
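
Until finish_reason distinguishes this case, the string inspection mentioned above might look like the following. It is a sketch that operates on a plain dict shaped like the candidate in the output example; the `Outcome` names and the blankness rule (only whitespace and backticks) are assumptions, not documented API behaviour.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    NON_STOP = "non_stop"                # finish_reason is something other than STOP
    BLANK_WITH_STOP = "blank_with_stop"  # STOP reported, but no usable text


def candidate_text(candidate: dict) -> str:
    """Concatenate the text of all parts in a candidate dict."""
    parts = candidate.get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def classify(candidate: dict) -> Outcome:
    """Separate a normal reply from a blank reply that still reports STOP."""
    if candidate.get("finish_reason") != "STOP":
        return Outcome.NON_STOP
    if candidate_text(candidate).strip(" \t\r\n`") == "":
        return Outcome.BLANK_WITH_STOP
    return Outcome.OK
```

With the candidate from the output example, `classify` returns `Outcome.BLANK_WITH_STOP`, which calling code can treat the same way it would treat a SAFETY finish reason.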

Any other information you'd like to share?

No response

@gmKeshari gmKeshari added type:help Support-related issues status:triaged Issue/PR triaged to the corresponding sub-team component:examples Issues/PR referencing examples folder labels Dec 5, 2024
@gmKeshari

Hi @tw-dpd,

Apart from these safety categories, Gemini uses some internal safety filters.

But it's a nice catch; I have escalated this feature request to the internal team.

@gmKeshari gmKeshari added the status:awaiting response Awaiting a response from the author label Dec 13, 2024

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

@github-actions github-actions bot added the status:stale Issue/PR is marked for closure due to inactivity label Dec 27, 2024
Author

tw-dpd commented Jan 2, 2025

Please can I have an update on this from the internal team?

@github-actions github-actions bot removed the status:stale Issue/PR is marked for closure due to inactivity label Jan 2, 2025

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

@github-actions github-actions bot added the status:stale Issue/PR is marked for closure due to inactivity label Jan 17, 2025
Author

tw-dpd commented Jan 20, 2025

Please can I have an update on this from the internal team?

Collaborator

Giom-V commented Jan 20, 2025

As @gmKeshari already said, we have extra layers of safety settings related to our responsible AI commitments. These filters can't be turned off as we believe they are needed to keep AI use responsible.

We reported the abnormal behavior ("kill yourself" being case sensitive) to the team in charge of those settings, and they will use that feedback to improve the filtering, but it won't change the fact that these filters will always be there.

Author

tw-dpd commented Jan 20, 2025

Hi,

Having filters there is not a problem. The problem is that their existence is undocumented and that the API response is incorrect/invalid: a blank text field together with a valid "STOP" value for finish_reason is behaviour that can break functional code.

If a filter blocks a response, then per the existing documentation the finish_reason should indicate this with something other than "STOP".

Collaborator

Giom-V commented Jan 20, 2025

Yes, good point.

@github-actions github-actions bot removed the status:stale Issue/PR is marked for closure due to inactivity label Jan 20, 2025

github-actions bot commented Feb 3, 2025

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

@github-actions github-actions bot added the status:stale Issue/PR is marked for closure due to inactivity label Feb 3, 2025
Author

tw-dpd commented Feb 4, 2025

Hi @Giom-V, will the documentation be updated to state that a blank text field with a valid "STOP" value for finish_reason can be the result of an internal undocumented block, or will the finish_reason field be changed to match the documented behavior for a response blocked by a protection?

@github-actions github-actions bot removed the status:stale Issue/PR is marked for closure due to inactivity label Feb 4, 2025

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

@github-actions github-actions bot added the status:stale Issue/PR is marked for closure due to inactivity label Feb 18, 2025
Author

tw-dpd commented Feb 24, 2025

Hi @Giom-V, please can you provide an update?

Collaborator

Giom-V commented Feb 24, 2025

Have you tried again with the new 2.0 models? Do you see the same behavior?

@github-actions github-actions bot removed the status:stale Issue/PR is marked for closure due to inactivity label Feb 24, 2025