Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in Leakreplay Probe #879

Closed
bleszily opened this issue Sep 4, 2024 · 3 comments · Fixed by #1081
Closed

Bug in Leakreplay Probe #879

bleszily opened this issue Sep 4, 2024 · 3 comments · Fixed by #1081
Labels
bug Something isn't working probes Content & activity of LLM probes

Comments

@bleszily
Copy link

bleszily commented Sep 4, 2024

I want to report an issue that was discovered in the leakreplay.py file for the leakreplay probe.
I ran Garak with the command: python -m garak -m rest --generator_option_file restConfig.json -d guardrail.BinaryGuardrailDetector --probes leakreplay [and other probes]

All other probes were successful but leakreplay threw an error.

Error: Error:

 Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in runcode
  File "/opt/bleszily/garak/garak/main.py", line 14, in <module>
    main()
  File "/opt/bleszily/garak/garak/__main.py", line 9, in main
    cli.main(sys.argv[1:])
  File "/opt/bleszily/garak/garak/cli.py", line 513, in main
    command.pxd_run(
  File "/opt/bleszily/garak/garak/command.py", line 229, in pxd_run
    pxd_h.run(
  File "/opt/bleszily/garak/garak/harnesses/pxd.py", line 61, in run
    h.run(model, [probe], detectors, evaluator, announce_probe=False)
  File "/opt/bleszily/garak/garak/harnesses/base.py", line 108, in run
    attempt_results = probe.probe(model)
                      ^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/base.py", line 219, in probe
    attempts_completed = self._execute_all(attempts_todo)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/base.py", line 197, in _execute_all
    result = self._execute_attempt(this_attempt)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/base.py", line 161, in _execute_attempt
    this_attempt = self._postprocess_hook(this_attempt)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/leakreplay.py", line 68, in postprocesshook
    attempt.messages[idx][-1]["content"] = re.sub(
                                           ^^^^^^^
  File "/root/.conda/envs/garak/lib/python3.11/re/__init.py", line 185, in sub
    return _compile(pattern, flags).sub(repl, string, count)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'

When I investigated the error:
I discovered an error in leakreplay.py probe [TypeError related to an expected string or bytes-like object but receiving NoneType]. I think the issue is because of an operation attempting to perform a substitution using regular expressions on a value that is unexpectedly None.
From my analysis, the error suggests that there is a lack of proper validation or error handling when handling text content that might be None:

Location: /opt/bleszily/garak/garak/probes/leakreplay.py
Function: _postprocess_hook
The use of re.sub expects a string or bytes-like object, but it receives None, leading to a TypeError.

Proposed Fix:
I think we can add a check in the leakreplay.py module to ensure that the variable used in the substitution operation is not None before attempting to process it.
Here is an update I added:

def _postprocess_hook(self, attempt: Attempt) -> Attempt:
    for idx, thread in enumerate(attempt.messages):
        # Ensure content is not None before applying regex
        if thread and thread[-1]["content"] is not None:
            attempt.messages[idx][-1]["content"] = re.sub(
                "</?name>", "", thread[-1]["content"]
            )
        else:
            # Handle None or empty thread case by logging or assigning a default string
            logging.warning(f"No content to process for message index {idx}. Setting default empty string.")
            if thread:
                attempt.messages[idx][-1]["content"] = ""
            else:
                # If thread is entirely absent, log this as it might indicate a larger issue
                logging.error(f"Thread at index {idx} is missing or malformed.")
    return attempt
Updated the _postprocess_hook method with added checks to prevent the TypeError

I had to import logging also.

After updating the leakreplay.py file with these changes, it works fine.

@leondz leondz added bug Something isn't working probes Content & activity of LLM probes labels Sep 4, 2024
@leondz
Copy link
Collaborator

leondz commented Sep 4, 2024

Thanks, will take a look

@leondz
Copy link
Collaborator

leondz commented Sep 5, 2024

We haven't reproduced this yet but it looks like it could be high priority. Will get back to you.

@leondz
Copy link
Collaborator

leondz commented Jan 15, 2025

Having trouble finding a situation that would cause a None to present there, which would have been good, but will queue up a guard against None for this anyway. Thank you.

leondz added a commit that referenced this issue Jan 16, 2025
resolves #879 

`NoneType` in attempt message history would cause a crash when
`leakreplay` rewrites that message history. Guard against `None` here.
It's unclear how a None would get in there in the first place, but the
original report hasn't had updates, so this may have been a transient
behaviour.

Thanks @bleszily


## Verification

- `python -m pytest
tests/probes/test_probes_leakreplay.py::test_leakreplay_handle_incomplete_attempt`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working probes Content & activity of LLM probes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants