Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow disabling optimization of return value for a function #6269

Open
SlidyBat opened this issue Dec 21, 2024 · 11 comments
Open

Allow disabling optimization of return value for a function #6269

SlidyBat opened this issue Dec 21, 2024 · 11 comments
Assignees
Labels
State: Awaiting Triage Issue is waiting for more in-depth triage from a developer

Comments

@SlidyBat
Copy link

SlidyBat commented Dec 21, 2024

What is the feature you'd like to have?
Sometimes binja thinks that a function has a constant return value, and will optimize any code that calls that function, assuming that the return value will always be that constant.

I would like to be able to disable this for specific functions since there are cases where binja mistakenly applies this optimization where it shouldn't.

Is your feature request related to a problem?
There are a few cases where this has been a problem.

The first example is for a function that uses the vmmcall instruction.
CleanShot 2024-12-21 at 20 06 13@2x
The vmmcall will cause an exit to the hypervisor, which will modify the registers, then return (similar to a syscall).

The call to vmmcall_func checks the return value of the vmmcall, and will go to an error path if it is non-zero.
CleanShot 2024-12-21 at 20 07 40@2x

However, binja doesn't know that the vmmcall can modify the registers, and assumes that eax is always zero. So the error path code is completely optimized out in the HLIL:
CleanShot 2024-12-21 at 20 09 00@2x

The second example is from a kernel that sets a fault handler then tries to read a user address. If a fault occurs, the fault handler is called which will return an error value. If no fault occurs, 0 is returned.
CleanShot 2024-12-21 at 20 10 38@2x

Again binja doesn't know that the eax register can change, so assumes this function always returns 0, and the error handling code in HLIL is optimized out.

The last example is less impactful, but I still find it a bit annoying.

Sometimes a function will simply return a global variable, and that value is used or passed to another function.

e.g.:

struct my_ops* get_global() {
  static bool initialized = false;
  if (!initialized) {
    init_global(&my_global);
    initialized = true;
  }
  return &my_global;
}

int main() {
  foo(get_global(), "hello");

  return 0;
}

Binja will try to optimize the return value by using my_global directly, resulting in this HLIL:
CleanShot 2024-12-21 at 20 16 29@2x

I think this results in more confusing code than if the return value wasn't optimized like that.

Are any alternative solutions acceptable?
For the vmmcall case, I think binja could have better default behaviour, but there will always be cases that binja can't handle, so it would be best to allow user to disable the optimization manually.

@psifertex
Copy link
Member

There's a few options already for this case I believe. First you can right click on the function and edit function properties and change the modified registers in cases where the analysis doesn't know a register is tainted.

Secondly, I believe we were discussing but I don't know if it's implemented the ability to set the global variables type to volatile. The opposite is true and you can set a variable to be const in a writable section. However either way you can create a section over that variable using the memory map view and set it to read write data.

Let me know if any of those work to solve your problem!

Documentation that touches on this is in https://dev-docs.binary.ninja/dev/concepts.html#memory-permissions-impact-on-analysis but I see I need to fix some formatting errors. If volatile works I'll add some more detailed examples.

@SlidyBat
Copy link
Author

In all 3 cases above, the relevant register is rax, which is already marked as clobbered in function properties by default since it is the return register.

For the volatile/rw permissions, I don't think that would help here. The only global variable in the code above is from the 3rd case (my_global), and that is already in r/w memory. I also tested setting it to volatile but that didn't change the HLIL.

@xusheng6
Copy link
Member

Thx for filing the issue! I am thinking of it and see what is the best solution. To start with, I think case 3) is probably not a bug -- not only our decompilation is semantically correct, I feel like it is also closer to the disassembly code. I have not seen the disassembly code yet, but I feel like it is going to be the same as what we showed. I could be wrong though -- if it is convenient for you, please share the binary with us (in public or private) so that we can take a closer look at it.

@xusheng6
Copy link
Member

xusheng6 commented Dec 23, 2024

For case 1), I think there are two possible ways to address the issue:

  1. Supporting setting a variable to volatile. We have been discussing that we should support the volatile property on a data variable that will tell the analysis to NOT use its value since it can change in an unexpected way. Now this should also be done for function variables since they can also behave in the very same way. If we already have support for it, you will be able to mark the relevant variable as volatile, and the analysis will be more conservative on the usage of its value, and hopefully that will lead to desired output. Right now, our type parser already accepts the volatile keyword, but it would not have an effect since our analysis does not yet understand it
  2. A second approach would be to outline the vmmcall instruction to its own function so that you can set clobbered registers for it which will solve the current issue. This approach is more flexible, since it could be helpful in other cases (basically you can set any properties that a function has to a single instruction). The downside is of course this will require more work to be done. Another advantage is that we might be able to have the lifter to automatically outline certain special instrucitons so that it would be easier for the user to decide whether or not to set specific property of it

@SlidyBat
Copy link
Author

SlidyBat commented Dec 23, 2024

I think case 3) is probably not a bug -- not only our decompilation is semantically correct

I agree it is semantically correct, but is not as clear as if the HLIL showed foo(get_global(), "hello").

Current HLIL makes it seem like return value of get_global() is ignored and my_global is referenced directly.

I feel like it is also closer to the disassembly code

The disassembly calls get_global() then uses the return value as an argument to foo. There is no direct reference to my_global.

CleanShot 2024-12-23 at 17 21 11@2x

Here is the binary:
retvalue_test.zip

Supporting setting a variable to volatile

To clarify, this would work by changing function signature from e.g. int vmmcall_func() to volatile int vmmcall_func() to disable optimization of return value?

I think that would be a good solution, and would probably work to solve all 3 cases.

A second approach would be to outline the vmmcall instruction

I think this could help for better default behaviour of vmmcall, but wouldn't help for the other 2 cases

@xusheng6
Copy link
Member

I think case 3) is probably not a bug -- not only our decompilation is semantically correct

I agree it is semantically correct, but is not as clear as if the HLIL showed foo(get_global(), "hello").

Current HLIL makes it seem like return value of get_global() is ignored and my_global is referenced directly.

I feel like it is also closer to the disassembly code

The disassembly calls get_global() then uses the return value as an argument to foo. There is no direct reference to my_global.

CleanShot 2024-12-23 at 17 21 11@2x

Here is the binary: retvalue_test.zip

Supporting setting a variable to volatile

To clarify, this would work by changing function signature from e.g. int vmmcall_func() to volatile int vmmcall_func() to disable optimization of return value?

I think that would be a good solution, and would probably work to solve all 3 cases.

A second approach would be to outline the vmmcall instruction

I think this could help for better default behaviour of vmmcall, but wouldn't help for the other 2 cases

  1. I still personally think our current output is more readable with the global example, but I do see your point that there might be cases where we want to see a function call rather than its return value. Maybe a function whose return value can change but our analysis is busted on it? I dunno, I will discuss with the team and see if that can be accommodated
  2. I am not suggesting to set the function as volatile (there is no way to do it in C/C++ as well). Instead, I am suggesting to set the rax variable as volatile to let the analysis know it might change unexpectedly. So we would no longer removing the false branch of if (rax == 0x0)

@xusheng6
Copy link
Member

I am not quite sure I understand the second case -- is it possible to share the binary with us?

@xusheng6 xusheng6 added the State: Awaiting Triage Issue is waiting for more in-depth triage from a developer label Dec 23, 2024
@SlidyBat
Copy link
Author

I am not quite sure I understand the second case -- is it possible to share the binary with us?

Unfortunately I can't share that exact binary, but here is a binary I made that behaves similarly by using a signal handler.

faulthandler_test.zip

Note that in main, the return value of read_user_64 is completely ignored and the puts("Failed") call is hidden.

When an invalid address is passed to read_user_64, the SIGSEGV signal handler will call on_fault, which will return a non-zero error value. Binja doesn't know about this and assumes that read_user_64 always returns 0.

@xusheng6
Copy link
Member

xusheng6 commented Dec 24, 2024

I am not quite sure I understand the second case -- is it possible to share the binary with us?

Unfortunately I can't share that exact binary, but here is a binary I made that behaves similarly by using a signal handler.

faulthandler_test.zip

Note that in main, the return value of read_user_64 is completely ignored and the puts("Failed") call is hidden.

When an invalid address is passed to read_user_64, the SIGSEGV signal handler will call on_fault, which will return a non-zero error value. Binja doesn't know about this and assumes that read_user_64 always returns 0.

Thx for the binary!

To start with, I would like to point out that this issue is NOT limited to return value optimization. I patched the code in read_user_64 to clear the ebx register, and also have the test instruction right after the read_user_64 to test the value of the ebx register. The same behavior is still observed.

Screenshot 2024-12-24 at 10 48 42 AM

Screenshot 2024-12-24 at 10 49 05 AM

faulthandler_test_patched.zip

I tested and found that setting the rbx register as "not clobbered" actually fixes the issue -- not sure if we somehow flipped a condition somewhere in the code base when we check its value

Update: setting rbx to not clobbered only appears to have worked because it tells the analysis that the function does not modify the value of rbx register. And in the caller of read_user_64, rbx happens to have undetermined value. However, if it actually has a different value, e.g., 0x42, it will be used by the test instruction, which is also NOT what we want.

@SlidyBat
Copy link
Author

I've found out that I can actually use UIDF to set the return variable of a function to UndeterminedValue to disable optimization of it.

This works for the vmmcall_func and get_global() cases, but not for read_user_64 for some reason. As a workaround for that one though, I've NOPed out the xor eax, eax instruction to prevent binja from thinking it is set to 0.

@psifertex
Copy link
Member

Yeah, sorry I'd didn't reply earlier: UIDF was going to be my suggestion as the best way to influence cases like that.

The other would be to modify the arch to change the lifting of the instruction to an intrinsic that tainted the appropriate registers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
State: Awaiting Triage Issue is waiting for more in-depth triage from a developer
Projects
None yet
Development

No branches or pull requests

4 participants