Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IDE Solver Strategy #669

Open
wants to merge 16 commits into
base: development
Choose a base branch
from
Open

IDE Solver Strategy #669

wants to merge 16 commits into from

Conversation

fabianbs96
Copy link
Member

Currently, the IDESolver propagates data-flow information to the successor instruction of the current instruction.
While this makes the implementation easier, for users this is sometimes unintuitive, mostly because the Dragonbook defines data-flow propagations differently: The effects of an instruction to the data flows should be visible at that same instruction already.
In addition, some information at branching statements may have already been merged with the current approach; therefore, it may not always be possible to determine the origin basicblock of a particular data-flow fact (see LLVM's invoke instruction).

This PR addresses above issues by adding the following contributions:

  • Implements a variant of the IDESolver that propagates onto the current statement instead of propagating to the successors.
  • Adds a SolverStrategy that allows to switch between both IDE solver versions via tag-dispatch.
  • Out-factors common parts of both IDE solver variants into a common base class IDESolverImpl

@fabianbs96 fabianbs96 added the enhancement New feature or request label Sep 21, 2023
@fabianbs96 fabianbs96 self-assigned this Sep 21, 2023
@fabianbs96 fabianbs96 marked this pull request as ready for review September 24, 2023 11:32
@fabianbs96 fabianbs96 requested a review from MMory as a code owner September 24, 2023 11:32
@yuffon
Copy link

yuffon commented Nov 3, 2023

Hi @fabianbs96, I have an issue with the PropagateOnto strategy.
If I use the default PropagateOver strategy, my typestate code works well.
But when using PropagateOnto strategy, some dataflow facts vanish.

The analyzed code is

#include <stdio.h>
#include <malloc.h>

int use_after_free()
{
    int *p = (int *)malloc(16);
    *p = 1;
    free(p);
    *p = 2;
    return 0;
}

int main(void)
{

    int (* p)() = & use_after_free; 
    int i = p();

    return 0;
}

The IR file is

; ModuleID = 'func-pointer-use-after-free.c'
source_filename = "func-pointer-use-after-free.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @use_after_free() #0 {
entry:
  %p = alloca i32*, align 8
  %call = call noalias i8* @malloc(i64 noundef 16) #2
  %0 = bitcast i8* %call to i32*
  store i32* %0, i32** %p, align 8
  %1 = load i32*, i32** %p, align 8
  store i32 1, i32* %1, align 4
  %2 = load i32*, i32** %p, align 8
  %3 = bitcast i32* %2 to i8*
  call void @free(i8* noundef %3) #2
  %4 = load i32*, i32** %p, align 8
  store i32 2, i32* %4, align 4
  ret i32 0
}

; Function Attrs: nounwind
declare dso_local noalias i8* @malloc(i64 noundef) #1

; Function Attrs: nounwind
declare dso_local void @free(i8* noundef) #1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %p = alloca i32 (...)*, align 8
  %i = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  store i32 (...)* bitcast (i32 ()* @use_after_free to i32 (...)*), i32 (...)** %p, align 8
  %0 = load i32 (...)*, i32 (...)** %p, align 8
  %call = call i32 (...) %0()
  store i32 %call, i32* %i, align 4
  ret i32 0
}

attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #1 = { nounwind "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #2 = { nounwind }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{!"clang version 14.0.6 (https://github.com/llvm/llvm-project.git f28c006a5895fc0e329fe15fead81e37457cb1d1)"}

The dumped results of using PropagateOver strategy is

N: %p = alloca i32*, align 8, !psr.id !4 | ID: 0
------------------------------------------------
	D: Fact :{ Obj:11,state:0 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: %call = call noalias i8* @malloc(i64 noundef 16) #2, !psr.id !5 | ID: 1
--------------------------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:0 } | V: BOTTOM


N: %0 = bitcast i8* %call to i32*, !psr.id !6 | ID: 2
-----------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: store i32* %0, i32** %p, align 8, !psr.id !7 | ID: 3
-------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: %1 = load i32*, i32** %p, align 8, !psr.id !8 | ID: 4
--------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: store i32 1, i32* %1, align 4, !psr.id !9 | ID: 5
----------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: %2 = load i32*, i32** %p, align 8, !psr.id !10 | ID: 6
---------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: %3 = bitcast i32* %2 to i8*, !psr.id !11 | ID: 7
---------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: call void @free(i8* noundef %3) #2, !psr.id !12 | ID: 8
----------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: %4 = load i32*, i32** %p, align 8, !psr.id !13 | ID: 9
---------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:2 } | V: BOTTOM


N: store i32 2, i32* %4, align 4, !psr.id !14 | ID: 10
------------------------------------------------------
	D: Fact :{ Obj:11,state:2 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: ret i32 0, !psr.id !15 | ID: 11
----------------------------------
	D: Fact :{ Obj:11,state:3 } | V: BOTTOM
	D: Λ | V: BOTTOM


============ Results for function 'main' ============


N: %p = alloca i32 (...)*, align 8, !psr.id !17 | ID: 13
--------------------------------------------------------
	D: Fact :{ Obj:11,state:0 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: %i = alloca i32, align 4, !psr.id !18 | ID: 14
-------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:0 } | V: TOP


N: store i32 0, i32* %retval, align 4, !psr.id !19 | ID: 15
-----------------------------------------------------------
	D: Fact :{ Obj:11,state:0 } | V: TOP
	D: Λ | V: TOP


N: store i32 (...)* bitcast (i32 ()* @use_after_free to i32 (...)*), i32 (...)** %p, align 8, !psr.id !20 | ID: 16
------------------------------------------------------------------------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:0 } | V: TOP


N: %0 = load i32 (...)*, i32 (...)** %p, align 8, !psr.id !21 | ID: 17
----------------------------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:0 } | V: TOP


N: %call = call i32 (...) %0(), !psr.id !22 | ID: 18
----------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:0 } | V: BOTTOM


N: store i32 %call, i32* %i, align 4, !psr.id !23 | ID: 19
----------------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:3 } | V: TOP


N: ret i32 0, !psr.id !24 | ID: 20
----------------------------------
	D: Fact :{ Obj:11,state:3 } | V: TOP
	D: Λ | V: TOP

Everything is OK up to now.
But with the propagateOnto strategy, the dumped results are


N: %p = alloca i32*, align 8, !psr.id !4 | ID: 0
------------------------------------------------
	D: Fact :{ Obj:11,state:0 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: %call = call noalias i8* @malloc(i64 noundef 16) #2, !psr.id !5 | ID: 1
--------------------------------------------------------------------------
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: %0 = bitcast i8* %call to i32*, !psr.id !6 | ID: 2
-----------------------------------------------------
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: store i32* %0, i32** %p, align 8, !psr.id !7 | ID: 3
-------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: %1 = load i32*, i32** %p, align 8, !psr.id !8 | ID: 4
--------------------------------------------------------
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: store i32 1, i32* %1, align 4, !psr.id !9 | ID: 5
----------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM


N: %2 = load i32*, i32** %p, align 8, !psr.id !10 | ID: 6
---------------------------------------------------------
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: %3 = bitcast i32* %2 to i8*, !psr.id !11 | ID: 7
---------------------------------------------------
	D: Fact :{ Obj:11,state:1 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: call void @free(i8* noundef %3) #2, !psr.id !12 | ID: 8
----------------------------------------------------------
	D: Λ | V: BOTTOM
	D: Fact :{ Obj:11,state:2 } | V: BOTTOM


N: %4 = load i32*, i32** %p, align 8, !psr.id !13 | ID: 9
---------------------------------------------------------
	D: Fact :{ Obj:11,state:2 } | V: BOTTOM
	D: Λ | V: BOTTOM


N: store i32 2, i32* %4, align 4, !psr.id !14 | ID: 10
------------------------------------------------------
	D: Fact :{ Obj:11,state:3 } | V: BOTTOM
	D: Λ | V: BOTTOM


============ Results for function 'main' ============


N: %p = alloca i32 (...)*, align 8, !psr.id !17 | ID: 13
--------------------------------------------------------
	D: Fact :{ Obj:11,state:0 } | V: TOP
	D: Λ | V: TOP


N: %i = alloca i32, align 4, !psr.id !18 | ID: 14
-------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:0 } | V: TOP


N: store i32 0, i32* %retval, align 4, !psr.id !19 | ID: 15
-----------------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:0 } | V: TOP


N: store i32 (...)* bitcast (i32 ()* @use_after_free to i32 (...)*), i32 (...)** %p, align 8, !psr.id !20 | ID: 16
------------------------------------------------------------------------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:0 } | V: TOP


N: %0 = load i32 (...)*, i32 (...)** %p, align 8, !psr.id !21 | ID: 17
----------------------------------------------------------------------
	D: Λ | V: TOP
	D: Fact :{ Obj:11,state:0 } | V: TOP


N: %call = call i32 (...) %0(), !psr.id !22 | ID: 18
----------------------------------------------------
	D: Λ | V: TOP


N: store i32 %call, i32* %i, align 4, !psr.id !23 | ID: 19
----------------------------------------------------------
	D: Λ | V: TOP

As shown above, the last two instructions after calling the function have no dataflow facts.
It seems that dataflow facts do not come out from the invoked function.
In fact, I use identity flow function for all return edges.
Besides, the default strategy works well, so it may not be the fault of flow functions.

@fabianbs96
Copy link
Member Author

fabianbs96 commented Dec 13, 2023

Hi @yuffon,

I have debugged your issue. There was indeed a bug in the PropagateOntoStrategy implementation. Can you check, whether the current version works?

In fact, I use identity flow function for all return edges.

Besides that, it is not a good idea to skip fact-mapping in the call-flow-function and in the ret-flow-function. This way you will not get context-sensitive results at the end

@yuffon
Copy link

yuffon commented Dec 14, 2023 via email

@yuffon
Copy link

yuffon commented Jul 2, 2024

Hi @yuffon,

I have debugged your issue. There was indeed a bug in the PropagateOntoStrategy implementation. Can you check, whether the current version works?

In fact, I use identity flow function for all return edges.

Besides that, it is not a good idea to skip fact-mapping in the call-flow-function and in the ret-flow-function. This way you will not get context-sensitive results at the end

Hi Fabian,
Recently, I meet one issue when using PropagateOntoStrategy for backward analysis.

My backward dataflow analysis is a specialized typestate analysis. Flow functions are as follows.

call-to-return: killall function
call flow: identity function
ret flow: identity function
summary and normal flow: advance automata.

The program structure is like the following.


//point 1
if(!callSomeFunction()){
    ......
}


void callSomeFunction(){
    //point 2
}

I have checked the dataflow results. At point 2, data flow facts are correct. But point 1 has no data flow facts. Since this is backward analysis. It seems like getRetFlowFunction or getCallFlowFunction kills all facts.
But I use identity function for both getRetFlowFunction or getCallFlowFunction.
I don't know whether this is a bug.
By the way, it seems that f-IDESolverStrategy has not been merged in to main branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants