Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix execution paths generated by group-size detector #160

Open
S3v3ru5 opened this issue Feb 22, 2023 · 1 comment
Open

Fix execution paths generated by group-size detector #160

S3v3ru5 opened this issue Feb 22, 2023 · 1 comment

Comments

@S3v3ru5
Copy link
Contributor

S3v3ru5 commented Feb 22, 2023

detect_missing_tx_field_validations and search_paths already address the issues with the generation of execution paths.

def detect_missing_tx_field_validations(
entry_block: "BasicBlock", checks_field: Callable[["BlockTransactionContext"], bool]
) -> List[List["BasicBlock"]]:
"""search for execution paths lacking validations on transaction fields.
`tealer.analyses.dataflow` calculates possible values a transaction field can have to successfully complete execution.
Information is computed at block level. Each block's context(`BlockTransactionContext`)
will have information on possible values for a transaction field such that
- Execution reaches to the first instruction of block `B`
- Execution reaches exit instruction of block `B` without failing.
The goal of current detectors is to find if a given transaction field(s) can have a certain vulnerable value(s).
e.g: is_updatable detector checks if the transaction can be UpdateApplication type transaction. Here, the vulnerable value for
transaction type is "UpdateApplication". if "UpdateApplication" is a possible value then the contract is considered vulnerable and is reported.
In order to find whether the given contract is vulnerable, it is sufficient to check the leaf blocks: whether transaction field can
have target value and execution reaches end of a leaf block.
However, The current output format requires "a execution path" from entry block to leaf block. And when a leaf block is vulnerable,
it does **not** mean that all execution paths ending at that leaf block are vulnerable. It only means that atleast one path ending at
that leaf block is vulnerable.
For this reason, a traversal over the CFG is performed to enumerate execution paths. And for each block in the path it is checked
whether transaction field(s) can have the target vulnerable value.
Args:
entry_block: entry basic block of the CFG
checks_field: Given a block context, should return True if the target field cannot have the vulnerable value
or else False.
e.g: For is_updatable detector, vulnerable value is `UpdateApplication`.
if `block_ctx.transaction_types` can have `UpdateApplication`, this method will
return false or else returns True.
Returns:
Returns a list of vulnerable execution paths: none of the blocks in the path check the fields.
"""
def search_paths(
bb: "BasicBlock",
current_path: List["BasicBlock"],
paths_without_check: List[List["BasicBlock"]],
current_call_stack: List[Tuple[Optional["BasicBlock"], "Subroutine"]],
current_subroutine_executed: List[List["BasicBlock"]],
) -> None:
"""
Args:
current_call_stack: list of callsub blocks and called subroutine along the current path.
e.g current_callsub_blocks = [(Bi, S1), (Bj, S2), (Bk, S3), ...]
=> 1. Bi, Bj, Bk, .. all end with callsub instruction.
2. Bi called subroutine S1. S1 contains block Bj which called subroutine S2.
S2 contains Bk which calls S3.
S1 calls S2, S2 calls S3, ...
current_subroutine_executed: list of subroutine basic blocks that were executed in the path.
e.g if call stack is [(Bi, S1), (Bj, S2), (Bk, S3)]
1. current_subroutine_executed[0] contains basic blocks of S1 that were executed before call to S2.
2. current_subroutine_executed[1] contains basic blocks of S2 that were executed before call to S3.
3. current_subroutine_executed[2] contains basic blocks of S3 that were executed upto this call.
Main purpose of this argument is to identify loops. A loop contains a back edge to a previously executed block.
if presence of loop is checked using `return True if bb in current_path else False`, detector misses paths that
does not contain a loop.
Mainly, when a subroutine is called twice in a path, the second call is treated as a back edge because the subroutine's blocks
will be present in the current path (added in the first call.)
e.g
```
callsub Sa // B0
callsub Sa // B1
int 1 // B2
return
Sa: // B3
callsub Sb
retsub // B4
Sb: // B5
retsub
```
before executing B1, current_path = [B0, B3, B5, B4], bb = B1
after executing B1, current_path = [B0, B3, B5, B4, B1], bb = B3
B3 is already present in the current_path and so is treated as a loop even though it's not.
In order to identify loops properly, basic blocks that are executed as part of the current subroutines are tracked.
For above example,
At the start of the function
bb = B0, current_path = [], current_call_stack = [(None, Main)], current_subroutine_executed = [[]]
At the end of the function before calling search_paths again
bb = B0, current_path = [B0], current_call_stack = [(None, Main)], current_subroutine_executed = [[B0]]
# B0 calls Sa. Calling a subroutine adds a item to current_call_stack and current_subroutine_executed.
Start
bb = B3, current_path = [B0], current_call_stack = [(None, Main), (B0, Sa)], current_subroutine_executed = [[B0], []]
End
bb = B3, current_path = [B0, B3], current_call_stack = [(None, Main), (B0, Sa)], current_subroutine_executed = [[B0], [B3]]
# B3 calls Sb. Calling a subroutine adds a item to current_call_stack and current_subroutine_executed.
Start
bb = B5, current_path = [B0, B3], current_call_stack = [(None, Main), (B0, Sa), (B3, Sb)], current_subroutine_executed = [[B0], [B3], []]
End
bb = B5, current_path = [B0, B3, B5], current_call_stack = [(None, Main), (B0, Sa), (B3, Sb)], current_subroutine_executed = [[B0], [B3], [B5]]
# B5 returns the execution from Sb. Returning from a subroutine removes a item from current_call_stack, current_subroutine_executed.
Start
bb = B4, current_path = [B0, B3, B5], current_call_stack = [(None, Main), (B0, Sa)], current_subroutine_executed = [[B0], [B3]]
End
bb = B4, current_path = [B0, B3, B5, B4], current_call_stack = [(None, Main), (B0, Sa)], current_subroutine_executed = [[B0], [B3, B4]]
Start
bb = B1, current_path = [B0, B3, B5, B4], current_call_stack = [(None, Main)], current_subroutine_executed = [[B0]]
End
bb = B1, current_path = [B0, B3, B5, B4, B1], current_call_stack = [(None, Main)], current_subroutine_executed = [[B0, B1]]
....
"""
# check for loops
if bb in current_subroutine_executed[-1]:
logger_detectors.debug(
f"Loop Path: current_full_path = {current_path}\n, current_subroutine_executed = {current_subroutine_executed[-1]}, current_block: {repr(bb)}"
)
return
if validated_in_block(bb, checks_field):
logger_detectors.debug(
f"Validated Path: current_full_path = {current_path}\n, current_block: {repr(bb)}"
)
return
current_path = current_path + [bb]
# leaf block
if len(bb.next) == 0:
logger_detectors.debug(f"Vulnerable Path: current_full_path = {current_path}")
paths_without_check.append(current_path)
return
# do we need to make a copy of lists in [:-1]???
current_subroutine_executed = current_subroutine_executed[:-1] + [
current_subroutine_executed[-1] + [bb]
]
if isinstance(bb.exit_instr, Callsub):
teal = bb.teal
assert teal is not None
called_subroutine = teal.subroutines[bb.exit_instr.label]
# check for recursion
already_called_subroutines = [frame[1] for frame in current_call_stack]
if called_subroutine in already_called_subroutines:
# recursion
return
current_call_stack = current_call_stack + [(bb, called_subroutine)]
current_subroutine_executed = current_subroutine_executed + [[]]
if isinstance(bb.exit_instr, Retsub):
# if this block is retsub then it returns execution to the return
# point of callsub block. return point is the next instruction after the callsub
# instruction.
(callsub_block, _) = current_call_stack[-1]
assert callsub_block is not None
return_point = callsub_block.sub_return_point
if return_point is not None and return_point in bb.next:
search_paths(
return_point,
current_path,
paths_without_check,
current_call_stack[:-1], # returning from a subroutine
current_subroutine_executed[:-1],
)
else:
for next_bb in bb.next:
search_paths(
next_bb,
current_path,
paths_without_check,
current_call_stack,
current_subroutine_executed,
)
teal = entry_block.teal
assert teal is not None
paths_without_check: List[List["BasicBlock"]] = []
search_paths(entry_block, [], paths_without_check, [(None, teal.main)], [[]])
return paths_without_check

group-size detector implements a function to generate execution paths. This function does not address the recently uncovered issues.

def _check_groupsize(
self,
bb: BasicBlock,
current_path: List[BasicBlock],
paths_without_check: List[List[BasicBlock]],
used_abs_index: bool,
) -> None:
"""Find execution paths with missing GroupSize check.
This function recursively explores the Control Flow Graph(CFG) of the
contract and reports execution paths with missing GroupSize
check.
This function is "in place", modifies arguments with the data it is
supposed to return.
Args:
bb: Current basic block being checked(whose execution is simulated.)
current_path: Current execution path being explored.
paths_without_check:
Execution paths with missing GroupSize check. This is a
"in place" argument. Vulnerable paths found by this function are
appended to this list.
used_abs_index: Should be True if absolute index in `current_path`.
"""
# check for loops
if bb in current_path:
return
if MAX_GROUP_SIZE not in bb.transaction_context.group_sizes:
# GroupSize is checked
return
if not used_abs_index:
used_abs_index = self._accessed_using_absolute_index(bb)
current_path = current_path + [bb]
if not bb.next:
# leaf block
if used_abs_index:
# accessed a field using absolute index in this path.
paths_without_check.append(current_path)
return
for next_bb in bb.next:
self._check_groupsize(next_bb, current_path, paths_without_check, used_abs_index)

The group-size could be updated to use the detect_missing_tx_field_validations

"""Detector for finding execution paths missing GroupSize check."""

from typing import List, TYPE_CHECKING
from functools import lru_cache

from tealer.detectors.abstract_detector import (
    AbstractDetector,
    DetectorClassification,
    DetectorType,
)
from tealer.teal.basic_blocks import BasicBlock
from tealer.teal.instructions.instructions import (
    Gtxn,
    Gtxna,
    Gtxnas,
    Gtxns,
    Gtxnsa,
    Gtxnsas,
)
from tealer.teal.teal import Teal

from tealer.utils.algorand_constants import MAX_GROUP_SIZE
from tealer.utils.analyses import is_int_push_ins
from tealer.analyses.utils.stack_ast_builder import construct_stack_ast, UnknownStackValue
from tealer.detectors.utils import (
    detect_missing_tx_field_validations,
    detector_terminal_description
)

if TYPE_CHECKING:
    from tealer.utils.output import SupportedOutput
    from tealer.teal.instructions.instructions import Instruction
    from tealer.teal.context.block_transaction_context import BlockTransactionContext


class MissingGroupSize(AbstractDetector):  # pylint: disable=too-few-public-methods
   # ...

    @lru_cache(maxsize=None)
    @staticmethod
    def _accessed_using_absolute_index(bb: BasicBlock) -> bool:
        """Return True if a instruction in bb access a field using absolute index

        a. gtxn t f, gtxna t f i, gtxnas t f,
        b. gtxns f, gtxnsa f i, gtxnsas f

        Instructions in (a) take transaction index as a immediate argument.
        Return True if bb contains any one of those instructions.

        Instructions in (b) take transaction index from the stack.
        `gtxns f` and `gtxnsa f i` take only one argument and it is the transaction index.
        `gtxnsas f` takes two arguments and transaction index is the first argument.
        Return True if the transaction index is pushed by an int instruction.
        """
        stack_gtxns_ins: List["Instruction"] = []
        for ins in bb.instructions:
            if isinstance(ins, (Gtxn, Gtxna, Gtxnas)):
                return True
            if isinstance(ins, (Gtxns, Gtxnsa, Gtxnsas)):
                stack_gtxns_ins.append(ins)
        if not stack_gtxns_ins:
            return False
        ast_values = construct_stack_ast(bb)
        for ins in stack_gtxns_ins:
            index_value = ast_values[ins].args[0]
            if isinstance(index_value, UnknownStackValue):
                continue
            is_int, _ = is_int_push_ins(index_value.instruction)
            if is_int:
                return True
        return False


    def detect(self) -> "SupportedOutput":
        """Detect execution paths with missing GroupSize check.

        Returns:
            ExecutionPaths instance containing the list of vulnerable execution
            paths along with name, check, impact, confidence and other detector
            information.
        """

        def checks_group_size(block_ctx: "BlockTransactionContext") -> bool:
            # return True if group-size is checked in the path
            # otherwise return False
            return MAX_GROUP_SIZE not in block_ctx.group_sizes
        
        paths_without_check: List[List[BasicBlock]] = detect_missing_tx_field_validations(
            self.teal.bbs[0], checks_group_size
        )

        # paths_without_check contain all the execution paths not checking the GroupSize
        # Only report paths which also use absolute_index

        paths_to_report: List[List[BasicBlock]] = []
        for path in paths_without_check:
            for bb in path:
                if self._accessed_using_absolute_index(bb):
                    paths_to_report.append(path)
                    break

        description = detector_terminal_description(self)
        filename = "missing_group_size"

        results = self.generate_result(paths_to_report, description, filename)
        construct_stack_ast.cache_clear()
        return results
@S3v3ru5
Copy link
Contributor Author

S3v3ru5 commented Feb 22, 2023

Above approach works as follows:

  1. Generate paths lacking group-size check
  2. Filter the result paths which uses the absolute index.

May be it's better to do the second step in the detect_missing_tx_field_validations itself for now.

Add an additional parameter to detect_missing_tx_field_validations.

 def detect_missing_tx_field_validations( 
     entry_block: "BasicBlock", checks_field: Callable[["BlockTransactionContext"], bool], satisfies_report_condition: Callable[[List["BasicBlock"], bool],
 ) -> List[List["BasicBlock"]]:

which takes a path and returns True if that path is considered vulnerable or else returns False.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant