Skip to content

Commit

Permalink
Merge branch 'main' of github.com:ucsb-seclab/greed into main
Browse files Browse the repository at this point in the history
  • Loading branch information
ruaronicola committed Dec 20, 2023
2 parents 3babfe0 + 0581c25 commit 31b2da2
Show file tree
Hide file tree
Showing 21 changed files with 582 additions and 103 deletions.
1 change: 1 addition & 0 deletions docs/docs/advices.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
# 🧐 A Few Advices
2 changes: 1 addition & 1 deletion docs/docs/core.md
Original file line number Diff line number Diff line change
Expand Up @@ -164,4 +164,4 @@ The default solver employed by greed is [Yices2](https://github.com/SRI-CSL/yice
We made this decision based on the impressive results obtained at the latest SMT solving competition and the results showed by Frank et al. in [ETHBMC](https://www.usenix.org/system/files/sec20fall_frank_prepub_0.pdf).
That being said, greed offers a modular architecure and implementing support for a new solver backend is quite straigthforward.

<< TALK ABOUT THE TIMEOUT HERE >>
By default, we give unlimited time to the solver to solve the constraints. However, it can be sometimes helpful to just timeout after a certain amount of time. You can do this by using the option `SOLVER_TIMEOUT`. Keep in mind that, when the timeout fires, the state will officially be reported as `unsat`.
2 changes: 1 addition & 1 deletion docs/docs/credits.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

greed is currently mainly maintained by [degrigis](https://github.com/degrigis) and [ruaronicola](https://github.com/ruaronicola) and the following contributors:

![contributors-img](https://contrib.rocks/image?repo=ucsb-seclab/heapster)
![contributors-img](https://contrib.rocks/image?repo=ucsb-seclab/greed)

While we are available for discussions and bug fixes (via the Issues on GitHub), at the time of writing we have limited resources to provide any kind of full-time support or feature requests. Thus, we provide the tool as it is.
Everything outside of the scope of our personal research agenda needs to be implemented by the interested party.
Expand Down
147 changes: 144 additions & 3 deletions docs/docs/examples.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
# 🐣 Examples

## (1) Reaching an arbitrary CALL statement
Here we show some examples to demonstrate the usage of some of the greed APIs and how to levereage them to accomplish a few interesting tasks.

## (1) Reachibility of a CALL statement

In this example, we show how to automatically synthetize the `CALLDATA` to reach an arbitrary `CALL` statement in an unverified contract `0x204Db9Ca00912c0F9e6380342113f6A0147E6f8C` on chain.
This example demonstrates the usage of some of the greed APIs and how to use them to accomplish the task.

First, you should download the [contract bytecode](https://library.dedaub.com/ethereum/address/0x204db9ca00912c0f9e6380342113f6a0147e6f8c/bytecode) and place it in a `contract.hex` file:

Expand Down Expand Up @@ -70,6 +71,9 @@ def main():
config_greed()

block_ref = 16489409

# This is the PC of the target call
#(according to the contract.tac file in the Gigahorse analysis folder )
call_pc = "0x1f7"

# Create the greed project
Expand Down Expand Up @@ -151,4 +155,141 @@ if __name__ == "__main__":



## (2) Multi-transactions analysis
## (2) Mint a PudgyPenguin

In this example we show how one can synthetize the `CALLDATA` and the `CALLVALUE` necessary to mint() a PudgyPenguin(🐧) in the contract at `0xBd3531dA5CF5857e7CfAA92426877b022e612cf8`.

After you analyzed the [contract](https://library.dedaub.com/ethereum/address/0xbd3531da5cf5857e7cfaa92426877b022e612cf8/bytecode) as explained in the previous example, you can use the following script:

```python

import web3
import logging

from greed import Project, options
from greed.exploration_techniques import ExplorationTechnique, DirectedSearch, HeartBeat, Prioritizer, DFS
from greed.utils.extra import gen_exec_id
from greed.solver.shortcuts import *

LOGGING_FORMAT = "%(levelname)s | %(message)s"
logging.basicConfig(level=logging.INFO, format=LOGGING_FORMAT)
log = logging.getLogger("example")
log.setLevel(logging.INFO)

def config_greed():
options.GREEDY_SHA = True
options.LAZY_SOLVES = False
options.STATE_INSPECT = True
options.MAX_SHA_SIZE = 300
options.OPTIMISTIC_CALL_RESULTS = True
options.DEFAULT_EXTCODESIZE = True
options.DEFAULT_CREATE2_RESULT_ADDRESS = True
options.DEFAULT_CREATE_RESULT_ADDRESS = True
options.MATH_CONCRETIZE_SYMBOLIC_EXP_EXP = True
options.MATH_CONCRETIZE_SYMBOLIC_EXP_BASE = True

def main():

config_greed()

# 4 bytes of the mint() function
# 0 --> 3
calldata = "0x40c10f19"
block_ref = 12878195

# Create the greed project
proj = Project(target_dir="./contracts/0xBd3531dA5CF5857e7CfAA92426877b022e612cf8/")

# this is the pc of the STOP opcode in the mint function
STOP = "0x43f"

stop_stmt = proj.statement_at[STOP]

block_info = proj.w3.eth.get_block(block_ref)

# Let's set the CALLER to my account
init_ctx = {
"CALLDATA": calldata,
"CALLER": "0x6b6Ae9eDA977833B379f4b49655124b4e5c64086",
"ORIGIN": "0x6b6Ae9eDA977833B379f4b49655124b4e5c64086",
"ADDRESS": "0xBd3531dA5CF5857e7CfAA92426877b022e612cf8",
"NUMBER": block_info.number,
"DIFFICULTY": block_info["totalDifficulty"],
"TIMESTAMP": block_info["timestamp"]
}

xid = gen_exec_id()

# Create the entry state
entry_state = proj.factory.entry_state(
xid=xid,
init_ctx=init_ctx,
max_calldatasize=68,
partial_concrete_storage=True
)

# The second argument of mint is the "amount" of penguins to mint, we want that to be non-zero!
entry_state.add_constraint(NotEqual(entry_state.calldata[BVV(67,256)], BVV(0,8)))

# Set a constraint on the CALLVALUE
entry_state.add_constraint(BV_ULE(entry_state.ctx['CALLVALUE'], BVV(0x6000000000000000, 256)))

# When a Penguin is minted, we see a LOG4, let's setup an inspection
# point and show a message!
def hi(simgr, state):
log.debug(f'Emitted LOG4 at {state.curr_stmt.id}!')
entry_state.inspect.stop_at_stmt(stmt_name="LOG4", func=hi)

# Setting up the simulation manager
simgr = proj.factory.simgr(entry_state=entry_state)

directed_search = DirectedSearch(stop_stmt)
simgr.use_technique(directed_search)

prioritizer = Prioritizer(scoring_function=lambda s: -s.globals['directed_search_distance'])
simgr.use_technique(prioritizer)

heartbeat = HeartBeat(beat_interval=100, show_op=True)
simgr.use_technique(heartbeat)

print(f" Symbolically executing...")

while True:
try:
simgr.run(find=lambda s: s.curr_stmt.id == stop_stmt.id)
except Exception as e:
print(e)
continue

if len(simgr.found) == 1:
print(f" ✅ Found state for {stop_stmt.__internal_name__} at {stop_stmt.id}!")
state = simgr.one_found

# Fix the shas!
if len(state.sha_observed) > 0:
shas = state.sha_resolver.fix_shas()
print(f'Fixed {len(shas)} shas in the state!')

# Get a solution for the CALLDATA
calldata_sol = state.solver.eval_memory(state.calldata, length=BVV(68,256), raw=True)

# Get a solution for CALLVALUE (i.e., how much we paid for a penguin)
# (Note: Yices2 does not expose a min() function, but you can find the minimum value
# by using a bisection search)
callvalue = state.solver.eval(state.ctx['CALLVALUE'])

print(f" 📥 Calldata: {hex(bv_unsigned_value(calldata_sol))}")
print(f" 💸 Callvalue: {callvalue}")

break

elif len(simgr.found) == 0:
print(f" ❌ No state found for {stop_stmt.__internal_name__} at {stop_stmt.id}!")
break



if __name__ == "__main__":
main()

```
30 changes: 26 additions & 4 deletions docs/docs/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,42 +3,64 @@
# ?FAQ


## Why is it names greed?
## Why is it named greed?

We all know why people are so interested in finding vulnerabilities in smart contracts..., right? ;)

## How should "greed" be stylized?

Simple: greed, no upper case.

## Why didn't you just implement everything on top of angr?

While implementing everything on top of angr could have been an option, some of the fundamentals of EVM analysis are quite different compared to classic binaries (ARM, x86, AMD64, etc...).
We found out that building a system from scratch was actually easier rather than implementing what we needed in a complex software like angr.
Moreover, we wanted to re-use the high-quality analysis (e.g., CFG) and IR offered by Gigahorse, thus, starting from square one was an easier option rather than fighting with the angr codebase.


## Does greed support non-EVM contracts?

greed automatically supports whatever is supported by Gigahorse. Currently, Gigahorse does not support any non-EVM binary, thus, greed does not.

## Can greed work with contracts source code?

Yes. Just compile the contract, decompile it with Gigahorse et voila!


## Does greed work with Vyper contracts?

Yes. Just compile the contract, decompile it with Gigahorse et voila! ;)

## Why my symbolic execution is stuck?!

Symbolic execution is not easy, there are a lot of moving pieces to make this work.
Usually the main reason is there are too many states (i.e., state explosion), or the constraints became so complex that even state-of-the-art solvers cannot do much in solving them.
One of the main strategy to succeed with symbolic execution is to concretize as much as you can and concentrate the symbolic analysis to very well defined tasks.

In other words: hoping to just symbolically execute a contract to enumerate all the vulnerabilities is generally a lost cause.

## I have a problem, how can I get in touch with greed developers?

Just open an [issue](https://github.com/ucsb-seclab/greed/issues) on our GitHub! :)

## How do I know the correspondence between TAC Statement and original opcode?

While this is not always 100% precise, Gigahorse outputs a file called `TAC_Statement_OriginalStatement.csv`. You can find the corresponence between the TAC Statement id and the original PC (bytecode) in there.

## What is the main difference between greed and Mythril?

Mythril is a symbolic analysis tool for the detection of specific classes of vulnerabilities in smart contracts. Converesely, greed is a more flexible smart contract analysis framework that let analysts build any kind of analysis using the high-quality results offered by Gigahorse. Any analysis implemented in Mythril can be built on top of greed, the viceversa is not true.

## What is the main difference between greed and Tether?
## What is the main difference between greed and teEther?

First, teEther is a symbolic executor based on the stack-based EVM opcodes, greed is based on the TAC IR (register-based). Second, we found teEther CFG reconstruction algorithm to be quite outdated when unleashed against modern contracts. Overall, we believe greed offers a more flexible design for building efficient research prototypes.

## What is the main difference between greed and ETHBMC?


Similary to teEther, ETHBMC is based on the stack-based EVM opcodes. Moreover, we found hacking into ETHBMC to be quite cumbersome compared to the Python-based greed APIs. One could argue that ETHBMC is faster than greed (due to its Rust backend), but when comparing the tools against the same conditions, we found this implementation different to be negligable when compared to other issues in symbolic execution.

## Why did you choose to build a symbolic executor on TAC?


We believe register based analyses to be more intuitive and simple to implement. Moreover, Gigahorse gets rid of unnecessary complexity related to the stack-based nature of the EVM that really improve the usability of our tool.
Finally, having a common IR is useful because if the Dedaub team will support a new non-EVM based architecture, we can automatically support it in greed :)

Binary file added docs/docs/imgs/logo.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
3 changes: 3 additions & 0 deletions docs/docs/index.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
# 🚀 Welcome to greed's documentation!
[![License](https://img.shields.io/github/license/Ileriayo/markdown-badges?style=for-the-badge)](https://github.com/ucsb-seclab/greed/blob/main/LICENSE) [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/ucsb-seclab/greed) ![Python3](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)


<center><img src="./imgs/logo.png" width="150" height="150" /></center>

Welcome to greed's documentation! This documentation is intended to be a quick guide to use greed: our symbolic execution engine for EVM smart contract binaries.

If greed makes you think of [angr](https://github.com/angr), is because we are actually from the same research group ([UCSB SecLab](https://seclab.cs.ucsb.edu/)).
Expand Down
Loading

0 comments on commit 31b2da2

Please sign in to comment.