Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New RZIL integration #1663

Merged
merged 8 commits into from
Oct 9, 2021
Merged

New RZIL integration #1663

merged 8 commits into from
Oct 9, 2021

Conversation

Heersin
Copy link
Member

@Heersin Heersin commented Sep 13, 2021

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the documentation and the rizin book with the relevant information (if needed)

Detailed description

  1. New IL as an independent library which relies on rz_util only.
  2. IL in core analysis are moved to core/cil.c. (Related pr of moving esil -> Reorganize esil directory #1360 )
  3. Change the tech of collecting stats and trace in the old ESIL
  4. Fix some bugs in the old pr New Rizin IL integration #1361 New Rizin IL integration #1451

Test plan

Should pass the old rizin tests.

Closing issues

closes #277

More Info
0. Use tree-like approach

  1. This pr created for asan build to detect bug.
  2. Only enabled in analysis_bf.c.
  3. Register arena changes in other module (such as debug) are not covered in current new IL.

@Heersin
Copy link
Member Author

Heersin commented Sep 13, 2021

The trace and stats techs of the old index approach replies on temporary value. Now it has been removed. We need to implement the trace/stats in the rzil. It aims to collect info reg/mem read/write.

Currently, the API of opcode evaluation is :
rzil_parse_root(RzILVM *vm, RzILOp *op);

it's a recursive function to evaluate the expression. so we may add an extra arg as Callback function
rzil_parse_root(RzILVM *vm, RzILOp *op, ParseCallbacks f);

the callback function ccould be

rzil_focus_trace (RzILOp *op)
 - mem read
 - mem write
 - reg read
 - reg write

etc.

but what I worry about is the possible header conflict.
Since our new trace info structure defined in higher level of rizin project while the rzil can only rely on rz_util

we may not be able to directly use the function and trace info struct in rz_analysis.h

  1. Defined a temporary struct to store the collected info ?
  2. Implement the similar parse/emulate function in the same level or higher one. As esil did.

That's two possible solutions I have for now. Do you have any idea ?

@XVilka
Copy link
Member

XVilka commented Sep 13, 2021

@Heersin I think adding some new structure is a way to go. Not sure what @ret2libc @thestr4ng3r @wargio think.

Copy link
Member

@wargio wargio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good with these new changes.

core->dbg->trace = rz_debug_trace_new();
core->analysis->esil->trace = rz_analysis_esil_trace_new(core->analysis->esil);
core->analysis->rzil->trace = rz_analysis_rzil_trace_new(core->analysis, core->analysis->rzil);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you have core->analysis why passing core->analysis->rzil ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it passes these 2 because I followed the ESIL api. We can only maintain one argument core->analysis.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, lets skip and use just core->analysis for both.

Comment on lines +469 to +467
ut8 code[32];
// analysis current data to trigger rzil_set_op_code
(void)rz_io_read_at_mapped(core->io, addr, code, sizeof(code));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you are ignoring the rz_io_read_at_mapped return value, maybe it should be better to initialize code to 0 or any other value that represent NOP or invalid op.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend not to ignore the value and do the function exit on the error

librz/il/definitions/bag.c Outdated Show resolved Hide resolved
librz/il/definitions/bool.c Outdated Show resolved Hide resolved
librz/il/definitions/effect.c Outdated Show resolved Hide resolved
librz/il/theory_effect.c Outdated Show resolved Hide resolved
librz/il/theory_init.c Outdated Show resolved Hide resolved
librz/il/vm_layer.c Outdated Show resolved Hide resolved
librz/il/vm_layer.c Outdated Show resolved Hide resolved
librz/include/rz_analysis.h Show resolved Hide resolved
@ret2libc
Copy link
Member

The trace and stats techs of the old index approach replies on temporary value. Now it has been removed. We need to implement the trace/stats in the rzil. It aims to collect info reg/mem read/write.

Currently, the API of opcode evaluation is :
rzil_parse_root(RzILVM *vm, RzILOp *op);

it's a recursive function to evaluate the expression. so we may add an extra arg as Callback function
rzil_parse_root(RzILVM *vm, RzILOp *op, ParseCallbacks f);

I'm not 100% sure I understood the problem, but I think having a callback that takes the ILOp as argument (maybe also the RzILVM as well) should be enough to support all kinds of traversal of the OPs without putting anything specific in the RzIL module.

@Heersin Heersin force-pushed the asan_integrate_new_il branch from 32f7093 to f0fc2f0 Compare September 15, 2021 13:51
Copy link
Member

@XVilka XVilka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall in a good shape now. Only a few minor nitpicks left, and should be ready to merge. We can improve the implementation further after merging, IMHO.

librz/il/vm_layer.c Show resolved Hide resolved
librz/il/rzil_vm.c Outdated Show resolved Hide resolved
librz/il/rzil_vm.c Outdated Show resolved Hide resolved
librz/il/rzil_vm.c Outdated Show resolved Hide resolved
librz/il/rzil_vm.c Outdated Show resolved Hide resolved
@Heersin Heersin force-pushed the asan_integrate_new_il branch from f0fc2f0 to 6f1372a Compare September 16, 2021 16:18
@XVilka
Copy link
Member

XVilka commented Sep 24, 2021

@Heersin could you please rebase on the latest dev and fix some unaddressed comments - there is a conflict now?

@Heersin Heersin force-pushed the asan_integrate_new_il branch from 6f1372a to 7da8ec3 Compare September 24, 2021 06:09
librz/core/cmd_analysis.c Show resolved Hide resolved
*/
struct rzil_op_int_t {
ut32 length; ///< s -- sort(type), length of bitvector
int value; ///< x -- value of bitvector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this rather just contain a single RzILBitVector and be called something like RzILOpBitVector?


// Memory
RZIL_OP_LOAD,
RZIL_OP_STORE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't RZIL_OP_LOAD actually be a bit vector op and RZIL_OP_STORE be an effect?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the operand and effect are separated in this case, see the original implementation:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, where are the operands in this link?

librz/include/rz_il/rzil_opcodes.h Outdated Show resolved Hide resolved
@ret2libc
Copy link
Member

There's a lot of work here, nice job. I don't want to be tooo picky here, as otherwise we'll never merge this. I think there are however just few more things to adjust, in particular the void * thing as mentioned also by @thestr4ng3r .

librz/analysis/rzil/rzil.c Show resolved Hide resolved
librz/analysis/rzil/rzil.c Outdated Show resolved Hide resolved
librz/core/cil.c Outdated Show resolved Hide resolved
librz/core/cmd_analysis.c Outdated Show resolved Hide resolved
librz/il/vm_layer.c Outdated Show resolved Hide resolved
librz/il/rzil_vm.c Show resolved Hide resolved
librz/include/rz_il/rzil_vm.h Outdated Show resolved Hide resolved
librz/il/rzil_vm.c Outdated Show resolved Hide resolved
librz/include/rz_il/rzil_vm.h Outdated Show resolved Hide resolved
librz/include/rz_il/definitions/value.h Outdated Show resolved Hide resolved
@XVilka
Copy link
Member

XVilka commented Oct 3, 2021

@Heersin could you please rebase your PR and address the feedback?

@Heersin
Copy link
Member Author

Heersin commented Oct 5, 2021

@Heersin could you please rebase your PR and address the feedback?

Sure, I will review it tonight

@Heersin Heersin force-pushed the asan_integrate_new_il branch from 7da8ec3 to 343cc78 Compare October 8, 2021 13:19
Copy link
Member

@XVilka XVilka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Heersin you forgot 6 more feedback comments, please take a look again. Apart from that, it's good to merge, I think. cc @ret2libc @thestr4ng3r

 * Fix db/cmd/esil tests and anlysis_bf bugs

 * Change BUILD_ header to RZ header

 * Fix macros and types

 * Remove unnecessary string and bit types and unused code

 * Add autolabeler

 * Various file name and function name changes

 * Fix broken test

 * Add unit test for il_definitions

 * Change the index behaviour of bitvector
- Fix some comments

- Add ut8 ut16 convertions
* Rename functions

* Add as_string and remove some unused debug functions

* Add doxygen docs

* Small fix

* Fix header protector to corresponding name

* Add null checks and RZ_NONNULL

* Add more unit tests of rzil_vm, effect and mem

* Minimal migration from index approach tree-like approach

* Fix bug in new rzil op struct
Address some rename comments

Add test for evaluation

Address some old comments

Fix tests
@Heersin Heersin force-pushed the asan_integrate_new_il branch from 343cc78 to 570cffc Compare October 9, 2021 07:21
@XVilka
Copy link
Member

XVilka commented Oct 9, 2021

Two ASAN failed tests are unrelated to this PR and happen on the latest dev as well:

[XX] TIMEOUT db/formats/elf/sections more than 65535 segments
RZ_NOPLUGINS=1 rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc iSS~? gzip://bins/elf/core/more-than-65535-segments.gz
-- stdout
--- expected
+++ actual
@@ -1,1 +1,0 @@
-131073

-- stderr
Cannot retrieve regstate on: AMD x86-64 architecture (not yet supported)

-- exit status: -1

[XX] TIMEOUT db/formats/mdmp mdmp 64bit - strings
RZ_NOPLUGINS=1 rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc 'iz~0x002e2fe1[1,2,3,4,5,6,7,8]
iz~0x002e2feb[1,2,3,4,5,6,7,8]
iz~0x002e2ff5[1,2,3,4,5,6,7,8]
iz~0x00330d85[1,2,3,4,5,6,7,8]
iz~0x003310c1[1,2,3,4,5,6,7,8]
iz~0x00161a37[1,2,3,4,5,6,7,8]
iz~0x00161c7f[1,2,3,4,5,6,7,8]
iz~0x00161dbf[1,2,3,4,5,6,7,8]
iz~0x001d9d45[1,2,3,4,5,6,7,8]
iz~0x001da0a5[1,2,3,4,5,6,7,8]
iz~0x003fad45[1,2,3,4,5,6,7,8]
iz~0x003fb0ad[1,2,3,4,5,6,7,8]
iz~0x00480857[1,2,3,4,5,6,7,8]
iz~0x004808c7[1,2,3,4,5,6,7,8]
iz~0x00499cf5[1,2,3,4,5,6,7,8]
iz~0x0049a045[1,2,3,4,5,6,7,8]
' bins/mdmp/hello64.dmp
-- stdout
--- expected
+++ actual
@@ -8,9 +8,3 @@
 0x00161dbf 0x7759e130 45 92 C:_Windows_System32_kernel32.dll utf16le \REGISTRY\USER\*\SOFTWARE\Classes\Wow6432Node
 0x001d9d45 0x776160b6 15 32 C:_Windows_System32_kernel32.dll utf16le VS_VERSION_INFO
 0x001da0a5 0x77616416 11 24 C:_Windows_System32_kernel32.dll utf16le VarFileInfo
-0x003fad45 0x7fefd5080b6 15 32 C:_Windows_System32_KERNELBASE.dll utf16le VS_VERSION_INFO
-0x003fb0ad 0x7fefd50841e 11 24 C:_Windows_System32_KERNELBASE.dll utf16le VarFileInfo
-0x00480857 0x7fefdb33bc8 4 10 C:_Windows_System32_msvcrt.dll utf16le PATH
-0x004808c7 0x7fefdb33c38 10 22 C:_Windows_System32_msvcrt.dll utf16le SystemRoot
-0x00499cf5 0x7fefdb4d066 15 32 C:_Windows_System32_msvcrt.dll utf16le VS_VERSION_INFO
-0x0049a045 0x7fefdb4d3b6 11 24 C:_Windows_System32_msvcrt.dll utf16le VarFileInfo

-- stderr
[WARN] Invalid or unsupported enumeration encountered 21
[WARN] Invalid or unsupported enumeration encountered 22
[INFO] Parsing data sections for large dumps can take time, please be patient (but if strings ain't your thing try with -z)!

-- exit status: -1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Redesign of ESIL, or use of alternatives
5 participants