Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arbitrarily high hit counts / high allocation counts hurts search performance, refactor search hits to use rz_vector? #1909

Open
Andersama opened this issue Oct 31, 2021 · 5 comments · May be fixed by #4742
Labels
good first issue Good for newcomers optimization refactor Refactoring requests RzSearch
Milestone

Comments

@Andersama
Copy link

Andersama commented Oct 31, 2021

Is your feature request related to a problem? Please describe.
May be more related to cutter's gui because it looks like the api has a maximum search limit / count and perhaps that'd solve the issue in commandline. However, without knowing that in advance you can hang the application by running a search which may have too many results. The issue stems from usage of a list structure which allocates for every saved result in rz_search_t.

The problem function in question would be rz_search_hit_new.

Edit: on second look it appears the api is being used with a callback which terminates early and skips the allocation I thought was the original issue. This might impact uses of this part of the api without a provided callback, they're likely slower than it should be. Whatever's causing the application hang eludes me.

Describe the solution you'd like
Key part here would be to refactor the search functionality to use a vector like (or probably better rope like) structure. This should improve search timings considerably since time won't be wasted waiting on the system for memory. There appears to be a header for vectors in the library, it probably could be put to use here. A well designed rope structure would probably be better.

EG: instead of

typedef struct rz_search_t {
	int n_kws; // hit${n_kws}_${count}
	int mode;
	ut32 pattern_size;
	ut32 string_min; // max length of strings for RZ_SEARCH_STRING
	ut32 string_max; // min length of strings for RZ_SEARCH_STRING
	void *data; // data used by search algorithm
	void *user; // user data passed to callback
	RzSearchCallback callback;
	ut64 nhits;
	ut64 maxhits; // search.maxhits
	RzList *hits;
	int distance;
	int inverse;
	bool overlap; // whether two matches can overlap
	int contiguous;
	int align;
	int (*update)(struct rz_search_t *s, ut64 from, const ut8 *buf, int len);
	RzList *kws; // TODO: Use rz_search_kw_new ()
	RzIOBind iob;
	char bckwrds;
} RzSearch;

something like:

typedef struct rz_search_t {
	int n_kws; // hit${n_kws}_${count}
	int mode;
	ut32 pattern_size;
	ut32 string_min; // max length of strings for RZ_SEARCH_STRING
	ut32 string_max; // min length of strings for RZ_SEARCH_STRING
	void *data; // data used by search algorithm
	void *user; // user data passed to callback
	RzSearchCallback callback;
	ut64 nhits;
	ut64 maxhits; // search.maxhits
	RzVector *hits; // this probably doesn't need to be a pointer? also nhits would be hits->len
	int distance;
	int inverse;
	bool overlap; // whether two matches can overlap
	int contiguous;
	int align;
	int (*update)(struct rz_search_t *s, ut64 from, const ut8 *buf, int len);
	RzList *kws; // TODO: Use rz_search_kw_new ()
	RzIOBind iob;
	char bckwrds;
} RzSearch;

Since results are addresses there's no need to allocate for every 8 byte address, and since any information related to pointers is auxiliary to the pointer there's and pointers are by definition unique there's plenty of opportunity to use hashtables to grab any interesting auxiliary information about the pointer later. It seems like in addition to the matched address currently a pointer to the search used is saved. There's probably a good amount of memory to save by not storing that pointer for each hit.

Describe alternatives you've considered
Another idea would be a custom allocator, seems like that would considerably more difficult, and I suspect a better data structure is the better approach.

Additional context
A fairly easy way to test this is to do what I did by accident, which is to debug a program and search for 0 effectively everywhere.

It also looks as if the maxhits variable can be used to reserve space in rz_search_begin. This in theory should reduce allocations wherever this api is used.

@ret2libc
Copy link
Member

ret2libc commented Nov 3, 2021

Hi! Thanks for your report.

Could you be a bit clearer on what is the problem before discussing possible solutions?
What are you doing? What is the version of Rizin you are using? Which OS? Which binary (if possible to share)?

From the additional context section I'm trying doing (Fedora 34, x86_64, rizin v0.3.0)

$ rizin -d /bin/ls
> /x 00
Searching 1 byte in [0x55f9c8333000-0x55f9c8347000]
hits: 15413
Do you want to print 15413 lines? (y/N)
[...]

And this happens very quicky. Did I understand it right? Are you doing anything different?
Thanks again

@Andersama
Copy link
Author

I built off of whatever's included by default for building cutter on it's default branch. I'm on windows x64 bit. From memory I ran the command to search for 0's by accident across all memory maps when I was attached to inscryption.exe, there's a demo on steam which I can only imagine would do the same thing. I don't have complete context for what rizin would've done in command line, I essentially froze cutter.

@ret2libc
Copy link
Member

We would need few more details to start looking at this. I feel this is more of a "bug report" rather than a "feature request"

We need to understand, at least:

  • OS
  • rizin full version
  • cutter version (taken from where: appimage, exe, etc.)
  • binary format (or even better if binary can be shared)
  • how did you trigger the bug exactly
  • any other bits of info useful to reproduce the bug.

Without the above info there is not much we can do to help you, sorry.

@Andersama
Copy link
Author

Andersama commented Nov 18, 2021

Sure ok, I guess you're asking to be more specific:
Windows 10 Home: Version 10.0.19042 Build 19042 (x64)
image
image

Here's what I did, or I believe there might be an extra option when debugging a program for "All memory maps", if there's a command which goes byte by byte for 0s you might want to try that instead.
image
I wouldn't necessarily call it a bug, it was more or less just that the results took so long the application went unresponsive, and from memory although it's been a few weeks now it might be the work done to output to console for cutter that takes so long.

The binary I was analyzing was the inscryption game available on steam. There's been an update since, so I don't think I could get you an exact binary to look at, but I'm fairly confident the demo would do a similar thing if you really need it. I'd try making an executable which just loads a lot of 0's into memory and overall check and/or stress test the performance of the search widget when there's a lot of matches.

@XVilka XVilka added the refactor Refactoring requests label Mar 17, 2022
@XVilka XVilka added this to the 0.6.0 milestone Mar 3, 2023
@XVilka XVilka modified the milestones: 0.6.0, 0.7.0 Jun 29, 2023
@XVilka XVilka modified the milestones: 0.7.0, 0.8.0 Feb 16, 2024
@XVilka
Copy link
Member

XVilka commented Nov 29, 2024

Will be solved by #4742

@Rot127 Rot127 linked a pull request Nov 29, 2024 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers optimization refactor Refactoring requests RzSearch
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants