Slave has Died issue #243
Comments
Hi @5angjun, I think we first need to understand why the slaves (or Workers) are dying in the first place. You can get more logging information with
cc @il-steffen: can we expect workers to die during a fuzzing campaign? Am I missing something?
A worker exit can happen on a QEMU segfault or an unhandled exception in the worker / mutation logic. The above logic only handles the loss of the socket connection; you need to look at why the worker exited.
I think the error occurred when QEMU died. The code where it last died is this.
So I think it would be nice to restart the fuzzing campaign when QEMU dies.
Please have a look at why this is happening. In general we want to fix anything that causes workers to die during a fuzzing campaign. In some cases there may be a QEMU segfault that is not easy to fix; for instance, we had bugs related to specific virtio fuzzing harnesses where fixing QEMU did not make much sense. In such a case it would make sense to catch the failure and restart the worker. This should be possible from the manager, so the fuzzing campaign can just continue running. The manager main loop is here: https://github.com/IntelLabs/kafl.fuzzer/blob/master/kafl_fuzzer/manager/manager.py#L85 With some luck, the socket connection code you referenced above will detect the new worker and the main loop will start dispatching jobs again.
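To make the "catch + restart" idea concrete, here is a minimal, generic sketch of a supervisor that respawns worker processes that exit abnormally. It is not kAFL code: `WorkerSupervisor`, `poll_and_restart`, `demo_command`, and the `max_restarts` cap are hypothetical names and choices, and the child commands merely stand in for a real worker/QEMU launch. A real change would have to fit into the manager main loop linked above.

```python
import subprocess
import sys


def demo_command(exit_code):
    """Stand-in for a worker launch command (assumption, not a kAFL command line)."""
    return [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]


class WorkerSupervisor:
    """Hypothetical helper (not part of kAFL): respawns workers that exit abnormally."""

    def __init__(self, commands, max_restarts=3):
        self.commands = dict(commands)      # worker id -> argv list
        self.max_restarts = max_restarts    # cap to avoid an endless crash loop
        self.restarts = {wid: 0 for wid in self.commands}
        self.procs = {wid: subprocess.Popen(cmd) for wid, cmd in self.commands.items()}

    def poll_and_restart(self):
        """Check every worker once; return the ids of workers that were restarted."""
        restarted = []
        for wid, proc in self.procs.items():
            code = proc.poll()              # None while the process is still running
            if code is None or code == 0:
                continue                    # alive, or finished cleanly
            if self.restarts[wid] >= self.max_restarts:
                continue                    # give up on a worker that keeps dying
            self.restarts[wid] += 1
            self.procs[wid] = subprocess.Popen(self.commands[wid])
            restarted.append(wid)
        return restarted

    def shutdown(self):
        """Terminate any worker that is still running and reap all of them."""
        for proc in self.procs.values():
            if proc.poll() is None:
                proc.terminate()
            proc.wait()
```

Calling `poll_and_restart()` once per iteration of a main loop would keep the worker count stable. The restart cap is a deliberate choice in this sketch, so that a worker that crashes immediately on every start does not hide the underlying bug.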
This situation appears when allocating a lot of RAM to a VM image and fuzzing in parallel. In my case, the problem appeared while fuzzing a built-in Windows driver for a long time. For example, my host has 84G of RAM; when I allocated 10G of RAM to each VM and fuzzed with 8 cores (using almost 82G of the 84G), QEMU or the worker died (most likely QEMU). The manager process, however, stays alive. I am thinking about how to modify the code so that the manager revives dead workers. As someone who loves kAFL, I will also think about how to improve kAFL to make it a masterpiece. 😀😀😀 Thanks
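If memory pressure is indeed the cause, as these numbers suggest, a quick budget check before starting the campaign may help. The sketch below is only arithmetic: `max_safe_workers` is a hypothetical helper, and the 8G reserve for the host and per-process overhead is an assumed value, not a measured one.

```python
def max_safe_workers(host_ram_gb, vm_ram_gb, reserve_gb=8):
    """Number of VMs that fit in host RAM after holding back a reserve.

    reserve_gb is an assumed safety margin for the host OS and
    emulator overhead; tune it for the actual machine.
    """
    if vm_ram_gb <= 0:
        raise ValueError("vm_ram_gb must be positive")
    usable = host_ram_gb - reserve_gb
    return max(usable // vm_ram_gb, 0)


# Numbers from the report above: 84G host, 10G per VM.
print(max_safe_workers(84, 10))                # 7 with the assumed 8G reserve
print(max_safe_workers(84, 10, reserve_gb=0))  # 8 only if nothing else needs RAM
```

With no reserve, 8 VMs of 10G nominally fit into 84G, which matches the reported setup running at almost full memory. Dropping to 7 workers or shrinking the per-VM allocation would leave some headroom.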
Hello, I'm sangjun, and I am very interested in this project.
I would like to know how to fix an error in the communication between the Manager and the Workers.
When a slave has died, I want the dead process to be restarted with a new QEMU instance and reconnected to the fuzzing process.
Dying slaves are a critical problem when I fuzz for many hours, e.g. over 6 hours.
So I think what needs to be improved is re-engaging dead workers in the fuzzing process.
Any ideas on this?