Port fcd to use Remill #51
Comments
Hi Peter! I'm sort of interested, but I can't really work on this right now. Would you like to try it? At first glance, you probably only need to rip out the contents of translation_context.cpp and put Remill in there. (With that said, there's a chance that some of the pattern matching fcd does to detect flag computations will stop working.)
In addition, it would be really cool to port capstone2llvmir to fcd as well. This way multiple binary to LLVM IR backends could be utilized and compared. Some may support various architectures better than others, etc. From the capstone2llvmir README:
You can track the progress of this work here: https://github.com/trailofbits/fcd
@pgoodman Out of curiosity, why was that repo archived?
@cryslith We were mostly successful in moving away from Capstone and to using Remill. Ultimately, we discovered that the control-flow restructuring algorithms in fcd worked in specific situations, but broke when applied to a wide variety of "weird" code. To solve these problems, we needed to bring a solver into the mix. Further, fcd implements its own AST data structures, which were themselves incomplete or insufficient for several tasks. We realized that a far more useful tool would generate Clang ASTs. Thus, we started the project Rellic with the narrower scope and focus of being the best system for reversing/converting LLVM IR into C (via Clang ASTs). Beyond this, we can plug in other tools, e.g. Anvill, to get from machine code to "nice" LLVM bitcode (given a specification that is similar in spirit to a function prototype).
Very happy to see Anvill taking shape. Its aim is essentially what I've always envisioned wanting from a binary lifter to LLVM IR. Rellic is also cool, with its pattern-independent control flow recovery algorithms.
This is kind of a long shot, but I think possibly worth it. According to your blog, fcd did not use McSema at the time because it was stuck on LLVM 3.5. Since then, McSema has been re-implemented as version 2 and works with LLVM 3.5 through 5.0.
More importantly, though, the actual instruction semantics have been factored out into an independent instruction lifting library called Remill. Remill supports x86 and x86-64 (with the MMX, x87, SSE, and AVX instruction sets), as well as AArch64. It is heavily tested, fairly modular, and will be continually supported by Trail of Bits.
If you're interested in this possibility then please let me know!