-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support relocation kind 0003 and extend IMAGE_REL_ types #141
Conversation
Specifically adds support for: - IMAGE_REL_AMD64_ADDR32NB. (This is the "relocation kind 0003"). - IMAGE_REL_I386_DIR32NB. (This is the x86 version of the above). - IMAGE_REL_AMD64_REL32_5. (This is documented but weirdly not implemented).
flexdll.c
Outdated
err->code = 3; | ||
goto restore; | ||
} | ||
*((UINT32*) ptr->addr) = s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my own understanding:
- In LLVM, this relocation is implemented by
add32(off, s)
, so thatoff
in LLVM corresponds toptr->addr
in Flexlink ands
in LLVM corresponds tos
in Flexlink. - The relocation
IMAGE_REL_AMD64_ADDR64
in LLVM (RELOC_ABS
in Flexlink) is implemented byadd64(off, s + imageBase)
, so thats + imageBase
in LLVM corresponds tos
in Flexlink.
Do you understand why the s
arguments do not seem to match between the two cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going one step further, those equations imply that imageBase = 0
.
I spent a lot of time looking and I couldn't find out why IMAGE_REL_AMD64_ADDR64 seems to work. The math for all the other relocation kinds make sense.
However, two things ...
- The
/base:
base addresslink.exe
option is fixed for OCaml executable to be0x10000
atLine 47 in 80496b5
let base_addr = ref "0x10000"
and also the same in
Line 77 in 80496b5
let image_base = 0x10000l in
for OCaml plugins / stubs. - All of the relative relocations from this section of code are translated by:
Lines 470 to 475 in 80496b5
Reloc.abs !machine sect (Int32.of_int (Buffer.length data)) strsym; int_to_buf data pos; Reloc.abs !machine sect (Int32.of_int (Buffer.length data)) (Lazy.force secsym); int_to_buf data (Int32.to_int rel.addr);
into absolute relocations byReloc.abs
:Lines 437 to 444 in 80496b5
module Reloc = struct let abs machine sec addr sym = let rtype = match machine with | `x86 -> 0x06 | `x64 -> 0x01 in sec.relocs <- { addr = addr; symbol = sym; rtype = rtype } :: sec.relocs
So my suspicion was that the Reloc.abs
absolute relocations were being translated at CreateProcess time by ntdll.LdrInitializeThunk
(or whatever is reading the PE .reloc
section) to complete the imageBase
adjustment.
@dra27, any chance you can explain why base_addr
(or image_base
for DLLs) is not used in the calculations?
(These calculations are undocumented in flexdll and fairly complicated)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's far too long since I stared at all this, but are we comparing like with like here? Isn't that code in LLVM the part of a linker which is computing actual relocations to be carried out by the loader? In flexdll's, case, I don't think we have to worry about the image base - the symbol table as a data structure has already been had all the relocations resolved when the executable was loaded (by the Windows loader) - so s
in flexdll is an absolute address for the symbol, where in lld it's still base-relative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, your loader comment is what I had meant with "translated at CreateProcess time by ntdll.LdrInitializeThunk
". But I am still puzzled why that only applies to IMAGE_REL_AMD64_ADDR64.
I just noticed that I have not extended the DLL logic at Lines 235 to 253 in 80496b5
That DLL logic does use the I don't really have a way to test that. The stubs/plugins generated by OCaml do not have those relocations (which is also why I didn't notice them). |
387e4b8
to
19aadf0
Compare
Co-authored-by: Antonin Décimo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct to me - at least, it sounds as though it's reasonably clear from your testing that the RELOC_32NB
implementation is working, and the RELOC_REL32_5
implementation is clearly correct w.r.t. the others already there.
Do you mind adding the last missing RELOC_REL32_3
while we here, and I think this can go straight in.
FIXED. This is an independent problem.
etc. @MisterDA: Looks like casting of int64/uint64 to uint32 is not done correctly ... I think it was just luck before if it was working. Your Avoid conversion from 'INT_PTR' to 'UINT32' warnings review suggestion is very close to where I'm seeing the failures, so it sounds like the C compiler was giving us a good warning. If you know how to make those INT_PTR/INT32/UINT32 operations robust, reply back! I'll investigate more in the meantime. EDIT: After looking at old emails, it sounds like the checks are to make sure that the relocations are within 2GB, which presumes all the DLLs are within 2GB (here and here). And that begs the question what magic mechanism is enforcing that 2GB bin-packing! I wonder if there is a MSVC option to force all these problematic address references to be 64-bit on 64-bit machines. EDIT 2: Changing |
Also reorder match (OCaml) and case (C) for clarity and to ensure nothing is missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny nit with the (very welcome) readme update, but otherwise good to go, thanks!
Indeed, the docs in flexdll are out of date w.r.t. Except that we lost Cygwin64 support in 2018 with the release of Cygwin's IIRC mingw-w64 had never clamped the base address because it hadn't been necessary, but that started to become necessary with binutils 2.36 - at which point (ocaml/ocaml#10351) co-opted the same mechanism for mingw-w64. Hwoever, msvc64 still uses the base address trick. |
Are those failures with msvc64 or mingw-w64, @jonahbeckford? IIRC opam-repository-mingw "fixed" the binutils thing by forcing the base address for mingw-w64 instead, which means that this will be becoming a gradually bigger problem, requiring more uses of The systhreads failure is surprising, as I thought that was only in 5.1+ (and is fixed in 5.2.1 and 5.3.0) - which version of OCaml is that one with? |
Co-authored-by: David Allsopp <[email protected]>
msvc64
4.14.2 with the nine (9) ocaml-common-4_14-a* patches at https://github.com/diskuv/dkml-compiler/tree/main/src/p and two (2) non-DkML patches (see below) using the ucrt branch of flexdll (ucrt link in this PR's main comment). ocaml-common-4_14-z02: This is relevant to systhreads. |
Thanks for the context! I feel at some point I might explore getting rid of flexdll entirely. There is a lot of complexity in flexdll for a a) use case centered around resolving circular references for plugins which I think was not a great design, and b) to remove the need for Line 65 in a003c18
Basically, I think today there is little benefit and a lot of drawbacks where 10-20 years ago the balance was a lot more positive. But OCaml is deeply tied to the circular reference model for stubs and also (maybe related) deeply tied to a flat symbol namespace ... the two assumptions from ELF linking where DLL and dylib linking have better designs ... so I'm under no illusion that untangling the assumptions would be easy. |
Fixes #29 .
Specifically adds support for:
IMAGE_REL_AMD64_ADDR32NB. (This is the "relocation kind 0003").
IMAGE_REL_I386_DIR32NB. (This is the x86 version of the above).
IMAGE_REL_AMD64_REL32_5. (This is documented but weirdly not implemented).
Tested with my ucrt branch https://github.com/jonahbeckford/flexdll/tree/0.43%2Bucrt where relocation kind 0003 occurs often. It especially occurs in "normal" C libraries (ucvrt;
/MD
) that are linked into an OCaml executable. None of the tested code used/GS-
.References: