|
| 1 | +This file contains bugs we find while working on haskell.nix. The format is as |
| 2 | +follows: |
| 3 | +<separator: 80 * '-'> |
| 4 | +YYYY-MM-DD: nix-job name |
| 5 | + |
| 6 | +<error> |
| 7 | + |
| 8 | +<discussion> |
| 9 | + |
| 10 | +-------------------------------------------------------------------------------- |
| 11 | +2024-04-09 x86_64-linux.R2305.ghc8107.mingwW64.ghc |
| 12 | + |
| 13 | +/build/ghc62733_0/ghc_1.s:50:0: error: |
| 14 | + Error: CFI instruction used without previous .cfi_startproc |
| 15 | + | |
| 16 | +50 | .cfi_escape 0x16, 0x07, 0x04, 0x77, 152, 65 |
| 17 | + | ^ |
| 18 | +`x86_64-w64-mingw32-cc' failed in phase `Assembler'. (Exit code: 1) |
| 19 | +make[1]: *** [rts/ghc.mk:325: rts/dist/build/StgCRun.o] Error 1 |
| 20 | + |
| 21 | +The source for this is |
| 22 | +> https://github.com/ghc/ghc/blob/1f02b7430b2fbab403d7ffdde9cfd006e884678e/rts/StgCRun.c#L433 |
| 23 | + |
| 24 | +It appears that GCC C17 12.2.0 does _not_ emit .cfi_startproc / .cfi_endproc, |
| 25 | +whereas GCC C17 13.2.0 _does_, specifically for x86_64-w64-mingw32-cc. So this |
| 26 | +might be a cross compilation issue. |
| 27 | + |
| 28 | +The -g is hardcoded in |
| 29 | +https://github.com/ghc/ghc/blob/1f02b7430b2fbab403d7ffdde9cfd006e884678e/mk/config.mk.in#L361 |
| 30 | + |
| 31 | +Turns out, this was disabled for anything but linux in https://github.com/ghc/ghc/commit/5b08e0c06e038448a63aa9bd7f163b23d824ba4b, |
| 32 | +hence we backport that patch to GHC-8.10 when targeting windows (to prevent mass rebuilds for |
| 33 | +other archs). |
| 34 | + |
| 35 | +-------------------------------------------------------------------------------- |
| 36 | +2024-04-10 x86_64-linux.R2305.ghc902.mingwW64.ghc |
| 37 | + |
| 38 | +make[1]: *** [utils/hsc2hs/ghc.mk:22: utils/hsc2hs/dist-install/build/tmp/hsc2hs.exe] Error 1 |
| 39 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x2a): relocation truncated to fit: R_X86_64_32S against `.text' |
| 40 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x46): relocation truncated to fit: IMAGE_REL_AMD64_ADDR32 against `.data' |
| 41 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x8b): relocation truncated to fit: R_X86_64_32S against symbol `stg_bh_upd_frame_info' defined in .text section in /build/ghc-9.0.2/rts/dist/build/libHSrts.a(Updates.o) |
| 42 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x95): relocation truncated to fit: IMAGE_REL_AMD64_ADDR32 against `.rdata' |
| 43 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0xe3): relocation truncated to fit: R_X86_64_32S against symbol `stg_bh_upd_frame_info' defined in .text section in /build/ghc-9.0.2/rts/dist/build/libHSrts.a(Updates.o) |
| 44 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0xed): relocation truncated to fit: IMAGE_REL_AMD64_ADDR32 against `.rdata' |
| 45 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x13b): relocation truncated to fit: R_X86_64_32S against symbol `stg_bh_upd_frame_info' defined in .text section in /build/ghc-9.0.2/rts/dist/build/libHSrts.a(Updates.o) |
| 46 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x145): relocation truncated to fit: IMAGE_REL_AMD64_ADDR32 against `.rdata' |
| 47 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x193): relocation truncated to fit: R_X86_64_32S against symbol `stg_bh_upd_frame_info' defined in .text section in /build/ghc-9.0.2/rts/dist/build/libHSrts.a(Updates.o) |
| 48 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x19d): relocation truncated to fit: IMAGE_REL_AMD64_ADDR32 against `.rdata' |
| 49 | +utils/runghc/dist-install/build/Main.o:fake:(.text+0x1eb): additional relocation overflows omitted from the output |
| 50 | + |
| 51 | +We notice `fake`, which is GHC failing to provide a .file identifier in the source. |
| 52 | +We also see lots of R_X86_64_32S relocations, which are signed 32-bit relocations. |
| 53 | +These fail with ASLR and high entropy base images from later binutils. |
| 54 | + |
| 55 | +The underlying issue is that GHC emits _absolute_ label loads (mov $... reg), instead |
| 56 | +of %rip-relative or other relative loads. This then leads to the linker emitting 32-bit |
| 57 | +absolute relocations. With the final image being potentially loaded into high memory |
| 58 | +(e.g. dynamic base, and the base image being set to some high address), the linker |
| 59 | +starts falling over itself, because it simply can't resolve those absolute addresses |
| 60 | +in the 32-bit slots. |
| 61 | + |
| 62 | +This was fixed in GHC upstream in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7449, |
| 63 | +while the patch in haskell.nix is a bit more pedestrian and just sets PIC on windows to |
| 64 | +always be on, and then uses the PIC pipeline. |
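The overflow described in this entry comes down to a range check. The following is a minimal sketch, not code from GHC or binutils: the helper names and the example addresses in the usage note are made up for illustration. It contrasts an absolute reference stored in a sign-extended 32-bit slot (as with R_X86_64_32S) with a relative reference, where only the distance between the instruction and its target has to fit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does an absolute address fit into a sign-extended
 * 32-bit slot, as an R_X86_64_32S-style relocation requires? */
static bool fits_abs32s(uint64_t addr) {
    int64_t v = (int64_t)addr;
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Hypothetical helper: does a relative reference fit? Only the signed
 * distance from the end of the instruction to the target matters, not
 * where the image is loaded. */
static bool fits_rel32(uint64_t target, uint64_t next_insn) {
    int64_t delta = (int64_t)(target - next_insn);
    return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

With an illustrative low base such as 0x400000, fits_abs32s(0x400000 + 0x2a) holds. With a base above 2 GiB such as 0x140000000, the same absolute check fails, while fits_rel32(0x140001000, 0x140000030) still holds because both addresses move together.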
| 65 | + |
| 66 | +-------------------------------------------------------------------------------- |
| 67 | +2024-06-18 x86_64-linux.unstable.ghc9101.ucrt64.tests.th-dlls-minimal.build |
| 68 | + |
| 69 | +0024:err:seh:call_stack_handlers invalid frame 00007FFFFF68EF18 (0000000000022000-0000000000220000) |
| 70 | +0024:err:seh:NtRaiseException Exception frame is not in stack limits => unable to dispatch exception. |
| 71 | +iserv-proxy: {handle: <socket: 11>}: GHCi.Message.remoteCall: end of file |
| 72 | + |
| 73 | +This is due to GHC mislinking GNU import libraries (dll.a). What happens is that |
| 74 | +GHC ends up creating GOT entries for function calls instead of PLT entries. The |
| 75 | +loader/linker in GHC for Windows has logic to lazy load .dll's as referenced. For |
| 76 | +this, symbols get a dependency symbol attached; this could be a symbol indicating |
| 77 | +the DLL that needs to be loaded. While walking the dependencies to find the dll to |
| 78 | +load (or in some cases just the dependent symbol, not the dll), we override the |
| 79 | +symbol type with the one of the dependent symbol. This however means we'll |
| 80 | +override the type of a symbol with the DATA type each time the symbol leads to a |
| 81 | +dllInstance being loaded. Subsequently we end up creating a GOT entry instead of |
| 82 | +a PLT entry for the symbol, irrespective of the original symbol being a code or |
| 83 | +data symbol. If code symbols end up getting GOT stubs, we see the above crash as |
| 84 | +the control flow jumps to the location of the stub, and instead of a PLT/jump |
| 85 | +island just lands at the address of the target symbol, which is in most cases |
| 86 | +nonsensical machine code. |
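The type override described in the 2024-06-18 entry can be modelled in a few lines. This is a simplified sketch under stated assumptions: the types Sym, SymKind and StubKind and all function names are hypothetical and are not the RTS linker's actual definitions. The "fixed" walk shows one plausible way to avoid the problem (keep the original symbol's kind) and is not necessarily what the actual patch does.

```c
#include <stddef.h>

/* Hypothetical symbol kinds, for illustration only. */
typedef enum { SYM_CODE, SYM_DATA } SymKind;
typedef enum { STUB_PLT, STUB_GOT } StubKind;

typedef struct Sym {
    const char *name;
    SymKind kind;
    const struct Sym *dep; /* dependency symbol, e.g. one naming a DLL; NULL if none */
} Sym;

/* Buggy walk: the kind of the last dependency replaces the original kind,
 * so a code symbol that depends on a DATA-typed DLL symbol becomes DATA. */
static SymKind resolve_kind_buggy(const Sym *s) {
    SymKind kind = s->kind;
    while (s->dep != NULL) {
        s = s->dep;
        kind = s->kind; /* override: forgets that the original was code */
    }
    return kind;
}

/* Assumed fix: still follow the dependencies (to find what to load),
 * but report the kind of the symbol we started from. */
static SymKind resolve_kind_fixed(const Sym *s) {
    SymKind kind = s->kind;
    while (s->dep != NULL) {
        s = s->dep;
    }
    return kind;
}

/* Code symbols need a PLT entry / jump island, data symbols a GOT entry. */
static StubKind stub_for(SymKind kind) {
    return kind == SYM_CODE ? STUB_PLT : STUB_GOT;
}
```

For a code symbol whose dependency is a DATA-typed DLL symbol, stub_for(resolve_kind_buggy(&fn)) yields STUB_GOT, which corresponds to the crash above, while stub_for(resolve_kind_fixed(&fn)) yields STUB_PLT.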