Skip to content

Conversation

@jserv
Copy link
Contributor

@jserv jserv commented Oct 5, 2025

ThreadSanitizer (TSAN) can now detect race conditions across the entire multi-threaded JIT pipeline with full 4GB address space emulation. This enables testing of the tier-2 LLVM compilation thread while maintaining production memory layout.

Memory Layout (TSAN-compatible):

  • Main memory: MAP_FIXED at 0x7d0000000000 (4GB)
  • JIT buffer: MAP_FIXED at 0x7d1000000000
  • Both allocations within TSAN app range (0x7cf-0x7ff trillion)
  • Prevents conflicts with TSAN shadow memory (0x02a-0x7ce trillion)

ASLR Mitigation:

  • Added setarch -R wrapper for TSAN test execution
  • Disables ASLR to prevent random allocations in shadow memory
  • Only affects test runs, not production builds

SDL Conflict Resolution:

  • SDL (uninstrumented system library) creates threads TSAN cannot track
  • Disabled SDL when TSAN enabled to focus on built-in race detection
  • Production builds still fully support SDL

Summary by cubic

Enables ThreadSanitizer across the multi-threaded JIT pipeline (including T2C) with FULL4G emulation on x86-64 and ARM64, and adds JIT debug checks to catch regalloc and cache coherency issues early.

  • New Features

    • TSAN mode: -fsanitize=thread; fixed FULL4G/JIT mappings (x86_64: 0x7d0000000000/0x7d1000000000, arm64: 0x150000000000/0x151000000000 + MAP_JIT); tests run via setarch -R; SDL and LTO disabled only under TSAN; quieter TSAN defaults.
    • JIT debug mode (ENABLE_JIT_DEBUG=1) with regmap conflict checks and cache coherency verification; wired into CI and .ci/jit-debug-test.sh.
  • Bug Fixes

    • T2C races: atomic load/store for hot2/compiled/n_invoke; enqueue uses cache key (PC|SATP), condition variable for wakeups, SATP verification; proper signaling on enqueue and shutdown.
    • JIT stability: fixed icache coherency (write-protect ordering and invalidation) and regalloc conflicts in load ops; only execute translated blocks on success.
    • User mode: auto-enable ELF_LOADER when SYSTEM=0 (except arch tests) to fix failing runs.
    • Build/CI: remove unconditional DTB deps to avoid dtc errors; ensure DTB is built for SYSTEM builds (including emscripten and macOS/arm64) and filter DTB files from linker; add apt-get update retries on arm64 runners; fix emcc builds by running plain distclean to clear stale .config.

Written for commit 874ba07. Summary will update automatically on new commits.

Copy link
Contributor Author

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarks

Benchmark suite Current: 874ba07 Previous: 7711e50 Ratio
Dhrystone 1315 Average DMIPS over 10 runs 1290 Average DMIPS over 10 runs 0.98
Coremark 902.988 Average iterations/sec over 10 runs 915.002 Average iterations/sec over 10 runs 1.01

This comment was automatically generated by workflow using github-action-benchmark.

cubic-dev-ai[bot]

This comment was marked as resolved.

@jserv jserv changed the title Enable TSAN with FULL4G and T2C support Enable ThreadSanitizer across the entire multi-threaded JIT pipeline Oct 5, 2025
@jserv jserv force-pushed the fix-regressions branch 8 times, most recently from 13b3ac5 to 8234e97 Compare October 8, 2025 13:27
jserv added 2 commits November 8, 2025 07:36
ThreadSanitizer (TSAN) can now detect race conditions across the entire
multi-threaded JIT pipeline with full 4GB address space emulation. This
enables testing of the tier-2 LLVM compilation thread while maintaining
production memory layout.

Memory Layout (TSAN-compatible):
- Main memory: MAP_FIXED at 0x7d0000000000 (4GB)
- JIT buffer: MAP_FIXED at 0x7d1000000000
- Both allocations within TSAN app range (0x7cf-0x7ff trillion)
- Prevents conflicts with TSAN shadow memory (0x02a-0x7ce trillion)

ASLR Mitigation:
- Added setarch -R wrapper for TSAN test execution
- Disables ASLR to prevent random allocations in shadow memory
- Only affects test runs, not production builds

SDL Conflict Resolution:
- SDL (uninstrumented system library) creates threads TSAN cannot track
- Disabled SDL when TSAN enabled to focus on built-in race detection
- Production builds still fully support SDL
This commit adds ThreadSanitizer (TSAN) support for ARM64/Apple Silicon
and fixes critical JIT instruction cache coherency issues.

ARM64 TSAN Support:
- Extended TSAN-compatible memory allocation to ARM64 architecture
- Main memory allocated at fixed address 0x150000000000 (21TB)
- JIT buffer allocated at 0x151000000000 with MAP_JIT for Apple Silicon
- Both allocations avoid TSAN shadow memory and enable race detection
- Note: Requires ASLR disabled on macOS (SIP restrictions may apply)

JIT Cache Coherency Fixes:
1. Fixed pthread_jit_write_protect_np() ordering in update_branch_imm
2. Added sys_icache_invalidate() in update_branch_imm
3. Added cache invalidation in resolve_jumps() for x86_64

Fix JIT regalloc conflicts in memory load

After reset_reg() clears the register allocator state, load instructions
(lb/lh/lw/lbu/lhu) could reallocate the same host register for both the
address and destination, causing data corruption. This commit uses
map_vm_reg_reserved() to prevent reusing the address register.
@jserv jserv force-pushed the fix-regressions branch 4 times, most recently from b7da692 to 13ecaff Compare November 8, 2025 01:16
jserv added 2 commits November 8, 2025 09:19
This commit introduces a comprehensive JIT debugging infrastructure to
catch register allocation conflicts and cache coherency issues before
they cause subtle runtime failures in production.
User-space emulation tests were failing because ENABLE_ELF_LOADER
defaulted to 0, preventing ELF file loading. The fix automatically
enables ELF_LOADER when SYSTEM=0, as user-space mode always requires
it to load test ELF files.
@jserv jserv force-pushed the fix-regressions branch 7 times, most recently from 75f6782 to 7711e50 Compare November 9, 2025 03:32
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Nov 9, 2025
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Nov 9, 2025
Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 13 files

Prompt for AI agents (all 3 issues)

Understand the root cause of the following 3 issues and fix them.


<file name=".github/workflows/main.yml">

<violation number="1" location=".github/workflows/main.yml:308">
Piping `apt update` through `tee` hides the command’s non-zero exit status, so cases like signature failures or apt lock conflicts skip retries and we continue with a broken package index. Please preserve the real exit code (e.g., enable `set -o pipefail` or capture `${PIPESTATUS[0]}` after the pipeline) before deciding to break.</violation>

<violation number="2" location=".github/workflows/main.yml:310">
The failure filter ignores `InRelease`, so an `apt update` error like `Failed to fetch … InRelease` is treated as success and we skip the remaining retries. Please include `InRelease` in the pattern so we keep retrying on those critical errors.</violation>
</file>

<file name="src/main.c">

<violation number="1" location="src/main.c:31">
The new TSAN options block is guarded by __SANITIZE_THREAD__ only, so clang TSAN builds skip __tsan_default_options entirely and never apply the intended runtime configuration.</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 13 files

Prompt for AI agents (all 2 issues)

Understand the root cause of the following 2 issues and fix them.


<file name="Makefile">

<violation number="1" location="Makefile:109">
`setarch` is Linux-only; with TSAN enabled on macOS the test recipe expands to `setarch $(uname -m) -R …` and fails because the binary is missing. Please gate the wrapper by host OS or provide a macOS-safe alternative so TSAN checks remain usable cross-platform.</violation>
</file>

<file name=".github/workflows/main.yml">

<violation number="1" location=".github/workflows/main.yml:310">
`apt update` reports release failures as `Failed to fetch …/InRelease`, but this regex only looks for Packages/Sources/Release. As a result the retry loop treats an `InRelease` failure as success and never retries, undermining the fix and leaving the job without refreshed indices. Please include `InRelease` in the match so real release failures still trigger retries.</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

override ENABLE_LTO := 0 # LTO interferes with TSAN instrumentation
CFLAGS += -DTSAN_ENABLED # Signal code to use TSAN-compatible allocations
# Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory
BIN_WRAPPER = setarch $(shell uname -m) -R
Copy link

@cubic-dev-ai cubic-dev-ai bot Nov 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setarch is Linux-only; with TSAN enabled on macOS the test recipe expands to setarch $(uname -m) -R … and fails because the binary is missing. Please gate the wrapper by host OS or provide a macOS-safe alternative so TSAN checks remain usable cross-platform.

Prompt for AI agents
Address the following comment on Makefile at line 109:

<comment>`setarch` is Linux-only; with TSAN enabled on macOS the test recipe expands to `setarch $(uname -m) -R …` and fails because the binary is missing. Please gate the wrapper by host OS or provide a macOS-safe alternative so TSAN checks remain usable cross-platform.</comment>

<file context>
@@ -80,6 +73,44 @@ endif
+override ENABLE_LTO := 0       # LTO interferes with TSAN instrumentation
+CFLAGS += -DTSAN_ENABLED       # Signal code to use TSAN-compatible allocations
+# Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory
+BIN_WRAPPER = setarch $(shell uname -m) -R
+else
+BIN_WRAPPER =
</file context>
Fix with Cubic

@jserv
Copy link
Contributor Author

jserv commented Nov 9, 2025

@cubic-dev-ai The review finding about MAP_JIT is incorrect.

Verification: The code at src/jit.c:2473-2477 (x86_64 TSAN path) already includes MAP_JIT:

state->buf = mmap(jit_addr, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED
#if defined(__APPLE__)
                      | MAP_JIT  // ← Present on line 2476
#endif
                  ,
                  -1, 0);

All three mmap paths include MAP_JIT on macOS:

  1. x86_64 TSAN path (line 2476): ✅ Has MAP_JIT
  2. ARM64 TSAN path (line 2492): ✅ Has MAP_JIT
  3. Standard path (line 2506): ✅ Has MAP_JIT

The MAP_JIT flag was added in commit f1b685e ("Add Arm64 TSAN support and fix JIT cache coherency") and is present in the current PR.

Please re-scan or verify against the actual code at commit 7711e50.

@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Nov 9, 2025
Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 13 files

Prompt for AI agents (all 2 issues)

Understand the root cause of the following 2 issues and fix them.


<file name="src/main.c">

<violation number="1" location="src/main.c:307">
Raising the default log level to LOG_INFO suppresses the rv_log_trace() stream that powers the -t instruction trace (rv_log_set_level(attr-&gt;log_level) now filters those messages), so trace mode stops emitting output.</violation>
</file>

<file name="Makefile">

<violation number="1" location="Makefile:109">
TSAN runs now always prepend `setarch`, but macOS does not ship this binary; enabling TSAN on mac hosts will make `make check` fail before the emulator starts. Please gate the wrapper to Linux or gracefully skip it when `setarch` is unavailable.</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

override ENABLE_LTO := 0 # LTO interferes with TSAN instrumentation
CFLAGS += -DTSAN_ENABLED # Signal code to use TSAN-compatible allocations
# Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory
BIN_WRAPPER = setarch $(shell uname -m) -R
Copy link

@cubic-dev-ai cubic-dev-ai bot Nov 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TSAN runs now always prepend setarch, but macOS does not ship this binary; enabling TSAN on mac hosts will make make check fail before the emulator starts. Please gate the wrapper to Linux or gracefully skip it when setarch is unavailable.

Prompt for AI agents
Address the following comment on Makefile at line 109:

<comment>TSAN runs now always prepend `setarch`, but macOS does not ship this binary; enabling TSAN on mac hosts will make `make check` fail before the emulator starts. Please gate the wrapper to Linux or gracefully skip it when `setarch` is unavailable.</comment>

<file context>
@@ -80,6 +73,44 @@ endif
+override ENABLE_LTO := 0       # LTO interferes with TSAN instrumentation
+CFLAGS += -DTSAN_ENABLED       # Signal code to use TSAN-compatible allocations
+# Disable ASLR for TSAN tests to prevent allocations in TSAN shadow memory
+BIN_WRAPPER = setarch $(shell uname -m) -R
+else
+BIN_WRAPPER =
</file context>
Suggested change
BIN_WRAPPER = setarch $(shell uname -m) -R
BIN_WRAPPER = $(if $(filter Linux,$(UNAME_S)),setarch $(shell uname -m) -R,)
Fix with Cubic

This commit addresses multiple regressions introduced in recent changes:
1. DTB compilation regression
   - DTB dependencies moved outside CC_IS_EMCC conditional
   - Ensures DTB builds for system mode regardless of compiler
   - Fixes mk/wasm.mk structure for cross-platform consistency
2. Makefile syntax error in mk/toolchain.mk
   - Fixed TAB characters before $(warning) on lines 25, 28
   - Changed to spaces for proper control flow
   - This was blocking all Makefile parsing
3. emcc configuration pollution
   - Added 'make distclean' before emcc builds in workflow
   - Prevents ENABLE_SYSTEM=1 from leaking between builds
   - Fixes "build/minimal.dtb does not exist" errors
4. Ubuntu ARM64 apt-get failures
   - Implemented exponential backoff retry mechanism (30s, 60s delays)
   - Added pipefail to preserve apt exit codes through tee
   - Explicit APT_EXIT capture to detect masked failures
   - Added InRelease to failure pattern (modern combined Release+GPG)
   - Ignore non-critical dep11 metadata failures
   - Focus on core package indices (Packages/Sources/Release/InRelease)
5. TSAN cross-compiler compatibility
   - Added clang TSAN detection via __has_feature(thread_sanitizer)
   - Previously only GCC's __SANITIZE_THREAD__ was checked
   - Ensures __tsan_default_options() works with both compilers
6. TSAN cross-platform compatibility
   - Guarded setarch with ifeq ($(UNAME_S),Linux) in Makefile
   - setarch doesn't exist on macOS, now conditionally applied
   - macOS TSAN builds require SIP disabled for ASLR control
7. Trace functionality regression
   - Reverted .log_level from LOG_INFO back to LOG_TRACE
   - LOG_INFO suppressed rv_log_trace() stream used by -t flag
   - Restores instruction trace output for debugging
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants