Skip to content

Conversation

@loadingalias
Copy link
Contributor

The Problem

The crc-fast library is currently std-only, which prevents its use in embedded systems (bare metal, RTOS), WebAssembly environments, and other no_std contexts where the standard library is unavailable.

This limits adoption in:

  • Embedded devices (ARM Cortex-M, RISC-V microcontrollers)
  • WASM modules for web applications
  • Kernel/bootloader development
  • Safety-critical systems requiring no heap allocation

Additionally, the software fallback implementation had memory leaks detected by Miri, and algorithm.rs contained undefined behavior (potential out-of-bounds read).

The Solution

Add comprehensive no_std compatibility while maintaining 100% backward compatibility with existing std-based code. All existing APIs and functionality remain unchanged when using default features.

Changes

Core Architecture:

  • Added std feature (enabled by default) to preserve existing behavior
  • Added alloc feature for heap allocation in no_std environments
  • Added cache feature for optional caching in no_std (uses hashbrown HashMap)
  • Refactored modules to use core:: and alloc:: primitives instead of std::

Synchronization:

  • Added spin crate for no_std synchronization primitives (Once, RwLock, Mutex)
  • Dual-path feature detection: runtime detection with std, compile-time detection without

Bug Fixes:

  • Fixed UB in algorithm.rs (potential out-of-bounds read on line 74)
  • Fixed Miri-detected memory leaks in software fallback caching
  • All code now passes Miri validation with zero leaks or UB

Testing:

  • Added tests/no_std_tests.rs - 13 comprehensive tests for no_std functionality
  • Added tests/wasm_tests.rs - 13 comprehensive tests for WASM compatibility
  • Added CI jobs for embedded targets: thumbv7em-none-eabihf, thumbv8m.main-none-eabihf, riscv32imac-unknown-none-elf
  • Added CI jobs for WASM targets: wasm32-unknown-unknown, wasm32-wasip1, wasm32-wasip2
  • Added Miri test job to catch UB and memory leaks

Tricky Areas:

  • src/cache.rs (lines 25-150): Dual implementation using either std or spin synchronization
  • src/feature_detection.rs (lines 50-120): Runtime vs compile-time feature detection paths
  • src/algorithm.rs (line 74): Bounds checking fix for the UB issue
  • src/arch/software.rs (lines 200-350): Memory leak fixes in table caching

Planned version bump

  • Which: MINOR
  • Why: Non-breaking new functionality (no_std and WASM support), security fixes (UB and memory leaks), with full backward compatibility. All existing code works unchanged with default features.

Notes

This PR is ready for review. All 193 tests pass locally and in CI across all platforms (x86, x86_64, aarch64, embedded ARM, RISC-V, WASM).

The std feature is enabled by default, so this change is completely transparent to existing users. New users can opt into no_std by specifying default-features = false in their Cargo.toml.

Testing matrix expanded to include:

  • All existing x86/x86_64/aarch64 targets (unchanged)
  • Embedded ARM Cortex-M4F/M7F (thumbv7em)
  • Embedded ARM Cortex-M33/M35P (thumbv8m)
  • Embedded RISC-V 32-bit (riscv32imac)
  • WASM 1.0/2.0 (wasm32-unknown-unknown)
  • WASI preview 1 & 2 (wasm32-wasip1, wasm32-wasip2)
  • Miri validation (zero UB, zero memory leaks)
    *** WASM64 works, but it's not stable in the nightly toolchain yet. Disabled in CI.

Feature combinations tested:

  • no_std alone (minimal, no heap)
  • no_std + alloc (with heap allocation)
  • no_std + cache (with caching enabled)
  • Default (std + alloc + panic-handler)

This commit adds comprehensive no_std compatibility while maintaining 100%
backward compatibility with existing std-based code.

Key changes:
- Add optional `std` feature (enabled by default)
- Add `alloc` and `cache` features for heap allocation in no_std
- Use spin crate for no_std synchronization (Once, RwLock, Mutex)
- Use hashbrown for no_std HashMap when cache feature enabled
- Fix UB in algorithm.rs (potential out-of-bounds read)
- Fix Miri memory leaks in software fallback caching
- Add dual-path feature detection (runtime with std, compile-time without)
- Add comprehensive test coverage (tests/no_std_tests.rs, tests/wasm_tests.rs)
- Add CI workflows for embedded (ARM, RISC-V) and WASM targets
- Update all modules to use core/alloc primitives where appropriate

Tested on:
- Embedded: thumbv7em, thumbv8m, riscv32
- WASM: wasm32-unknown-unknown, wasm32-wasip1, wasm32-wasip2
- All existing x86/x86_64/aarch64 targets
- Miri validation passes (no memory leaks)
@onethumb
Copy link
Contributor

@loadingalias This looks really good, thanks! Reviewing now. Would you mind updating the README with the various features so people can figure out how to build each of the targets? 👍

@onethumb onethumb requested a review from Copilot November 11, 2025 23:08
Copilot finished reviewing on behalf of onethumb November 11, 2025 23:09
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds comprehensive no_std and WASM support to the crc-fast library while maintaining full backward compatibility. The changes include:

  • Adding configurable feature flags (std, alloc, cache, ffi, panic-handler)
  • Implementing dual synchronization paths (std vs spin crate for no_std)
  • Enhancing FFI error handling with proper error codes and thread-local error tracking
  • Fixing undefined behavior and memory leaks in algorithm.rs and software fallback

Reviewed Changes

Copilot reviewed 33 out of 36 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/wasm_tests.rs New comprehensive WASM compatibility test suite with 13 tests
tests/no_std_tests.rs New no_std test suite validating core functionality without std
Cargo.toml Added spin and hashbrown dependencies, restructured features for no_std support
src/lib.rs Added no_std support with panic handler, conditional imports for alloc/std
src/ffi.rs Enhanced with error handling system, fallible conversions, better null safety
src/feature_detection.rs Dual-path implementation for runtime (std) vs compile-time (no_std) detection
src/cache.rs Refactored to support both std::sync and spin synchronization primitives
src/algorithm.rs Fixed UB with DataRegion struct for safe overlapping SIMD reads
src/arch/* Updated SIMD intrinsic imports from std::arch to core::arch
.github/workflows/tests.yml Added CI jobs for embedded targets and WASM platforms
Comments suppressed due to low confidence (1)

src/cache.rs:69

  • Fixed trailing whitespace in doc comment.
    /// * `poly` - Polynomial value for the CRC algorithm

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

//! Tests that the library works in WebAssembly. These tests run natively
//! (test framework requires std) but exercise code paths used in WASM.
//!
//! Run tests: cargo test --test wasm_test
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected test file name from 'wasm_test' to 'wasm_tests' in comment.

Copilot uses AI. Check for mistakes.
//! Tests the library works without std. The test framework requires std,
//! but this exercises all no_std code paths.
//!
//! Run tests: cargo test --test real_no_std_test
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected test command from 'real_no_std_test' to 'no_std_tests' in comment.

Copilot uses AI. Check for mistakes.
Comment on lines 769 to 774
const VERSION: &CStr =
match CStr::from_bytes_with_nul(concat!(env!("CARGO_PKG_VERSION"), "\0").as_bytes()) {
Ok(version) => version,
Err(_) => panic!("package version contains null bytes??"),
// Fallback to "unknown" if version string is malformed
Err(_) => c"unknown",
};
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback to 'unknown' doesn't set an error code. Consider calling set_last_error(CrcFastError::StringConversionError) in the Err arm for consistency with other functions that document error behavior.

Copilot uses AI. Check for mistakes.
bench = true

[dependencies]
crc = "3"
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When default-features = false is specified, the 'alloc' feature may not be available. Consider removing the features array or documenting why this specific combination is needed, as it could cause build failures if the digest crate's alloc feature depends on its default features.

Suggested change
crc = "3"
crc = "3"
# We use digest with default-features = false and features = ["alloc"] to enable heap allocation support in no_std environments.
# This configuration is safe because the alloc feature in digest does not depend on its default features as of digest v0.10.

Copilot uses AI. Check for mistakes.
Comment on lines 484 to 489
/// Data region descriptor for overlapping SIMD reads in CRC processing
struct DataRegion<'a> {
full_data: &'a [u8],
offset: usize,
remaining: usize,
}
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DataRegion struct lacks documentation about its invariants. The function documentation mentions requirements (offset >= CRC_CHUNK_SIZE, remaining in 1..=15), but these should also be documented on the struct itself as safety-critical invariants.

Copilot uses AI. Check for mistakes.
- Add comprehensive Features section to README explaining all feature flags
- Document no_std and WASM build instructions with examples
- Add safety invariant documentation to DataRegion struct
- Address PR review feedback for improved documentation clarity
@loadingalias
Copy link
Contributor Author

I've addressed the README.md update and tried to keep it minimal and clear. I also added a couple tiny updates to docs/comments for clarity as per the Copilot overview. Cool feature. Haha.

Let me know if it's good to go!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants