@oandreeva-nv
Contributor

Overview:

Updates CUDA fatbin to work consistently across both ARM64 and x86-64 host architectures

Details:

  • Fatbins compiled without explicit 64-bit mode were not recognized on ARM systems
  • Inconsistent pointer sizes and metadata structure between architectures caused "kernel not found" errors

This version of the fatbin was compiled with the -m64 flag, so it uses 64-bit mode consistently on both ARM and x86.
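
For reviewers who want to reproduce the build, here is a minimal sketch of how such an nvcc invocation could be assembled. It only builds the argument list and does not run the compiler. The source file name `vectorized_copy.cu`, the helper name `nvcc_fatbin_command`, and the target architectures (80, 90) are assumptions for illustration; the PR only states that `-m64` was used.

```python
from pathlib import Path


def nvcc_fatbin_command(source, output, sm_archs):
    """Return an nvcc argument list that emits a fatbin in explicit 64-bit mode.

    Hypothetical helper: the file names and architectures passed in are
    illustrative and are not taken from the PR.
    """
    # -m64 selects 64-bit mode explicitly; -fatbin asks nvcc for a fatbin output.
    cmd = ["nvcc", "-m64", "-fatbin"]
    for sm in sm_archs:
        # One -gencode entry per target GPU architecture (assumed values).
        cmd += ["-gencode", f"arch=compute_{sm},code=sm_{sm}"]
    cmd += [str(Path(source)), "-o", str(Path(output))]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be copied into a shell on either host architecture.
    print(" ".join(nvcc_fatbin_command(
        "vectorized_copy.cu", "vectorized_copy.fatbin", [80, 90])))
```

Running the same command on both an ARM64 and an x86-64 host and then loading the resulting fatbin on each would be one way to confirm the "kernel not found" error no longer appears.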

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Signed-off-by: Olga Andreeva <oandreeva@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Dec 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Dec 2, 2025

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (1)
  • lib/llm/src/block_manager/block/transfer/kernels/vectorized_copy.fatbin

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@oandreeva-nv changed the title from "fix : Updates CUDA fatbin for KVBM" to "fix: Updates CUDA fatbin for KVBM" Dec 2, 2025
@github-actions github-actions bot added the fix label Dec 2, 2025
@oandreeva-nv
Contributor Author

/ok to test eadc847

@oandreeva-nv
Contributor Author

/ok to test 47f9c54
