Pull requests: jd-opensource/xllm
refactor: separate mlu and cuda version Qwen model implementation.
#468 opened Dec 1, 2025 by XuZhang99
feat: determine device type automatically except npu device.
#467 opened Dec 1, 2025 by XuZhang99
feat: add NPU process group initialization and management.
#456 opened Nov 29, 2025 by yingxudeng
feat: implement chunked prefill and prefix cache for Qwen3 MoE.
#455 opened Nov 29, 2025 by yingxudeng
refactor: optimize unique token count preparation of batch input builder.
#449 opened Nov 27, 2025 by RobbieLeung
refactor: move draft input preparation of decode batch from worker to batch builder.
#448 opened Nov 27, 2025 by RobbieLeung
[WIP] feat: support loading model weights and forward overlap.
#441 opened Nov 26, 2025 by Clement-Wang26
feat: support Qwen2-VL & GME-Qwen2-VL model on npu device.
#399 opened Nov 18, 2025 by xanecdotex
feat: enable torch_npu graph mode for Qwen-3 dense with TP support.
#325 opened Nov 6, 2025 by yingxudeng