Add ability to select backend devices #861

kusaanko · 2025-11-11T08:28:48Z

Currently, LlamaModelParams has a main_gpu field, but it has no effect because split_mode cannot be set to LLAMA_SPLIT_MODE_NONE.

This PR makes split_mode settable.
Moreover, this PR allows you to specify which backend devices to use in safe Rust.

GGML backend device indices do not change until the ggml backends are unloaded, so devices are specified by index.
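As a rough illustration of index-based selection, the sketch below filters a device list down to the indices of GPU devices. The `Device` and `DeviceType` types and the `gpu_indices` helper are hypothetical stand-ins written for this example; they are not the crate's LlamaBackendDevice API, whose exact fields and methods are not shown in this thread.

```rust
/// Hypothetical stand-in for a backend device type (not the crate's enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceType {
    Cpu,
    Gpu,
}

/// Hypothetical stand-in for one entry of a backend device listing.
#[derive(Debug, Clone)]
pub struct Device {
    /// Position of the device in the backend's device list.
    pub index: usize,
    pub name: String,
    pub device_type: DeviceType,
}

/// Collects the indices of all GPU devices, preserving list order.
/// The indices are only meaningful while the backends stay loaded.
pub fn gpu_indices(devices: &[Device]) -> Vec<usize> {
    devices
        .iter()
        .filter(|d| d.device_type == DeviceType::Gpu)
        .map(|d| d.index)
        .collect()
}
```

The resulting indices would then be handed to whatever device-selection setter the model parameters expose.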

llama-cpp-2/src/model/params.rs

llama-cpp-2/src/lib.rs

MarcusDunn

great addition, two small changes I would like to see before merging though.

Replaces the From<i32> implementation for LlamaSplitMode with TryFrom<i32>, returning a custom LlamaSplitModeParseError on invalid values. Updates LlamaModelParams::split_mode to return a Result, improving error handling for unknown split modes.
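A minimal, self-contained sketch of the pattern this commit describes: a fallible conversion from a raw integer plus a getter that returns a Result. The type names follow the commit message, but the bodies are assumptions and not the crate's actual code. `ModelParams` and its `raw_split_mode` field are hypothetical, and the discriminants 0, 1, 2 are assumed to match llama.cpp's LLAMA_SPLIT_MODE_NONE, LAYER and ROW.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Sketch of the split-mode enum (assumed values: None = 0, Layer = 1, Row = 2).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlamaSplitMode {
    None = 0,
    Layer = 1,
    Row = 2,
}

/// Error carrying the raw value that did not map to a known split mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LlamaSplitModeParseError(pub i32);

impl fmt::Display for LlamaSplitModeParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown split mode: {}", self.0)
    }
}

impl std::error::Error for LlamaSplitModeParseError {}

impl TryFrom<i32> for LlamaSplitMode {
    type Error = LlamaSplitModeParseError;

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Self::None),
            1 => Ok(Self::Layer),
            2 => Ok(Self::Row),
            other => Err(LlamaSplitModeParseError(other)),
        }
    }
}

/// Hypothetical wrapper holding the raw value as stored on the C side.
pub struct ModelParams {
    pub raw_split_mode: i32,
}

impl ModelParams {
    /// Returns the split mode, or an error if the raw value is unknown.
    pub fn split_mode(&self) -> Result<LlamaSplitMode, LlamaSplitModeParseError> {
        LlamaSplitMode::try_from(self.raw_split_mode)
    }
}
```

Compared with an infallible From<i32>, this forces callers to decide what to do with a value the wrapper does not recognise instead of silently mapping it to some default.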

llama-cpp-2/src/model/params.rs

MarcusDunn

left a suggestion to clean up docs. Rest looks good.

MarcusDunn · 2025-11-12T19:36:16Z

I'll need https://github.com/utilityai/llama-cpp-rs/actions/runs/19287713535/job/55222838090?pr=861 to pass, then I'm good to merge.

Replaces enum variant values for LlamaSplitMode with explicit integers (0, 1, 2) instead of referencing llama_cpp_sys_2 constants. Adds TryFrom<u32> implementation for LlamaSplitMode to improve type conversion and error handling. Updates documentation and error types for clarity.

kusaanko · 2025-11-13T08:18:58Z

I actually wanted to use LLAMA constants, but since they seem to change between i32 and u32 depending on the platform, I decided to use the values directly instead.

MarcusDunn · 2025-11-13T17:24:57Z

You can use llama_cpp_sys_2::LLAMA_SPLIT_MODE_NONE as _ to cast to the correct platform type. You can also implement TryFrom for both i32 and u32.

Replaces hardcoded integer values in the LlamaSplitMode enum and its conversions with corresponding constants from llama_cpp_sys_2. Adds From<LlamaSplitMode> for u32 and updates TryFrom implementations for better consistency with external definitions.

Introduces constants for split mode values with explicit clippy lint allowances and updates the LlamaSplitMode enum to use these constants. Refactors TryFrom implementations for i32 and u32 to improve error handling and type safety, and makes LlamaSplitModeParseError's field public.

Cleaned up the LlamaSplitMode enum and related conversions by removing redundant clippy allow attributes. Also simplified some type conversions for clarity.

kusaanko · 2025-11-13T18:07:23Z

A cast is not allowed in a match pattern, and I can't use it in an if statement because rustc can't infer the type.

That approach produces clippy warnings, so I decided to introduce constants in params.rs that convert the values to i8 at compile time.
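The constraint described here can be sketched as follows: an expression such as `X as i8` is not a valid match pattern, but a named constant is, so the cast is done once in a const item and the constant is used in the match. This is a sketch under assumptions, not the code in params.rs. The `sys` module stands in for llama_cpp_sys_2 (where the constants' integer type reportedly varies by platform), the local constant names and the `from_raw` helper are hypothetical, and the error here stores an i64 purely so that it can hold any i32 or u32 input.

```rust
use std::convert::TryFrom;

/// Stand-in for the generated bindings; the real constants may be i32 or u32.
mod sys {
    pub const LLAMA_SPLIT_MODE_NONE: u32 = 0;
    pub const LLAMA_SPLIT_MODE_LAYER: u32 = 1;
    pub const LLAMA_SPLIT_MODE_ROW: u32 = 2;
}

// Cast once, at compile time, to a fixed type that is usable in patterns.
const SPLIT_MODE_NONE: i8 = sys::LLAMA_SPLIT_MODE_NONE as i8;
const SPLIT_MODE_LAYER: i8 = sys::LLAMA_SPLIT_MODE_LAYER as i8;
const SPLIT_MODE_ROW: i8 = sys::LLAMA_SPLIT_MODE_ROW as i8;

#[repr(i8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlamaSplitMode {
    None = SPLIT_MODE_NONE,
    Layer = SPLIT_MODE_LAYER,
    Row = SPLIT_MODE_ROW,
}

/// Error carrying the rejected raw value (i64 is an assumption of this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LlamaSplitModeParseError(pub i64);

impl LlamaSplitMode {
    /// Named constants are valid patterns, unlike `X as i8` expressions.
    fn from_raw(raw: i8) -> Option<Self> {
        match raw {
            SPLIT_MODE_NONE => Some(Self::None),
            SPLIT_MODE_LAYER => Some(Self::Layer),
            SPLIT_MODE_ROW => Some(Self::Row),
            _ => Option::None,
        }
    }
}

impl TryFrom<i32> for LlamaSplitMode {
    type Error = LlamaSplitModeParseError;

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        // Values that do not fit in i8 cannot be a known mode.
        i8::try_from(value)
            .ok()
            .and_then(Self::from_raw)
            .ok_or(LlamaSplitModeParseError(i64::from(value)))
    }
}

impl TryFrom<u32> for LlamaSplitMode {
    type Error = LlamaSplitModeParseError;

    fn try_from(value: u32) -> Result<Self, Self::Error> {
        i8::try_from(value)
            .ok()
            .and_then(Self::from_raw)
            .ok_or(LlamaSplitModeParseError(i64::from(value)))
    }
}
```

Implementing TryFrom for both i32 and u32 lets callers pass the raw value through unchanged regardless of which integer type the bindings use on a given platform.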

Update the doc test for LlamaModelParams to assert that split_mode returns Ok(LlamaSplitMode::Layer) instead of just LlamaSplitMode::Layer, reflecting the actual return type.

kusaanko added 3 commits November 11, 2025 17:21

Add ability to select backend devices

a138498

Add device type to LlamaBackendDevice

f93eb1b

Rename LlamaBackendDeviceType enums to PascalCase

750ed12

MarcusDunn reviewed Nov 11, 2025

llama-cpp-2/src/model/params.rs (outdated)

MarcusDunn reviewed Nov 11, 2025

llama-cpp-2/src/lib.rs (outdated)

MarcusDunn requested changes Nov 11, 2025

kusaanko added 4 commits November 12, 2025 14:29

Fix invalid device type convertion to Gpu

7d12f30

Split unsafe block into small parts in list_llama_ggml_backend_devices

99ad808

Implement Default for LlamaSplitMode

3468ae7

MarcusDunn reviewed Nov 12, 2025

llama-cpp-2/src/model/params.rs (outdated)

MarcusDunn approved these changes Nov 12, 2025

kusaanko requested a review from MarcusDunn November 13, 2025 08:19

MarcusDunn approved these changes Nov 13, 2025

kusaanko added 3 commits November 14, 2025 02:50

Use LLAMA_SPLIT_MODE constants defined in params.rs

86e1fc5

Fix split_mode assert

86b1996

kusaanko requested a review from MarcusDunn November 13, 2025 18:07
