Skip to content

Conversation

@kusaanko
Copy link
Contributor

Now, LlamaModelParams has a field main_gpu but it is not utilized because we can't set split_mode to LLAMA_SPLIT_MODE_NONE.

This PR makes split_mode settable.
Moreover, this PR allows you to specify which backend devices to use in safe Rust.

The GGML backend device indices are not modified until you unload ggml backends, so the indices are used to specify devices.

Copy link
Contributor

@MarcusDunn MarcusDunn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great addition, two small changes I would like to see before merging though.

Replaces the From<i32> implementation for LlamaSplitMode with TryFrom<i32>, returning a custom LlamaSplitModeParseError on invalid values. Updates LlamaModelParams::split_mode to return a Result, improving error handling for unknown split modes.
Copy link
Contributor

@MarcusDunn MarcusDunn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a suggestion to clean up docs. Rest looks good.

@MarcusDunn
Copy link
Contributor

Replaces enum variant values for LlamaSplitMode with explicit integers (0, 1, 2) instead of referencing llama_cpp_sys_2 constants. Adds TryFrom<u32> implementation for LlamaSplitMode to improve type conversion and error handling. Updates documentation and error types for clarity.
@kusaanko
Copy link
Contributor Author

I actually wanted to use LLAMA constants, but since they seem to change between i32 and u32 depending on the platform, I decided to use the values directly instead.

@kusaanko kusaanko requested a review from MarcusDunn November 13, 2025 08:19
@MarcusDunn
Copy link
Contributor

MarcusDunn commented Nov 13, 2025

you can use llama_cpp_sys_2::LLAMA_SPLIT_MODE_NONE as _ to cast to the correct platform type, you can implement TryFrom for both i32 and u32 as well.

Replaces hardcoded integer values in the LlamaSplitMode enum and its conversions with corresponding constants from llama_cpp_sys_2. Adds From<LlamaSplitMode> for u32 and updates TryFrom implementations for better consistency with external definitions.
Introduces constants for split mode values with explicit clippy lint allowances and updates the LlamaSplitMode enum to use these constants. Refactors TryFrom implementations for i32 and u32 to improve error handling and type safety, and makes LlamaSplitModeParseError's field public.
Cleaned up the LlamaSplitMode enum and related conversions by removing redundant clippy allow attributes. Also simplified some type conversions for clarity.
@kusaanko
Copy link
Contributor Author

Cast is not allowed in match statement, and I can't use it in if statement because rustc can't infer the type.

The way generates warnings by clippy, so I decided to introduce constants in params.rs to convert to i8 at compile time.

Update the doc test for LlamaModelParams to assert that split_mode returns Ok(LlamaSplitMode::Layer) instead of just LlamaSplitMode::Layer, reflecting the actual return type.
@kusaanko kusaanko requested a review from MarcusDunn November 13, 2025 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants