-
Notifications
You must be signed in to change notification settings - Fork 118
Add ability to select backend devices #861
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
MarcusDunn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great addition, two small changes I would like to see before merging though.
Replaces the From<i32> implementation for LlamaSplitMode with TryFrom<i32>, returning a custom LlamaSplitModeParseError on invalid values. Updates LlamaModelParams::split_mode to return a Result, improving error handling for unknown split modes.
MarcusDunn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a suggestion to clean up docs. Rest looks good.
|
I'll need https://github.com/utilityai/llama-cpp-rs/actions/runs/19287713535/job/55222838090?pr=861 to pass, then I'm good to merge. |
Replaces enum variant values for LlamaSplitMode with explicit integers (0, 1, 2) instead of referencing llama_cpp_sys_2 constants. Adds TryFrom<u32> implementation for LlamaSplitMode to improve type conversion and error handling. Updates documentation and error types for clarity.
|
I actually wanted to use LLAMA constants, but since they seem to change between i32 and u32 depending on the platform, I decided to use the values directly instead. |
|
you can use |
Replaces hardcoded integer values in the LlamaSplitMode enum and its conversions with corresponding constants from llama_cpp_sys_2. Adds From<LlamaSplitMode> for u32 and updates TryFrom implementations for better consistency with external definitions.
Introduces constants for split mode values with explicit clippy lint allowances and updates the LlamaSplitMode enum to use these constants. Refactors TryFrom implementations for i32 and u32 to improve error handling and type safety, and makes LlamaSplitModeParseError's field public.
Cleaned up the LlamaSplitMode enum and related conversions by removing redundant clippy allow attributes. Also simplified some type conversions for clarity.
|
Cast is not allowed in match statement, and I can't use it in if statement because rustc can't infer the type. The way generates warnings by clippy, so I decided to introduce constants in params.rs to convert to i8 at compile time. |
Update the doc test for LlamaModelParams to assert that split_mode returns Ok(LlamaSplitMode::Layer) instead of just LlamaSplitMode::Layer, reflecting the actual return type.
Now, LlamaModelParams has a field
main_gpubut it is not utilized because we can't setsplit_modetoLLAMA_SPLIT_MODE_NONE.This PR makes
split_modesettable.Moreover, this PR allows you to specify which backend devices to use in safe Rust.
The GGML backend device indices are not modified until you unload ggml backends, so the indices are used to specify devices.