Skip to content

Conversation

@devnexen
Copy link
Contributor

@devnexen devnexen commented Jul 2, 2025

No description provided.


nvml_try(sym(self.device, &mut hints, ptr::null_mut()))?;

hints.checked_shr(1);
Copy link
Contributor

@swlynch99 swlynch99 Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a typo. Did you mean to do this instead? I'm not familiar enough with this method to know if we should be shifting the hint flags here.

hints = hints.checked_shr(1);

Copy link
Contributor Author

@devnexen devnexen Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I just forgot to treat this part. You re almost right, since it s the checked flavor it Option<Self> return type

@devnexen devnexen force-pushed the mps_compute_running_processes branch 2 times, most recently from a97ff1b to 3b221e2 Compare July 2, 2025 23:14
Copy link
Contributor

@swlynch99 swlynch99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at bit more closely this seems completely wrong. See my review comments for what needs to be fixed.

The errors here are a bit weird to me. If you're using AI to generate these PRs that's completely fine, but please validate that the resulting code is using the APIs correctly by looking at the docs before submitting them.

Comment on lines 651 to 681
unsafe {
let mut hints: c_uint = 0;

nvml_try(sym(self.device, &mut hints, ptr::null_mut()))?;

let mut hints_shr = match hints.checked_shr(1) {
Some(h) => h,
None => hints,
};
let mut processes: Vec<nvmlProcessInfo_t> = Vec::with_capacity(hints_shr as usize);

nvml_try(sym(self.device, &mut hints_shr, processes.as_mut_ptr()))?;

processes.truncate(hints_shr as usize);
Ok(processes.into_iter().map(ProcessInfo::from).collect())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bit with checked_shr confused me so I went and took a look at the docs to confirm. This section is wrong.

The way this method is supposed to be used seems to be like this:

  • Call the function with count set to 0 and infos set to NULL
  • If the call doesn't error then we're done and know there are no processes, so can return an empty Vec
  • Otherwise it should return NVML_ERROR_INSUFFICIENT_SIZE and update count to the number of processes
  • You should then be using count to allocate an array and passing that in for a second call so that it stores the data into the array.
  • Finally, truncate does not extend vector, you need set_len for that.

So in order:

  • hints is absolutely the wrong variable name. Use len.
  • Drop all the shifting, it makes no sense.
  • You need to handle the relevant errors from the first call to nvmlDeviceGetMPSComputeRunningProcesses_v3. If it succeeds you should just be returning an empty Vec.
  • truncate does not do what you want here, use set_len instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good points !

Supports Volta or newer fully supported devices.
*/
#[doc(alias = "nvmlDeviceGetiMPSComputeRunningProcesses_v3")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've got a typo here

Suggested change
#[doc(alias = "nvmlDeviceGetiMPSComputeRunningProcesses_v3")]
#[doc(alias = "nvmlDeviceGetMPSComputeRunningProcesses_v3")]

@devnexen devnexen force-pushed the mps_compute_running_processes branch from 3b221e2 to 94f8239 Compare July 3, 2025 04:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants