Collaborator

@lmb lmb commented Oct 31, 2025

ebpf-go loads native images via ebpf_object_load_native_by_fds. Unlike ebpf_object_load_native, this function requires the caller to provide memory for the map and program fds. If the caller doesn't provide enough space, EBPF_NO_MEMORY is returned along with a sizing hint, and the caller is expected to retry with a larger buffer.

This behaviour is currently broken for two reasons. First, the error recovery path of _ebpf_object_load_native stops and deletes the service, but does not wait for the deletion to complete. This is a problem because each native image can currently only be loaded once, so the load fails when ebpf-go retries it with larger buffers. Second, the check which issues the EBPF_NO_MEMORY result happens at a point where we can't reliably delete the service, since we don't know its name.

Fix this by waiting for the service to be deleted on the error path and by pushing the EBPF_NO_MEMORY condition down into _ebpf_object_load_native. The latter requires some creative semantics for count_of_maps and count_of_programs: they now specify the maximum number of maps and programs the caller is willing to accept.

Collaborator Author

lmb commented Oct 31, 2025

cc @ExceptionalHandler

@lmb lmb force-pushed the load-native-wait-for-deletion branch from f5ec179 to ad9c218 November 4, 2025 16:45
@lmb lmb marked this pull request as ready for review November 5, 2025 09:54
@lmb lmb enabled auto-merge November 5, 2025 17:23