-
Notifications
You must be signed in to change notification settings - Fork 100
[FEATURE] uv provider
#164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Wauplin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @burtenshaw , thanks for the ping! Happy to review a PR :) I've left quite a lot of comments on how I would implement things if starting the project right now. Feel free to ignore when not applicable or if I did not take into account other relevant parts of the library. In general my main comment is that I always tend to reduce the number of available parameters ("if no-one asked for it, let's not implement it") and isolate logic ("let's not have Hub-logic or docker-logic in uv provider").
Given that it's the second provider implemented, I think it's good to take some time to think about which API makes the most sense for the future of the library.
Let me know if you have any questions / comments 🤗
|
|
||
| import requests | ||
|
|
||
| from .providers import ContainerProvider |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since uv is not per-se a container, I would rename the abstract class RuntimeProvider (or simply Runtime?) and the abstract methods start, stop, and wait_for_ready (instead of start_container, etc.). I feel that semantically it would be more accurate.
(just my 2c, feel free to ignore 🤗 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you @Wauplin , we should clarify naming here. However, I'm reluctant to let this PR swell whilst we're trying to move fast.
For now, I've added a new ABC named RuntimeProvider which has no container specific methods and is used by the UVProvider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understandable! Works like this as well :)
|
|
||
|
|
||
| @dataclass | ||
| class UVProvider(ContainerProvider): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| class UVProvider(ContainerProvider): | |
| class UVRuntime(RuntimeProvider): |
(naming suggestion to be aligned with the "runtime/" folder)
| connect_host: Optional[str] = None, | ||
| extra_env: Optional[Dict[str, str]] = None, | ||
| **provider_kwargs: Any, | ||
| ) -> EnvClientT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this signature very hard to understand without context since it's mixing kwargs for docker and for uv. Also I don't think one needs use_docker, provider, and runner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion would be to remove provider and runner. Also remove project_url and connect_host. Have a single env (instead of env for docker and extra_env for uv). And use some typed dict + typing.overload so that IDEs can have correct autocompletion. Here is a simplified example:
from typing import Any, Dict, NotRequired, TypedDict, Unpack, overload
class DockerKwargs(TypedDict, total=False):
tag: NotRequired[str]
env: NotRequired[Dict[str, str]]
class UVProvider(TypedDict):
host: NotRequired[str]
port: NotRequired[int]
reload: NotRequired[bool]
timeout_s: NotRequired[float]
env: NotRequired[Dict[str, str]]
@overload
def from_hub(repo_id: str, *, use_docker: bool = True, **kwargs: Unpack[DockerKwargs]) -> str: ...
@overload
def from_hub(repo_id: str, *, use_docker: bool = False, **kwargs: Unpack[UVProvider]) -> str: ...
def from_hub(repo_id: str, *, use_docker: bool = False, **kwargs: Any) -> str:
raise NotImplementedError()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I'm not a fan of overloads and typed dict but it's the only solution I see to correctly document the signature while keeping a single method.
Another solution is to have from_hub_docker and from_hub_uv (more explicit but less elegant)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. I've simplified the signatures right down, but I haven't added type overloading in this PR.
src/core/http_env_client.py
Outdated
| base_url = uv_runner.start_container( | ||
| repo_id, | ||
| port=port, | ||
| env_vars=env_vars, | ||
| **non_docker_kwargs, | ||
| ) | ||
|
|
||
| try: | ||
| uv_runner.wait_for_ready(base_url, timeout_s=timeout_s) | ||
| except Exception: | ||
| uv_runner.stop_container() | ||
| raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the runner is defined and instantiated, I feel that the start and wait calls should be made in the __init__ instead. This way you get the same behavior both in from_hub and from_docker. With current implementation, one stops the runner if failing to start, the other don't (which is not consistent).
src/core/http_env_client.py
Outdated
| uv_runner = runner or UVProvider( | ||
| repo_id=repo_id, | ||
| host=host, | ||
| port=port, | ||
| reload=reload, | ||
| project_url=project_url, | ||
| connect_host=connect_host, | ||
| extra_env=extra_env, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| uv_runner = runner or UVProvider( | |
| repo_id=repo_id, | |
| host=host, | |
| port=port, | |
| reload=reload, | |
| project_url=project_url, | |
| connect_host=connect_host, | |
| extra_env=extra_env, | |
| ) | |
| runner = UVProvider( | |
| project=f"git+https://huggingface.co/spaces/{repo_id}", | |
| host=host, | |
| port=port, | |
| reload=reload, | |
| env=env, | |
| ) |
With all the suggestions above, that would be the runner call (i.e. provide a project url, remove connect_host, remove project_url, rename extra_env to env, and remove runner). I don't think final user loose much in this simplification.
Co-authored-by: Lucain <lucainp@gmail.com>
This PR adds a uv provider to open env core which can be used to run envs on spaces without docker. The main intention here is to make a pure python option for notebook environments like colab.
This is a working example based on a specific env, which is the result of #160 .
We can interact directly with the uv provided env like so: