[ENH]: Allow specifying multiple filter keys in get_statistics #5963
Enable multi-key filtering. This summary was automatically generated by @propel-code-bot
kylediaz
left a comment
let's gooo
chromadb/utils/statistics.py
Outdated
-    collection: "Collection", stats_collection_name: str, key: Optional[str] = None
+    collection: "Collection",
+    stats_collection_name: str,
+    keys: Optional[List[str]] = None,
It would be good to accept OneOrMany here imo; it's always annoying to wrap a single input in a list.
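To illustrate the suggestion, here is a minimal sketch of a one-or-many parameter. The `OneOrMany` alias and the `cast_one_to_many` helper below are local, hypothetical stand-ins written for this example; they are not copied from the project's own implementation, which may differ.

```python
from typing import List, Optional, TypeVar, Union

T = TypeVar("T")

# Hypothetical alias: a caller may pass a single value or a list of values.
OneOrMany = Union[T, List[T]]


def cast_one_to_many(value: Optional[OneOrMany[str]]) -> Optional[List[str]]:
    """Normalize a single string or a list of strings to a list (None stays None)."""
    if value is None:
        return None
    if isinstance(value, str):
        # A bare string is treated as one key, not as an iterable of characters.
        return [value]
    return list(value)
```

With a helper like this, callers can write `keys="color"` instead of `keys=["color"]`, and the function body only ever deals with a list.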
 def get_statistics(
-    collection: "Collection", stats_collection_name: str, key: Optional[str] = None
+    collection: "Collection",
+    stats_collection_name: str,
+    keys: Optional[OneOrMany[str]] = None,
 ) -> Dict[str, Any]:
[Maintainability] Renaming the `key` parameter to `keys` is a breaking API change for any callers using keyword arguments. While the new name is more descriptive, the rename could be made backward-compatible to avoid breaking existing user code and provide a smoother transition.
If this breaking change is intentional, it should be clearly documented in the project's release notes. If not, you could consider handling the old `key` parameter with a deprecation warning.
Here's an example of how you could support both for a transition period by modifying the function signature and adding logic to the function body:
**Signature:**
```python
def get_statistics(
    collection: "Collection",
    stats_collection_name: str,
    keys: Optional[OneOrMany[str]] = None,
    key: Optional[str] = None,  # Deprecated
) -> Dict[str, Any]:
```
**Function Body Logic:**
```python
if key is not None:
    import warnings

    warnings.warn(
        "The 'key' parameter is deprecated and will be removed in a future version. Please use 'keys' instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    if keys is not None:
        raise ValueError("Cannot provide both 'key' and 'keys' arguments.")
    keys = key
```
File: chromadb/utils/statistics.py
Line: 127
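The transition logic suggested in this comment can be tried in isolation. The sketch below is an assumption-laden illustration: `resolve_keys` is a hypothetical helper invented for this example (it is not part of the PR), and it only shows how the deprecated `key` argument could be folded into `keys`.

```python
import warnings
from typing import List, Optional, Union


def resolve_keys(
    keys: Optional[Union[str, List[str]]] = None,
    key: Optional[str] = None,  # Deprecated alias for `keys`
) -> Optional[List[str]]:
    """Hypothetical helper: merge the deprecated `key` argument into `keys`."""
    if key is not None:
        warnings.warn(
            "The 'key' parameter is deprecated; use 'keys' instead.",
            DeprecationWarning,
            stacklevel=2,  # Attribute the warning to the caller's line.
        )
        if keys is not None:
            raise ValueError("Cannot provide both 'key' and 'keys' arguments.")
        keys = key
    if keys is None:
        return None
    # Normalize a single string to a one-element list.
    return [keys] if isinstance(keys, str) else list(keys)
```

Existing callers that pass `key="color"` keep working and see a `DeprecationWarning`, while passing both arguments fails fast with a `ValueError` instead of silently preferring one.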
keys_list = maybe_cast_one_to_many(keys)

# Validate keys count to avoid issues with large $in queries
MAX_KEYS = 30
[Maintainability] Consider defining `MAX_KEYS` as a module-level constant (e.g., at the top of the file) instead of inside the function. This improves readability, makes it clear that this is a fixed configuration value, and avoids re-declaration on every function call.
File: chromadb/utils/statistics.py
Line: 188
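A minimal sketch of the module-level constant suggested in this comment might look like the following. The value `30` comes from the diff; the `validate_keys` helper and its error message are hypothetical and only illustrate where the check could live.

```python
from typing import List

# Module-level constant: defined once at import time and easy to locate.
MAX_KEYS = 30


def validate_keys(keys_list: List[str]) -> List[str]:
    """Hypothetical helper: reject key lists longer than MAX_KEYS."""
    if len(keys_list) > MAX_KEYS:
        raise ValueError(
            f"Too many keys: got {len(keys_list)}, maximum is {MAX_KEYS}."
        )
    return keys_list
```

Keeping the limit at module scope also lets tests and callers reference `MAX_KEYS` directly rather than hard-coding the number.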

Description of changes
Summarize the changes made by this PR.
Test plan
How are these changes tested?
`pytest` for python, `yarn test` for js, `cargo test` for rust
Migration plan
Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?
Observability plan
What is the plan to instrument and monitor this change?
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?