Member

@casenave casenave commented Nov 22, 2025

Checklist

  • Typing enforced
  • Documentation updated
  • Changelog updated
  • Tests and Example updates
  • Coverage should be 100%

Factored out the functions needed by all backends, refactored the HF_datasets backend, and added a new zarr backend.
The webdataset backend is not provided yet. Storing one tar per feature per sample ran into unexpected difficulties: iterating over the dataset does not make it easy to group files per sample. An efficient trick is still needed, perhaps storing a dict whose keys are sample ids and whose values are the lists of tar files for each sample. Would this be efficient for iteration and plaid sample reconstruction?
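
The dict idea above could look roughly like the following. This is only a minimal sketch, not the backend's implementation: the `<root>/<sample_id>/<feature>.tar` layout and the `build_sample_index` helper are hypothetical, and it uses only the standard library rather than any webdataset API.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def build_sample_index(tar_paths):
    """Group tar file paths by sample id.

    Assumption (hypothetical layout): each path looks like
    '<root>/<sample_id>/<feature>.tar', so the parent directory
    name is the sample id.
    """
    index = defaultdict(list)
    for path in tar_paths:
        sample_id = PurePosixPath(path).parent.name
        index[sample_id].append(path)
    # Sort each list so the features of a sample come back in a stable order.
    return {sample_id: sorted(paths) for sample_id, paths in index.items()}


# Paths may arrive interleaved across samples, as when iterating over shards.
paths = [
    "data/sample_0/pressure.tar",
    "data/sample_1/pressure.tar",
    "data/sample_0/mesh.tar",
]
index = build_sample_index(paths)
```

Building the index is a single pass over the paths, and looking up one sample afterwards is a dict access. Whether opening several small tar files per sample is fast enough for iteration and plaid sample reconstruction remains the open question.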

Implemented:

Tests, examples and tutorials to come

🔗 Related issues

Closes #270, #277

@casenave casenave requested a review from a team as a code owner November 22, 2025 20:53
@casenave casenave marked this pull request as draft November 22, 2025 20:53

codecov bot commented Nov 22, 2025

Codecov Report

❌ Patch coverage is 38.63636% with 216 lines in your changes missing coverage. Please review.

Files with missing lines            Patch %    Lines
src/plaid/storage/reader.py         0.00%      144 Missing ⚠️
src/plaid/storage/writer.py         0.00%      64 Missing ⚠️
src/plaid/problem_definition.py     45.45%     6 Missing ⚠️
src/plaid/storage/__init__.py       0.00%      2 Missing ⚠️


Development

Successfully merging this pull request may close these issues.

[STORAGE] Rename bridges to storage and organize for various storage solutions

3 participants