-
Notifications
You must be signed in to change notification settings - Fork 235
ENH] Change default dataset download path to user home directory #3104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for contributing to
|
|
The CI failures is common to all the PRs. It would be fixed. |
|
Was there an issue with this PR? |
Hi, thanks for following up @MatthewMiddlehurst I closed the previous PR because, after reviewing it again, I realised that the change I made (replacing The real problem is that downloaded TSC datasets are currently saved inside the package directory ( For the proper fix, I’m planning to move the default dataset storage location to a user-writable directory using Before I start the new PR, could you please confirm that I’ll proceed as soon as I have your confirmation. |
|
It would be better to edit your current PR rather than create a new one for changes like this. Rather than AppData it should probably be wherever the code is run from right? i.e. if you run If you don't mind answering, why did you initially only change the method to find the path? Just curious whether it was an problem with the issue. |
I’m happy to continue with the existing PR instead of opening a new one. I closed it only because I thought the fix was incomplete and possibly misleading. About the default location: I’m happy to implement this and I’ll update the PR accordingly. |
Initially, I interpreted the issue as primarily a problem with how the dataset directory was being resolved, not where it should live. The use of So my first instinct was to fix the path resolution method to reduce those inconsistencies. Only after digging deeper into the datasets module did I realize the real problem is not about resolution, but about location. So my first PR only improved the mechanics of path construction, not the logic of choosing the correct base directory. Once I connected that the issue is actually about selecting a new default storage root, I realized the initial fix wasn’t addressing the underlying problem so I stepped back to rethink the design. |
|
Interesting to hear that other packages put it in AppData. May be worth asking on Slack if there is a preference? I think for now if you want to edit the PR just do it locally as its an improvement over the current method, and we can change it later if necessary. |
a042800 to
5116a3b
Compare
c72059b to
bf3b0cb
Compare
[ENH] Change default dataset download path to user home directory
What does this implement/fix?
Currently,
aeondownloads datasets into the installed package directory (site-packages/venv) by default. This causes issues with virtual environment persistence, permissions, and makes it difficult for users to locate the downloaded data.This change updates the default download behavior to use the user's home directory (
~/.aeon/data) while preserving backward compatibility for bundled datasets used in CI/tests.Updated files:
aeon/datasets/_data_loaders.py: Updated logic in_load_tsc_dataset,load_classification,load_forecasting,load_regression, anddownload_all_regressionto prioritize bundled data, then default to the user home directory. Updated_download_and_extractto support this default.aeon/datasets/tests/test_data_loaders.py: Updatedshutil.rmtreecleanup calls toignore_errors=Trueto prevent test failures when the deprecatedlocal_datafolder is not created.Summary
The logic for determining
local_module(where data is loaded/saved) has been updated with a hybrid check:aeon/datasets/data). If found (e.g.,UnitTest), it uses that path. This ensures CI and unit tests remain green.Path.home() / ".aeon" / "data".extract_path, that path is respected and used.Does your contribution introduce a new dependency?
No
Any other comments?
I have verified this locally by running
load_gunpoint()(which now correctly downloads to~/.aeon) and by running the unit testspytest aeon/datasets/tests/test_data_loaders.py(which pass).PR checklist