-
Notifications
You must be signed in to change notification settings - Fork 14k
Win: Fix std::fs::rename failing on Windows Server by attempting the non-atomic rename first #138133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
r? @ChrisDenton rustbot has assigned @ChrisDenton. Use |
|
☔ The latest upstream changes (presumably #138208) made this pull request unmergeable. Please resolve the merge conflicts. |
|
Ok, I've been testing this on many Windows versions and on different filesystem. I think the better approach is to test for the following error codes:
|
What are the non-atomic semantics on windows? Unix software generally expects renames to be atomic. |
|
@rustbot author |
|
Reminder, once the PR becomes ready for a review, use |
|
@Fulgen301 |
|
I'll fix it up before the week is over. |
In addition to ERROR_INVALID_PARAMETER for Windows versions older than Windows 10 1607, some Windows versions may also fail with ERROR_INVALID_FUNCTION or ERROR_NOT_SUPPORTED to indicate that atomic rename is not supported. Fallback to FileRenameInfo in those cases.
ChrisDenton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do a proper review when I have some time to do some testing on various Windows versions, filesystems, etc. But a few quick observations:
| // This is a dynamically sized struct so we need to get the position of the last field to calculate the actual size. | ||
| let Ok(new_len_without_nul_in_bytes): Result<u32, _> = ((new.count_bytes() - 1) * 2).try_into() | ||
| else { | ||
| return Err(WinError::INVALID_PARAMETER).io_result(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably ERROR_FILENAME_EXCED_RANGE would be better or else a custom error message. Even if it's not strictly what happened before.
| Err(err) if err.raw_os_error() == Some(c::ERROR_INVALID_PARAMETER as _) => None, | ||
| Err(err) => Some(Err(err)), | ||
| } | ||
| .unwrap_or_else(|| create_file(0, 0))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the error codes that are handled I worry this doesn't account for reparse points that cannot be opened, not because they're aren't supported but because they cannot be interpreted by CreateFile. E.g. app execution aliases aren't actually links in the filesystem sense and can only be interpreted by CreateProcess. We should still be able to rename them though.
| }; | ||
|
|
||
| unsafe { | ||
| dealloc(file_rename_info.cast::<u8>(), layout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now fairly distant from the alloc. While nothing returns or panics before this atm, I wouldn't be surprised if some refactoring in the future missed this subtlety. So I would be happier with a drop guard or some other custom struct that is guaranteed to free memory wherever the scope ends.
|
@Fulgen301 any updates on answering the initial review? thanks |
Might close bazelbuild#17192 Problem: Bazel fails completely on Windows when using filesystems that don't support junction/reparse point operations (e.g., virtiofs, VirtualBox shared folders, network drives, RAM disks). Two failure modes occur: 1. `ReadSymlinkOrJunction` fails during path resolution with: "Cannot read link: DeviceIoControl: Incorrect function" This causes build analysis to abort completely. 2. `CreateJunction` fails when creating convenience symlinks with: "Cannot create junction: DeviceIoControl: Incorrect function" This produces cryptic error messages. Both fail because `DeviceIoControl` returns `ERROR_INVALID_FUNCTION` when the filesystem doesn't implement `FSCTL_GET_REPARSE_POINT` or `FSCTL_SET_REPARSE_POINT` operations. Proposed slution: Handle `ERROR_INVALID_FUNCTION` gracefully by treating it as a "not supported" condition rather than a fatal error: 1. in `ReadSymlinkOrJunction` (`file.cc`:592): return `kNotALink` instead of `kError` when `ERROR_INVALID_FUNCTION` occurs. This allows path resolution to continue for non-symlink paths on unsupported filesystems. 2. in `CreateJunction` (`file.cc`:461): return new `kNotSupported` result code when `ERROR_INVALID_FUNCTION` occurs. This produces clear "filesystem does not support junctions" warnings instead of cryptic errors, and doesn't abort the build. This follows the try-first, fallback-on-error pattern (EAFP) used by other major projects when handling unsupported filesystem operations. External references: - Rust (rust-lang/rust#138133): checks `ERROR_INVALID_FUNCTION`, `ERROR_NOT_SUPPORTED`, and `ERROR_INVALID_PARAMETER` for filesystem operation fallbacks in `std::fs::rename`. - Microsoft STL (microsoft/STL#2077): handles junctions and reparse point errors including `ERROR_INVALID_PARAMETER` with robust fallback logic in `filesystem.cpp`. - Go (golang/go#20506): uses fallback strategies when symlink APIs are unavailable on different Windows versions. - WinFsp (winfsp/winfsp#88): documents that `ERROR_INVALID_FUNCTION` indicates `STATUS_NOT_IMPLEMENTED` for unsupported operations. - Microsoft Learn: recommends checking `FILE_SUPPORTS_REPARSE_POINTS` flag via `GetVolumeInformation`, but try-catch approach is simpler and handles edge cases where detection succeeds but operations fail. Impact: - enables Bazel to work on virtiofs, VirtualBox shared folders, RAM disks, and other filesystems that don't support Windows junction operations - transforms these environments from completely broken to usable - convenience symlinks (bazel-bin, bazel-out, etc.) won't be created on these filesystems - users must navigate to actual output paths - builds proceed with clear warnings instead of aborting with errors Limitations: This is a best-effort workaround that trades convenience for functionality. Users lose workspace convenience symlinks but gain the ability to build. Full junction support would require filesystem- level changes (e.g., virtiofs driver improvements). Testing: On Windows 11 VM with virtiofs-mounted folder: builds complete with expected warnings about unsupported junctions.
Fixes bazelbuild#17192 **Problem:** Bazel fails completely on Windows when using filesystems that don't support junction/reparse point operations (e.g., virtiofs, VirtualBox shared folders, network drives, RAM disks). The fatal error occurs when `ReadSymlinkOrJunction` fails during path resolution (e.g., when Starlark code calls `.realpath`): "Cannot read link: DeviceIoControl: Incorrect function" This causes build analysis to abort completely. Additionally, `CreateJunction` failures when creating convenience symlinks produce cryptic error messages, though these were already non-fatal warnings. Both fail because `DeviceIoControl` returns `ERROR_INVALID_FUNCTION` when the filesystem doesn't implement `FSCTL_GET_REPARSE_POINT` or `FSCTL_SET_REPARSE_POINT` operations. **Solution:** Handle `ERROR_INVALID_FUNCTION` gracefully by treating it as a "not supported" condition rather than a fatal error: 1. in `ReadSymlinkOrJunction` (`file.cc`:592): return `kNotALink` instead of `kError` when `ERROR_INVALID_FUNCTION` occurs. This allows path resolution to continue for non-symlink paths on unsupported filesystems. **This is the critical fix that enables builds to succeed.** 2. in `CreateJunction` (`file.cc`:461): return new `kNotSupported` result code when `ERROR_INVALID_FUNCTION` occurs. This produces clearer "filesystem does not support junctions" warnings instead of cryptic "Incorrect function" messages. This improves UX but doesn't change behavior (these failures were already non-fatal). This follows the try-first, fallback-on-error pattern (EAFP) used by other major projects when handling unsupported filesystem operations. **References to similar solutions in OSS projects:** - Rust (rust-lang/rust#138133): checks `ERROR_INVALID_FUNCTION`, `ERROR_NOT_SUPPORTED`, and `ERROR_INVALID_PARAMETER` for filesystem operation fallbacks in `std::fs::rename`. - Rust (rust-lang/rust#43801): reduced symlink usage in build system after encountering os error 1 on VirtualBox shared folders. - Microsoft STL (microsoft/STL#2077): handles junctions and reparse point errors including `ERROR_INVALID_PARAMETER` with robust fallback logic in `filesystem.cpp`. - Go (golang/go#20506): uses fallback strategies when symlink APIs are unavailable on different Windows versions. - WinFsp (winfsp/winfsp#88): documents that `ERROR_INVALID_FUNCTION` indicates `STATUS_NOT_IMPLEMENTED` for unsupported operations. - Microsoft Learn: recommends checking `FILE_SUPPORTS_REPARSE_POINTS` flag via `GetVolumeInformation`, but try-catch approach is simpler and handles edge cases where detection succeeds but operations fail. **Impact:** - enables Bazel builds on virtiofs, VirtualBox shared folders, RAM disks, and other filesystems that don't support Windows junction operations - transforms these environments from completely broken to usable - convenience symlinks (bazel-bin, bazel-out, etc.) still won't be created, but now with clearer error messages - users must navigate to actual output paths in the output_base **Limitations:** This is a best-effort workaround that trades convenience for functionality. Users lose workspace convenience symlinks but gain the ability to build. Full junction support would require filesystem- level changes (e.g., virtiofs driver improvements). **Testing:** Tested on Windows 11 VM with host directory mounted via virtiofs, with [rules_pkg](https://github.com/bazelbuild/rules_pkg/blob/6cdaba69ee76463b2b8e97e8d243dbb6115c3aee/toolchains/git/git_configure.bzl#L40). Before change: build analysis aborted with "Cannot read link" fatal error. After change: builds complete successfully with clearer warnings about unsupported junctions for convenience symlinks.
Might close bazelbuild#17192. **Problem:** Bazel fails completely on Windows when using filesystems that don't support junction/reparse point operations (e.g., virtiofs, VirtualBox shared folders, network drives, RAM disks). The fatal error occurs when `ReadSymlinkOrJunction` fails during path resolution (e.g., when Starlark code calls `.realpath`): "Cannot read link: DeviceIoControl: Incorrect function" This causes build analysis to abort completely. Additionally, `CreateJunction` failures when creating convenience symlinks produce cryptic error messages, though these were already non-fatal warnings. Both fail because `DeviceIoControl` returns `ERROR_INVALID_FUNCTION` when the filesystem doesn't implement `FSCTL_GET_REPARSE_POINT` or `FSCTL_SET_REPARSE_POINT` operations. **Solution:** Handle `ERROR_INVALID_FUNCTION` gracefully by treating it as a "not supported" condition rather than a fatal error: 1. in `ReadSymlinkOrJunction` (`file.cc`:592): return `kNotALink` instead of `kError` when `ERROR_INVALID_FUNCTION` occurs. This allows path resolution to continue for non-symlink paths on unsupported filesystems. 2. in `CreateJunction` (`file.cc`:461): return new `kNotSupported` result code when `ERROR_INVALID_FUNCTION` occurs. This produces clearer "filesystem does not support junctions" warnings instead of cryptic "Incorrect function" messages. This improves UX but doesn't change behavior (these failures were already non-fatal). This follows the try-first, fallback-on-error pattern (EAFP) used by other major projects when handling unsupported filesystem operations. **Prior art:** - Rust (rust-lang/rust#138133): checks `ERROR_INVALID_FUNCTION`, `ERROR_NOT_SUPPORTED`, and `ERROR_INVALID_PARAMETER` for filesystem operation fallbacks in `std::fs::rename`. - Rust (rust-lang/rust#43801): reduced symlink usage in build system after encountering os error 1 on VirtualBox shared folders. - Microsoft STL (microsoft/STL#2077): handles junctions and reparse point errors including `ERROR_INVALID_PARAMETER` with robust fallback logic in `filesystem.cpp`. - Go (golang/go#20506): uses fallback strategies when symlink APIs are unavailable on different Windows versions. - WinFsp (winfsp/winfsp#88): documents that `ERROR_INVALID_FUNCTION` indicates `STATUS_NOT_IMPLEMENTED` for unsupported operations. - Microsoft Learn: recommends checking `FILE_SUPPORTS_REPARSE_POINTS` flag via `GetVolumeInformation`, but try-catch approach is simpler and handles edge cases where detection succeeds but operations fail. **Impact:** - enables Bazel builds on virtiofs, VirtualBox shared folders, RAM disks, and other filesystems that don't support Windows junction operations - convenience symlinks (bazel-bin, bazel-out, etc.) still won't be created, but now with clearer error messages **Limitations:** Full junction support would require filesystem-level changes (e.g., virtiofs driver improvements). **Testing:** Tested on Windows 11 VM with host directory mounted via virtiofs, with [rules_pkg](https://github.com/bazelbuild/rules_pkg/blob/6cdaba69ee76463b2b8e97e8d243dbb6115c3aee/toolchains/git/git_configure.bzl#L40). Before change: build analysis aborted with "Cannot read link" fatal error. After change: builds complete successfully with clearer warnings about unsupported junctions for convenience symlinks.
**Problem:** Bazel fails completely on Windows when using filesystems that don't support junction/reparse point operations (e.g., virtiofs, VirtualBox shared folders, network drives, RAM disks). The fatal error occurs when `ReadSymlinkOrJunction` fails during path resolution (e.g., when Starlark code calls `.realpath`): "Cannot read link: DeviceIoControl: Incorrect function" This causes build analysis to abort completely. Additionally, `CreateJunction` failures when creating convenience symlinks produce cryptic error messages, though these were already non-fatal warnings. Both fail because `DeviceIoControl` returns `ERROR_INVALID_FUNCTION` when the filesystem doesn't implement `FSCTL_GET_REPARSE_POINT` or `FSCTL_SET_REPARSE_POINT` operations. **Solution:** Handle `ERROR_INVALID_FUNCTION` gracefully by treating it as a "not supported" condition rather than a fatal error: 1. in `ReadSymlinkOrJunction` (`file.cc`:592): return `kNotALink` instead of `kError` when `ERROR_INVALID_FUNCTION` occurs. This allows path resolution to continue for non-symlink paths on unsupported filesystems. 2. in `CreateJunction` (`file.cc`:461): return new `kNotSupported` result code when `ERROR_INVALID_FUNCTION` occurs. This produces clearer "filesystem does not support junctions" warnings instead of cryptic "Incorrect function" messages. This improves UX but doesn't change behavior (these failures were already non-fatal). This follows the try-first, fallback-on-error pattern (EAFP) used by other major projects when handling unsupported filesystem operations. **Prior art:** - Rust (rust-lang/rust#138133): checks `ERROR_INVALID_FUNCTION`, `ERROR_NOT_SUPPORTED`, and `ERROR_INVALID_PARAMETER` for filesystem operation fallbacks in `std::fs::rename`. - Rust (rust-lang/rust#43801): reduced symlink usage in build system after encountering os error 1 on VirtualBox shared folders. - Microsoft STL (microsoft/STL#2077): handles junctions and reparse point errors including `ERROR_INVALID_PARAMETER` with robust fallback logic in `filesystem.cpp`. - Go (golang/go#20506): uses fallback strategies when symlink APIs are unavailable on different Windows versions. - WinFsp (winfsp/winfsp#88): documents that `ERROR_INVALID_FUNCTION` indicates `STATUS_NOT_IMPLEMENTED` for unsupported operations. - Microsoft Learn: recommends checking `FILE_SUPPORTS_REPARSE_POINTS` flag via `GetVolumeInformation`, but try-catch approach is simpler and handles edge cases where detection succeeds but operations fail. **Impact:** - enables Bazel builds on virtiofs, VirtualBox shared folders, RAM disks, and other filesystems that don't support Windows junction operations - convenience symlinks (bazel-bin, bazel-out, etc.) still won't be created, but now with clearer error messages **Limitations:** Full junction support would require filesystem-level changes (e.g., virtiofs driver improvements). **Testing:** Tested on Windows 11 VM with host directory mounted via virtiofs, with [rules_pkg](https://github.com/bazelbuild/rules_pkg/blob/6cdaba69ee76463b2b8e97e8d243dbb6115c3aee/toolchains/git/git_configure.bzl#L40). Before change: build analysis aborted with "Cannot read link" fatal error. After change: builds complete successfully with clearer warnings about unsupported junctions for convenience symlinks.
**Problem:** Bazel fails completely on Windows when using filesystems that don't support junction/reparse point operations (e.g., virtiofs, VirtualBox shared folders, network drives, RAM disks). The fatal error occurs when `ReadSymlinkOrJunction` fails during path resolution (e.g., when Starlark code calls `.realpath`): "Cannot read link: DeviceIoControl: Incorrect function" This causes build analysis to abort completely. Additionally, `CreateJunction` failures when creating convenience symlinks produce cryptic error messages, though these were already non-fatal warnings. Both fail because `DeviceIoControl` returns `ERROR_INVALID_FUNCTION` when the filesystem doesn't implement `FSCTL_GET_REPARSE_POINT` or `FSCTL_SET_REPARSE_POINT` operations. **Solution:** Handle `ERROR_INVALID_FUNCTION` gracefully by treating it as a "not supported" condition rather than a fatal error: 1. in `ReadSymlinkOrJunction` (`file.cc`:592): return `kNotALink` instead of `kError` when `ERROR_INVALID_FUNCTION` occurs. This allows path resolution to continue for non-symlink paths on unsupported filesystems. 2. in `CreateJunction` (`file.cc`:461): return new `kNotSupported` result code when `ERROR_INVALID_FUNCTION` occurs. This produces clearer "filesystem does not support junctions" warnings instead of cryptic "Incorrect function" messages. This improves UX but doesn't change behavior (these failures were already non-fatal). This follows the try-first, fallback-on-error pattern (EAFP) used by other major projects when handling unsupported filesystem operations. **Prior art:** - Rust (rust-lang/rust#138133): checks `ERROR_INVALID_FUNCTION`, `ERROR_NOT_SUPPORTED`, and `ERROR_INVALID_PARAMETER` for filesystem operation fallbacks in `std::fs::rename`. - Microsoft STL (microsoft/STL#2077): handles junctions and reparse point errors including `ERROR_INVALID_PARAMETER` with robust fallback logic in `filesystem.cpp`. - Go (golang/go#20506): uses fallback strategies when symlink APIs are unavailable on different Windows versions. - WinFsp (winfsp/winfsp#88): documents that `ERROR_INVALID_FUNCTION` indicates `STATUS_NOT_IMPLEMENTED` for unsupported operations. - Microsoft Learn: recommends checking `FILE_SUPPORTS_REPARSE_POINTS` flag via `GetVolumeInformation`, but try-catch approach is simpler and handles edge cases where detection succeeds but operations fail. **Impact:** - enables Bazel builds on virtiofs, VirtualBox shared folders, RAM disks, and other filesystems that don't support Windows junction operations. - convenience symlinks (bazel-bin, bazel-out, etc.) still won't be created, but now with clearer error messages. **Limitations:** Full junction support would require filesystem-level changes (e.g., virtiofs driver improvements). **Testing:** Tested on Windows 11 VM with host directory mounted via virtiofs, with [rules_pkg](https://github.com/bazelbuild/rules_pkg/blob/6cdaba69ee76463b2b8e97e8d243dbb6115c3aee/toolchains/git/git_configure.bzl#L40). Before change: build analysis aborted with "Cannot read link" fatal error. After change: builds complete successfully with clearer warnings about unsupported junctions for convenience symlinks.
We previously attempted the atomic rename first, and fell back to trying a non-atomic rename if it wasn't supported. However, this does not work reliably on Windows Server - NTFS partitions created in Windows Server 2016 apparently need to be upgraded, and it outright fails with
ERROR_NOT_SUPPORTEDon ReFS on Windows Server 2022.This commit switches the two calls so that FileRenameInfo is tried first, and an atomic rename via FileRenameInfoEx is only attempted if the non-atomic rename fails.
Fix #137499.
Fix #137971.
This PR works similarly to #137528, but is a cleaner fix in my opinion and avoids additional syscalls to reopen the file if
MoveFileExfails.