Skip to content

Conversation

@ada-ggf25
Copy link

@ada-ggf25 ada-ggf25 commented Dec 1, 2025

Fixes #7877 and follows up on feedback from #7890.

Context

In #7890 it was introduced an automatic creation of the TMPDIR directory in _TempCacheDir so that datasets does not silently fall back to /tmp when TMPDIR points to a non‑existent path. This addressed the original “tempfile silently ignores TMPDIR” issue in #7877.

During review, it was pointed out by @stas00 (see review comment) that if the code treats TMPDIR as part of the public API, then failures to use it should fail loudly, not just emit warnings, because warnings are easy to miss in complex multi‑GPU setups.

This PR implements that stricter behaviour.

What this PR changes

_TempCacheDir behaviour In src/datasets/fingerprint.py:

  • Continue to:

    • Detect TMPDIR from the environment
    • Normalise the path
    • Auto‑create the directory when it does not exist
    • Pass the (validated) directory explicitly to tempfile.mkdtemp(...) so TMPDIR is honoured even if tempfile.gettempdir() was already cached
  • New behaviour (in response to review on #7890):

    • If TMPDIR is set, but the directory cannot be created, we now re‑raise an OSError with a clear, actionable message:
      • “TMPDIR is set to '…' but the directory does not exist and could not be created: … Please create it manually or unset TMPDIR to fall back to the default temporary directory.”
    • If TMPDIR is set but points to something that is not a directory, we also raise OSError with guidance:
      • “TMPDIR is set to '…' but it is not a directory. Please point TMPDIR to a writable directory or unset it to fall back to the default temporary directory.”
    • When TMPDIR is not set, behaviour is unchanged: we pass dir=None and let tempfile use the system default temp directory.

This aligns with the @stas00’s suggestion that TMPDIR should be treated as a strict API contract: if the user chooses a TMPDIR, we either use it or clearly fail, rather than silently falling back.

Tests

In tests/test_fingerprint.py:

  • Updated tests that previously expected warning‑and‑fallback to now expect hard failures:

    • test_temp_cache_dir_tmpdir_creation_failure
      • Uses unittest.mock.patch to force os.makedirs to raise OSError("Permission denied")
      • Asserts that constructing _TempCacheDir() raises OSError and that the message contains both “TMPDIR is set to” and “could not be created”
    • test_temp_cache_dir_tmpdir_not_directory
      • Points TMPDIR to a regular file and asserts that _TempCacheDir() raises OSError with a message mentioning “is not a directory”
  • Left the positive‑path tests in place:

    • test_temp_cache_dir_with_tmpdir_nonexistent – verifies that a non‑existent TMPDIR is created and used
    • test_temp_cache_dir_with_tmpdir_existing – verifies that an existing TMPDIR directory is used as the base for the temp cache dir
    • test_temp_cache_dir_without_tmpdir – verifies behaviour when TMPDIR is not set (default temp directory)
  • Kept the earlier fix to test_fingerprint_in_multiprocessing, which now uses Pool.map and asserts that fingerprints are stable across processes.

Rationale

  • Treating TMPDIR as part of the API for cache placement means:
    • Users can rely on it to move large temporary Arrow files away from small /tmp partitions.
    • Misconfigured TMPDIR should be immediately visible as a hard error, not as a warning lost among many logs.
  • The stricter failure mode matches the concern on #7890 that “warnings are very easy to miss in complex applications where there are already dozens of warnings multiplied by multiple GPU processes”.

Testing

  • pytest tests/test_fingerprint.py
  • make style
  • No new linter issues introduced.

…mpCacheDir

Enhanced the _TempCacheDir.__init__ method to properly respect and handle
the TMPDIR environment variable when creating temporary cache directories.

Changes:
- Add TMPDIR environment variable detection and validation
- Normalise paths to handle path resolution issues
- Auto-create TMPDIR directory if it doesn't exist to prevent silent
  fallback to default temporary directory
- Validate that TMPDIR is actually a directory before use
- Explicitly pass directory to mkdtemp to ensure TMPDIR is respected even
  if tempfile.gettempdir() was already called and cached
- Add appropriate logging for directory creation and fallback scenarios

This ensures that when TMPDIR is set, the temporary cache files are
created in the specified directory rather than silently falling back to
the system default temporary directory.
…mpCacheDir

Add test coverage for the improved TMPDIR environment variable handling
in the _TempCacheDir class. These tests verify the various scenarios
for TMPDIR usage and error handling.

Changes:
- Refactor test_fingerprint_in_multiprocessing to use Pool.map for
  cleaner test implementation
- Add test_temp_cache_dir_with_tmpdir_nonexistent to verify TMPDIR
  auto-creation when directory doesn't exist
- Add test_temp_cache_dir_with_tmpdir_existing to verify correct
  behaviour when TMPDIR exists and is valid
- Add test_temp_cache_dir_without_tmpdir to verify fallback to default
  temporary directory when TMPDIR is not set
- Add test_temp_cache_dir_tmpdir_creation_failure to verify graceful
  error handling and fallback when TMPDIR creation fails

These tests ensure that the TMPDIR improvements work correctly across
all scenarios and edge cases, including proper logging and fallback
behaviour.
Refine TMPDIR-related failure tests for _TempCacheDir to assert explicit
error conditions instead of fallback behaviour.

Changes:
- Update test_temp_cache_dir_tmpdir_creation_failure to use _TempCacheDir
  directly and assert that an OSError is raised with a clear TMPDIR
  context when directory creation fails
- Introduce test_temp_cache_dir_tmpdir_not_directory to verify that
  pointing TMPDIR at a non-directory raises an OSError with an
  informative error message

These tests better match the intended contract of _TempCacheDir by
ensuring invalid TMPDIR configurations fail loudly with descriptive
messages rather than silently falling back.
…loudly

Tighten TMPDIR handling in _TempCacheDir so that invalid configurations
raise clear errors instead of silently falling back to the default
temporary directory.

Changes:
- When TMPDIR points to a non-existent directory, raise an OSError with
  explicit guidance to create it manually or unset TMPDIR
- When TMPDIR points to a non-directory path, raise an OSError with
  guidance to point TMPDIR to a writable directory or unset it
- Remove previous warning-and-fallback behaviour to avoid masking
  configuration issues

This ensures that TMPDIR misconfigurations are surfaced early and
clearly, aligning runtime behaviour with the stricter expectations
codified in the new tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

work around tempfile silently ignoring TMPDIR if the dir doesn't exist

1 participant