(bug): S3 Cache Key Ignores max_file_size Parameter, Leading to Stale Results #531

@tony-liux

Description

Which interface did you use?

Web UI

Repository URL (if public)

https://gitingest.com/man-group/dtale/tree/master/dtale/dash_application

Git host

GitHub (github.com)

Other Git host

No response

Repository visibility

public

Commit, branch, or tag

default branch

Did you ingest the full repository or a subdirectory?

subdirectory

Operating system

Windows

Browser (Web UI only)

Chrome

Other browser

No response

Gitingest version

No response

Python version

No response

Bug description

The S3 caching mechanism does not include the max_file_size parameter in its cache key generation.

This causes a significant issue: if a user performs an ingest with one file size limit, and then performs a second ingest on the same repository with a different file size limit, the system incorrectly returns the cached result from the first request.

The user sees stale data and cannot get an updated digest that reflects the new settings, which is confusing. The root cause appears to be in src/server/s3_utils.py, where the hash for the cache key (s3_file_path) is generated without considering the file size limit.
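The collision described above can be illustrated with a minimal sketch. It assumes the key is a hash over the repository location only. The function `cache_key_without_size`, its parameters, and the request dictionaries are hypothetical and do not reproduce the actual code in s3_utils.py.

```python
import hashlib


def cache_key_without_size(owner: str, repo: str, subpath: str) -> str:
    """Hypothetical key builder that hashes only the repository location."""
    raw = f"{owner}/{repo}|{subpath}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# Two requests that differ only in their file size limit (values in kB).
first_request = {
    "owner": "man-group",
    "repo": "dtale",
    "subpath": "dtale/dash_application",
    "max_file_size_kb": 50,
}
second_request = {**first_request, "max_file_size_kb": 1024}

key_a = cache_key_without_size(
    first_request["owner"], first_request["repo"], first_request["subpath"]
)
key_b = cache_key_without_size(
    second_request["owner"], second_request["repo"], second_request["subpath"]
)

# The limit never reaches the hash, so both requests map to the same object.
print(key_a == key_b)  # True
```

Because both requests resolve to the same key, the second one would be served the digest stored by the first.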

Steps to reproduce

Navigate to the Gitingest Web UI using this URL: https://gitingest.com/man-group/dtale/tree/master/dtale/dash_application

Observe that the "Include files under" slider is at its default position, corresponding to a low value like 50kB.

Click the "Ingest" button and wait for the results to load.

In the "Directory Structure" output, notice that the file dtale/dash_application/layout/layout.py (which is 104kB) is correctly excluded because it is larger than the 50kB limit.

Now, move the "Include files under" slider all the way to the right, to a high value (e.g., 1MB).

Click the "Ingest" button again.

Expected behavior

After the second ingest with the increased file size limit, the system should re-process the repository. The new "Directory Structure" should now include the file dtale/dash_application/layout/layout.py, as its size (104kB) is well within the new 1MB limit.

Actual behavior

The results from the second ingest are identical to the first. The "Directory Structure" still excludes dtale/dash_application/layout/layout.py, incorrectly ignoring the user's updated file size setting. The application returns a stale, cached result instead of processing the request with the new parameters.

Additional context, logs, or screenshots

This bug can be fixed by updating the generate_s3_file_path function in src/server/s3_utils.py to include the max_file_size (in KB) as part of the string that is hashed to generate the cache key.
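A rough sketch of the proposed fix follows, assuming the limit is available as an integer number of kB at the point where the key is built. The function `cache_key_with_size` and its signature are hypothetical; the real generate_s3_file_path may take different arguments and hash additional fields.

```python
import hashlib


def cache_key_with_size(owner: str, repo: str, subpath: str, max_file_size_kb: int) -> str:
    """Hypothetical key builder that folds the file size limit into the hash."""
    # Cast to int so that equal limits always produce the same string.
    raw = f"{owner}/{repo}|{subpath}|max_file_size_kb={int(max_file_size_kb)}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


small_limit_key = cache_key_with_size("man-group", "dtale", "dtale/dash_application", 50)
large_limit_key = cache_key_with_size("man-group", "dtale", "dtale/dash_application", 1024)
repeat_key = cache_key_with_size("man-group", "dtale", "dtale/dash_application", 50)

print(small_limit_key != large_limit_key)  # True: different limits, different cache entries
print(small_limit_key == repeat_key)       # True: same limit still hits the cache
```

Labeling the field inside the hashed string (`max_file_size_kb=`) is one way to keep the value from being confused with other numeric parameters that might be added to the key later.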

Metadata

Labels

bug, stale
