
Conversation

@xqqp (Contributor) commented Nov 16, 2025

Updating the max cost of the posting list cache is not wired up; it just calls an empty function. This PR adds a method to MemoryLayer that allows updating the max cost of its cache. The old function is deleted, and its call site is replaced with a call to the new method.
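A minimal sketch of what such a method could look like, assuming the posting list cache is a Ristretto cache held by MemoryLayer (the field and method names here are illustrative, not necessarily the ones used in this PR):

```go
// Illustrative sketch only; the real field and method names in the PR may differ.
package posting

import "github.com/dgraph-io/ristretto"

// MemoryLayer holds the posting list cache (simplified for illustration).
type MemoryLayer struct {
	cache *ristretto.Cache
}

// UpdateMaxCost updates the maximum cost (in bytes) of the posting list cache.
// It is a no-op when the cache is disabled (nil).
func (ml *MemoryLayer) UpdateMaxCost(maxCost int64) {
	if ml == nil || ml.cache == nil {
		return
	}
	ml.cache.UpdateMaxCost(maxCost)
}
```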

@xqqp force-pushed the fix_updating_max_cost branch from ac00282 to 9e1fba4 on November 21, 2025 12:11
@xqqp force-pushed the fix_updating_max_cost branch from 9e1fba4 to 7a51a94 on November 25, 2025 22:40
@matthewmcneely (Contributor) commented Dec 1, 2025

Oh wow, good catch. So the updateConfig in the GraphQL admin API was basically a no-op?

Edit: I see now that it was max cost that was not getting updated, not the other configs.

@matthewmcneely matthewmcneely merged commit 68c24d5 into dgraph-io:main Dec 1, 2025
10 of 11 checks passed
@xqqp xqqp deleted the fix_updating_max_cost branch December 1, 2025 20:08
@RJKeevil (Contributor) commented Dec 2, 2025

My app has a mode where it can do a full linear scan of all nodes to check consistency etc. I've noticed that even with this change, if I set the posting cache to anything greater than 0, my Alphas' memory usage goes from a steady 1-2G to 10G (the container limit) and then they restart. My cache is set to 4G, so I believe there is one more issue hiding here preventing the cache from being sized appropriately?

@matthewmcneely (Contributor)

even with this change

@RJKeevil Just to be clear, you're trying from the current head of the main branch, right? #9515 (merged into main a few days ago) comprises the actual fix.

@RJKeevil (Contributor) commented Dec 2, 2025

@matthewmcneely Yep, I'm running directly from main, hash 68c24d5. This isn't an immediate problem for me as I run with the posting cache off currently, but just a heads up that I think there's another leak in there somewhere in this cache.

The code that triggers it runs through all nodes of a type in ascending uid order, in batches of 250. The uid batches are passed to another routine that fetches all predicates for each uid, returning the complete node.
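Roughly, it looks like the sketch below (not my actual code; the type name Item is a placeholder, and in the real code the per-uid fetches run in a separate goroutine):

```go
// Simplified sketch of the scan pattern described above; "Item" is a
// placeholder type name and the per-uid fetch is inlined here.
package scan

import (
	"context"
	"encoding/json"
	"fmt"

	"github.com/dgraph-io/dgo/v230"
)

// ScanAll walks all nodes of type Item in ascending uid order, 250 at a time,
// and fetches every predicate of each node.
func ScanAll(ctx context.Context, dg *dgo.Dgraph) error {
	after := "0x0"
	for {
		q := fmt.Sprintf(`{
			batch(func: type(Item), first: 250, after: %s) { uid }
		}`, after)
		resp, err := dg.NewReadOnlyTxn().Query(ctx, q)
		if err != nil {
			return err
		}
		var out struct {
			Batch []struct {
				UID string `json:"uid"`
			} `json:"batch"`
		}
		if err := json.Unmarshal(resp.Json, &out); err != nil {
			return err
		}
		if len(out.Batch) == 0 {
			return nil // no more nodes of this type
		}
		for _, n := range out.Batch {
			// Fetch the complete node (all predicates) for this uid.
			nq := fmt.Sprintf(`{ node(func: uid(%s)) { expand(_all_) } }`, n.UID)
			if _, err := dg.NewReadOnlyTxn().Query(ctx, nq); err != nil {
				return err
			}
		}
		after = out.Batch[len(out.Batch)-1].UID
	}
}
```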

@matthewmcneely (Contributor)

I think there's another leak

Right, but as you probably know, the posting cache is only part of the memory that Dgraph allocates. I've run tests that show the fix in #9515 is making things better from that perspective. But it would be good to understand the details of your "full linear scan". Maybe open up a new topic thread in our new, spiffy Discussions?

@xqqp (Contributor, Author) commented Dec 2, 2025

@RJKeevil Can you collect and share a heap profile of an Alpha node while it has high memory consumption? Below is an example request for that.

curl http://<alpha_host>:8080/debug/pprof/heap > heap.out
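If you want to take a quick first look yourself, go tool pprof can summarize the largest allocators in that profile, e.g.:

go tool pprof -top heap.out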

Please also post the output of the command below.

dgraph version

Also, memory accounting on Linux is not straightforward; your container runtime might determine used memory differently from other tools. I suggest you experiment with different cache sizes, for example size-mb=2048.
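If it helps, the cache size is set through the --cache superflag on the Alpha (check dgraph alpha --help for the exact option names), roughly like:

dgraph alpha --cache "size-mb=2048"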
