@casteryh
Contributor

@casteryh casteryh commented Nov 5, 2025

Summary

  • Added three new metrics to track the policy age of episodes that are actually sampled from the replay buffer:
      • buffer/sample/avg_sampled_policy_age: Average age of sampled episodes
      • buffer/sample/max_sampled_policy_age: Maximum age of sampled episodes
      • buffer/sample/min_sampled_policy_age: Minimum age of sampled episodes

Motivation

These metrics are distinct from the existing buffer/evict/avg_policy_age metric, which tracks the age of all episodes remaining in the buffer after eviction. The new metrics show whether training is using fresh data (low ages) or stale data (high ages) at sampling time.
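As a rough illustration of the distinction (a minimal sketch, not the code in this PR): it assumes policy age means the current policy version minus the version that generated the episode. The `Episode` class, the `policy_version` field, and the `policy_age_stats` helper are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Episode:
    # Hypothetical stand-in for a buffered episode; only the field needed here.
    policy_version: int


def policy_age_stats(episodes, curr_policy_version):
    """Return (avg, max, min) policy age for the given episodes, or None if empty."""
    # Assumption: age = current policy version - version that produced the episode.
    ages = [curr_policy_version - ep.policy_version for ep in episodes]
    if not ages:
        return None
    return mean(ages), max(ages), min(ages)


curr_version = 10
buffer = [Episode(policy_version=v) for v in (3, 7, 9, 10)]

# Pretend the sampler picked only the two freshest episodes.
sampled = buffer[-2:]

# Age of everything still in the buffer (what an eviction-time metric would see).
buffer_avg, _, _ = policy_age_stats(buffer, curr_version)

# Age of what training actually consumes (what the new sample-time metrics report).
avg_age, max_age, min_age = policy_age_stats(sampled, curr_version)
sample_metrics = {
    "buffer/sample/avg_sampled_policy_age": avg_age,
    "buffer/sample/max_sampled_policy_age": max_age,
    "buffer/sample/min_sampled_policy_age": min_age,
}
```

In this toy case the buffer-wide average age is higher than the sampled average, which is the kind of gap the sample-time metrics are meant to surface.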

Test plan

  • Ran existing unit tests: python -m pytest tests/unit_tests/test_replay_buffer.py -v
  • All 8 tests passed

WARNING: I haven't actually run this end to end, since I don't want to kill the job that's been running for 2 days on my devgpu.

@meta-cla meta-cla bot added the CLA Signed label Nov 5, 2025
@casteryh
Contributor Author

casteryh commented Nov 5, 2025

cc @wukaixingxp this might help with monitoring your training

@felipemello1
Contributor

Hey @casteryh, let's wait until you run it to merge? Also, should we then delete the other metric? It seems that maybe it's not as useful?

@casteryh
Contributor Author

casteryh commented Nov 5, 2025

> Hey @casteryh, let's wait until you run it to merge?

There is no rush to merge it. I just want to make this available so @wukaixingxp can cherry-pick it and debug his training jobs.

> Also, should we then delete the other metric? It seems that maybe it's not as useful?

Yeah, that makes sense to me. What do you think, @joecummings?
