Conversation

Copilot AI commented Oct 4, 2025

What does this PR do?

Removes an outdated TODO comment from src/lightning/pytorch/loops/fit_loop.py that suggested moving the max_steps check inside the training loop.

Why is this change needed?

The TODO comment on line 179 suggested:

# TODO: Move track steps inside training loop and move part of these condition inside training loop

However, the current implementation is correct and the TODO is unnecessary because:

  1. global_step tracks steps across epochs: The global_step counter is cumulative across all epochs, not per-epoch, so it makes sense to check it at the fit loop level rather than within the epoch loop.

  2. Fit loop is the right place for this check: The fit_loop.done property evaluates when to leave the entire fit process. Since max_steps determines when the entire training should stop (not just a single epoch), checking it here is the appropriate design.

  3. Moving it would add unnecessary complexity: Relocating this check to the epoch loop would require introducing additional flag variables to communicate back to the fit loop when max_steps is reached, making the code more complex without any benefit.

Testing

The existing test suite validates this behavior:

  • test_fit_loop_done_log_messages - Tests the done property including the max_steps check
  • test_min_max_steps_epochs - Tests max_steps behavior across 7 different scenarios

All tests pass successfully with no regressions.

Fixes #<issue_number>

cc @lantiga @justusschock

Original prompt

This section details the original issue you should resolve

<issue_title>Remove an unnecessary TODO in src/lightning/pytorch/loops/fit_loop.py</issue_title>
<issue_description>### Outline & Motivation

On line 178, there's a TODO item:

    # TODO: Move track steps inside training loop and move part of these condition inside training loop
    stop_steps = _is_max_limit_reached(self.epoch_loop.global_step, self.max_steps)
    if stop_steps:
        rank_zero_info(f"`Trainer.fit` stopped: `max_steps={self.max_steps!r}` reached.")
        return True

Since global_step accumulates across epochs, it's reasonable to perform this check here.
If we removed this check, we would need to add a flag variable, set to True when max_steps is hit in the epoch_loop, which wouldn't make things better.

Pitch

Remove this TODO to make code cleaner.

Additional context

No response

cc @lantiga @justusschock</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #20998


The TODO suggested moving the max_steps check inside the training loop, but the
check is correctly placed in fit_loop, since global_step tracks steps across
epochs and the fit_loop is responsible for deciding when to stop the entire fit
process. Moving it would require additional flag variables, making the code
more complex without any benefit.

Co-authored-by: Borda <6035284+Borda@users.noreply.github.com>
Copilot AI changed the title [WIP] Remove an unnecessary TODO in src/lightning/pytorch/loops/fit_loop.py Remove unnecessary TODO comment in fit_loop.py Oct 4, 2025
Copilot AI requested a review from Borda October 4, 2025 21:12
@github-actions github-actions bot added the pl Generic label for PyTorch Lightning package label Oct 4, 2025
@Borda Borda marked this pull request as ready for review November 11, 2025 09:45
@github-actions
Contributor

github-actions bot commented Nov 11, 2025

⚡ Required checks status: All passing 🟢

Groups summary

🟢 pytorch_lightning: Tests workflow
Check ID Status
pl-cpu-guardian success

These checks are required after the changes to src/lightning/pytorch/loops/fit_loop.py.

🟢 pytorch_lightning: lit GPU
Check ID Status
pytorch.yml / Lit Job (nvidia/cuda:12.1.1-devel-ubuntu22.04, pytorch, 3.10) success
pytorch.yml / Lit Job (lightning, 3.12) success
pytorch.yml / Lit Job (pytorch, 3.12) success

🟢 Benchmarks
Check ID Status
benchmark.yml / Lit Job (fabric) success
benchmark.yml / Lit Job (pytorch) success

🟢 pytorch_lightning: Docs
Check ID Status
docs-make (pytorch, doctest) success
docs-make (pytorch, html) success

🟢 mypy
Check ID Status
mypy success

🟢 install
Check ID Status
install-pkg-guardian success


Thank you for your contribution! 💜

Note
This comment is automatically generated and updates for 70 minutes every 180 seconds. If you have any other questions, contact carmocca for help.

@codecov

codecov bot commented Nov 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79%. Comparing base (b15d394) to head (3097d30).
✅ All tests successful. No failed tests found.

❗ There is a different number of reports uploaded between BASE (b15d394) and HEAD (3097d30).

HEAD has 336 uploads less than BASE
Flag BASE (b15d394) HEAD (3097d30)
cpu 105 27
python3.10 23 6
lightning_fabric 24 0
pytest 54 0
lightning 57 15
python3.12.7 36 9
python3.11 22 6
python3.12 12 3
python 12 3
pytorch2.2.2 5 3
pytest-full 51 27
pytorch2.1 11 6
pytorch_lightning 24 12
pytorch2.5.1 6 3
pytorch2.7 6 3
pytorch2.4.1 6 3
pytorch2.6 6 3
pytorch2.8 6 3
pytorch2.3 5 3
Additional details and impacted files
@@            Coverage Diff            @@
##           master   #21265     +/-   ##
=========================================
- Coverage      87%      79%     -8%     
=========================================
  Files         269      266      -3     
  Lines       23744    23689     -55     
=========================================
- Hits        20572    18671   -1901     
- Misses       3172     5018   +1846     


Labels

pl Generic label for PyTorch Lightning package

Development

Successfully merging this pull request may close these issues.

Remove an unnecessary TODO in src/lightning/pytorch/loops/fit_loop.py

1 participant