@listar2000
Contributor

This is a tentative fix for issue #291: the weight sync between the rollout engine and the trainer engine is skipped entirely when the config sets actor_rollout_ref.rollout.free_cache_engine=False (an obvious issue that verl has not fixed yet either).

Fortunately, this seems fixable without patching verl, because rLLM implements its own rollout trajectory generation logic. The rollout_engine.wake_up()/sleep() methods should be called regardless of the free_cache_engine flag, since the flag is checked again inside the wake_up/sleep calls. For instance, the release of the KV cache in the inference engine performs its own check here.
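To illustrate the idea, here is a minimal toy sketch, not the actual rLLM or verl code. `ToyRolloutEngine`, `generate_guarded`, and `generate_unconditional` are hypothetical names, and the sketch assumes that the weight sync happens inside `wake_up()` while only the KV cache handling depends on `free_cache_engine`.

```python
class ToyRolloutEngine:
    """Hypothetical stand-in for a rollout engine; not the verl API."""

    def __init__(self, free_cache_engine: bool):
        self.free_cache_engine = free_cache_engine
        self.weights_version = 0        # version of the weights the engine holds
        self.kv_cache_allocated = True  # whether the KV cache is resident

    def wake_up(self, trainer_version: int) -> None:
        # Assumption: weights are synced from the trainer on every wake-up.
        self.weights_version = trainer_version
        # The flag is checked again here, so only the cache step is conditional.
        if self.free_cache_engine:
            self.kv_cache_allocated = True

    def sleep(self) -> None:
        # The KV cache is released only when the flag asks for it.
        if self.free_cache_engine:
            self.kv_cache_allocated = False


def generate_guarded(engine: ToyRolloutEngine, trainer_version: int) -> int:
    """Caller guards wake_up/sleep on the flag (the problematic pattern)."""
    if engine.free_cache_engine:
        engine.wake_up(trainer_version)
    used_version = engine.weights_version  # weights used for this rollout
    if engine.free_cache_engine:
        engine.sleep()
    return used_version


def generate_unconditional(engine: ToyRolloutEngine, trainer_version: int) -> int:
    """Caller always calls wake_up/sleep and lets the engine check the flag."""
    engine.wake_up(trainer_version)
    used_version = engine.weights_version
    engine.sleep()
    return used_version
```

In this toy model, with `free_cache_engine=False` the guarded caller keeps generating with stale weights, while the unconditional caller picks up the trainer's weights and still leaves the KV cache allocated.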

This PR should not affect the usual case of free_cache_engine=True at all. Experiments are ongoing to check for side effects when free_cache_engine=False, so this PR is still a draft.

