Guided Flow Policy:
Learning from High-Value Actions in Offline Reinforcement Learning

ArXiv HAL Webpage

Guided Flow Policy (GFP) is an offline RL method based on flow matching. It couples a multi-step flow-matching policy, trained with value-aware behavior cloning, with a distilled one-step actor through a bidirectional guidance mechanism. This enables GFP to achieve state-of-the-art performance across 144 state-based and pixel-based tasks from the OGBench, Minari, and D4RL benchmarks, with substantial gains on suboptimal datasets and challenging tasks.
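
Until the official code is released, the sketch below may help readers unfamiliar with flow matching. It is not the GFP implementation. It shows generic flow matching on a straight noise-to-action path, a per-sample weighted regression loss, and multi-step Euler sampling. All function names are hypothetical. The `weights` argument is only an assumed stand-in for "value-aware" weighting, and the exact form GFP uses is not specified here.

```python
# Illustrative sketch only: generic flow matching, NOT the authors' GFP code.
# All names below are hypothetical.

def interpolate(noise, action, t):
    """Point on the straight path from noise (t=0) to action (t=1)."""
    return [(1.0 - t) * n + t * a for n, a in zip(noise, action)]


def target_velocity(noise, action):
    """Velocity of the straight path; it is constant in t."""
    return [a - n for n, a in zip(noise, action)]


def weighted_fm_loss(pred_velocities, noises, actions, weights):
    """Weighted mean of squared velocity errors over a batch.

    `weights` is an assumption: larger weights would emphasise
    dataset actions judged to have higher value.
    """
    total = 0.0
    for v_pred, n, a, w in zip(pred_velocities, noises, actions, weights):
        v_tgt = target_velocity(n, a)
        total += w * sum((p - q) ** 2 for p, q in zip(v_pred, v_tgt))
    return total / sum(weights)


def euler_sample(velocity_fn, noise, n_steps=10):
    """Multi-step sampling: integrate dx/dt = v(x, t) from t=0 to t=1."""
    x = list(noise)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        v = velocity_fn(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In this picture, a one-step actor would replace the `euler_sample` loop with a single network evaluation. How GFP distills that actor and how the two policies guide each other is described in the paper.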

News & Updates

🟢 2025-12-03 - Release of the paper on ArXiv
🔴 Code, coming soon
🔴 Detailed blog post, coming soon

Repository: Simple-Robotics/guided-flow-policy
