feat: add threaded I/O pipeline for video processing #1997
Implements a pipeline with bounded queues to overlap decode, compute, and encode, reducing I/O stalls.
Description
Process a video using a threaded pipeline that asynchronously
reads frames, applies a callback to each, and writes the results
to an output file.
This function implements a three-stage pipeline designed to maximize
frame throughput.
Reader thread: reads frames from disk into a bounded queue ('read_q')
until full, then blocks. This ensures we never load more than 'prefetch'
frames into memory at once.
Main thread: dequeues frames, applies the 'callback(frame, idx)',
and enqueues the processed result into 'write_q'.
This is the compute stage. It runs only in the main thread (not in a worker thread),
so you can safely use any detectors, trackers, or other stateful objects
without synchronization issues.
Writer thread: dequeues frames and writes them to disk.
Both queues are bounded to enforce back-pressure: if the reader outpaces the compute stage, 'read_q' fills and the reader blocks; if the writer falls behind, 'write_q' fills and the main thread blocks. Memory use therefore stays bounded by the queue sizes.
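A minimal sketch of the structure described above (illustrative only: the sentinel handling, queue sizes, and OpenCV reader/writer setup here are assumptions, not the exact code in this PR):

```python
import queue
import threading

import cv2


def threaded_pipeline_sketch(source_path, target_path, callback, prefetch=8):
    # Illustrative three-stage pipeline: reader thread -> main thread -> writer thread.
    cap = cv2.VideoCapture(source_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (
        int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
    )
    out = cv2.VideoWriter(target_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    read_q = queue.Queue(maxsize=prefetch)   # reader blocks when full (back-pressure)
    write_q = queue.Queue(maxsize=prefetch)  # main thread blocks when full
    SENTINEL = object()                      # marks end of stream

    def reader():
        # Decode frames from disk; put() blocks once read_q holds 'prefetch' frames.
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            read_q.put(frame)
        read_q.put(SENTINEL)

    def writer():
        # Encode processed frames to disk as they arrive.
        while True:
            frame = write_q.get()
            if frame is SENTINEL:
                break
            out.write(frame)

    t_read = threading.Thread(target=reader, daemon=True)
    t_write = threading.Thread(target=writer, daemon=True)
    t_read.start()
    t_write.start()

    # Compute stage: runs only in the main thread, so stateful callbacks are safe.
    idx = 0
    while True:
        frame = read_q.get()
        if frame is SENTINEL:
            break
        write_q.put(callback(frame, idx))
        idx += 1

    write_q.put(SENTINEL)
    t_read.join()
    t_write.join()
    cap.release()
    out.release()
```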
Summary:
It's thread-safe: because the callback runs only in the main thread,
using a single stateful detector/tracker inside the callback does not require
synchronization with the reader/writer threads.
While the main thread processes frame N, the reader is already decoding frame N+1,
and the writer is encoding frame N-1. The three stages run concurrently, so disk I/O
overlaps with computation instead of stalling it.
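For example, a single stateful object can be used inside the callback with no locking at all; the toy "tracker" below (a stand-in for a real detector/tracker, with illustrative file names) just keeps a running count across frames:

```python
import cv2


class FrameStamper:
    """Toy stateful 'tracker': keeps a running count across frames."""

    def __init__(self):
        self.seen = 0

    def __call__(self, frame, idx):
        # Mutates internal state; safe because this only runs in the main thread.
        self.seen += 1
        cv2.putText(frame, f"frames seen: {self.seen}", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        return frame


threaded_pipeline_sketch("input.mp4", "output.mp4", FrameStamper(), prefetch=8)
```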
How has this change been tested? Please provide a testcase or example of how you tested the change.
I created a benchmark script to measure the performance impact of these changes
(benchmark_process_video.py; full results in full_results.txt).
I benchmarked 3 callbacks: opencv (short), opencv (long), and tracker, comparing the current process_video against the new process_video_threads.
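The rough shape of the comparison is sketched below (the real harness is benchmark_process_video.py attached to this PR; the timing loop, argument names, and callback are assumptions):

```python
import time


def bench(fn, label, repeats=5):
    # Time `fn` `repeats` times and print each run in wall-clock seconds.
    for run in range(1, repeats + 1):
        start = time.perf_counter()
        fn()
        print(f"{label} run {run}: {time.perf_counter() - start:.2f}s")


# Comparing the current and new implementations (imports of process_video /
# process_video_threads from the library under test omitted; `callback` is any
# of the three benchmarked callbacks; argument names/order assumed):
# bench(lambda: process_video("input.mp4", "out_seq.mp4", callback), "process_video")
# bench(lambda: process_video_threads("input.mp4", "out_thr.mp4", callback), "process_video_threads")
```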
Results below, 5 executions for each case:
Initially, I explored using threads and processes to parallelize process_video itself (I can push some of those prototypes if needed), but that approach was not thread-safe for stateful callbacks (e.g., trackers) and showed little improvement in profiling: most of the total time was spent on disk I/O rather than computation.
This change instead focuses on improving the I/O path, which yields a more generic and safer performance gain.