3 | 3 | ==================================== |
4 | 4 | Understanding job polling in ReFrame |
5 | 5 | ==================================== |
| 6 | + |
| 7 | + |
| 8 | +ReFrame executes the "compile" and "run" phases of the :doc:`test pipeline <pipeline>` by spawning "jobs" that will build and execute the test, respectively. |
| 9 | +A job may be a simple local process or a batch job submitted to a job scheduler, such as Slurm. |
| 10 | + |
| 11 | +ReFrame monitors the progress of its spawned jobs through polling. |
| 12 | +It does so in a careful way to avoid overloading the software infrastructure of the job scheduler. |
| 13 | +For example, it will try to poll the status of all its pending jobs at once using a single job scheduler command. |
| 14 | + |
| 15 | +ReFrame adjusts its polling rate dynamically using an exponential decay function to ensure both high interactivity and low load. |
| 16 | +Polling starts at a high rate and -- in the absence of any job status changes -- it gradually decays to a minimum value. |
| 17 | +After this point the polling rate remains constant. |
| 18 | +However, whenever a job completes, ReFrame resets its polling rate to the maximum, so that it can quickly reap any other jobs that finish at around the same time. |
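
The decay-and-reset behavior described above can be illustrated with a minimal sketch. The formula below is an assumption made for illustration only, not necessarily the function ReFrame implements; the names ``poll_rate`` and ``poll_interval`` are hypothetical, and the defaults mirror the settings quoted in the figure caption.

```python
import math


def poll_rate(elapsed, rate_max=10.0, rate_min=0.1, decay=0.1):
    """Hypothetical polling rate (polls/s) `elapsed` seconds after a reset.

    Assumed model: start at `rate_max` and decay exponentially
    towards `rate_min`. ReFrame's actual function may differ.
    """
    return rate_min + (rate_max - rate_min) * math.exp(-decay * elapsed)


def poll_interval(elapsed, **kwargs):
    """Seconds to wait before the next poll under the assumed model."""
    return 1.0 / poll_rate(elapsed, **kwargs)


# A job completion is modeled by restarting the clock: elapsed = 0
# gives the maximum rate again.
rate_after_reset = poll_rate(0.0)
rate_long_idle = poll_rate(1000.0)
```

In this model, resetting the polling rate when a job completes amounts to setting ``elapsed`` back to zero, and a long period without status changes leaves the rate effectively at the minimum.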
| 19 | + |
| 20 | +The following figure shows the instant polling rates (desired and current), as well as the global polling rate measured from the beginning of the run loop. |
| 21 | +The workload is a series of 6 tests, where the i-th test sleeps for ``10*i`` seconds. |
| 22 | + |
| 23 | +.. figure:: _static/img/polling-rates.svg |
| 24 | + :align: center |
| 25 | + |
| 26 | + :sub:`Instant and global polling rates of ReFrame as it executes a workload of six tests that sleep for different amounts of time. The default polling settings are used (poll_rate_max=10, poll_rate_min=0.1, poll_rate_decay=0.1).` |
| 27 | + |
| 28 | +Note how ReFrame resets the instant polling rate whenever a test job finishes. |
| 29 | + |
| 30 | +Users can control the maximum and minimum instant polling rates as well as the polling rate decay through either :ref:`environment variables <polling_envvars>` or :ref:`configuration parameters <polling_config>`. |
| 31 | + |
| 32 | + |
| 33 | +Polling randomization |
| 34 | +--------------------- |
| 35 | + |
| 36 | +If multiple ReFrame processes execute the same workload at the same time, the aggregate polling rate can be quite high, potentially stressing the batch scheduler infrastructure. |
| 37 | +The following figure shows the histogram of polls when running 10 ReFrame processes concurrently, each of them executing a series of 6 tests with varying sleep times (see above): |
| 38 | + |
| 39 | + |
| 40 | +.. figure:: _static/img/polling-multiproc-default.svg |
| 41 | + :align: center |
| 42 | + |
| 43 | + :sub:`Poll count histogram of 10 ReFrame processes running the same workload concurrently. Each histogram bin corresponds to a second.` |
| 44 | + |
| 45 | +Note how the total polling rate can significantly exceed the maximum polling rate set in each ReFrame process. |
| 46 | + |
| 47 | +One option would be to reduce the maximum polling rate of every process, so that their aggregation falls below a certain threshold. |
| 48 | +Alternatively, you can instruct ReFrame to randomize the polling interval duration. |
| 49 | +This has a less drastic effect than reducing the maximum polling rate, but it preserves the original polling characteristics while smoothing out the spikes. |
| 50 | + |
| 51 | +The following figure shows the poll histogram obtained by setting ``RFM_POLL_RANDOMIZE=-500,1500``. |
| 52 | +This allows ReFrame to randomly shorten the polling interval by up to 500ms or extend it by up to 1500ms. |
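
As a rough illustration of what such a setting means, the sketch below applies a random offset to a base polling interval. It is not ReFrame's implementation: the helper names are hypothetical, and both the uniform distribution and the clamping at zero are assumptions made for the example.

```python
import random


def parse_randomize(spec):
    """Parse a 'low,high' millisecond range such as '-500,1500'."""
    lo_ms, hi_ms = (int(x) for x in spec.split(','))
    return lo_ms, hi_ms


def randomized_interval(base_s, spec, rng=random):
    """Return `base_s` seconds shifted by a random offset from `spec`.

    Assumptions: the offset is drawn uniformly from the range and
    the resulting interval is never allowed to become negative.
    """
    lo_ms, hi_ms = parse_randomize(spec)
    offset_s = rng.uniform(lo_ms, hi_ms) / 1000.0
    return max(0.0, base_s + offset_s)


# Example: a 1-second base interval ends up between 0.5s and 2.5s.
rng = random.Random(0)
samples = [randomized_interval(1.0, '-500,1500', rng) for _ in range(5)]
```

Because each process draws its own offsets, processes that start in lockstep drift apart over time, which is consistent with the flatter histogram shown in the next figure.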
| 53 | + |
| 54 | +.. figure:: _static/img/polling-multiproc-randomize.svg |
| 55 | + :align: center |
| 56 | + |
| 57 | + :sub:`Poll count histogram of 10 ReFrame processes executing the same workload using polling interval randomization. Each histogram bin corresponds to a second.` |
| 58 | + |
| 59 | +Note how the spikes are now less pronounced and the polls are more evenly distributed over time. |