Commit cd1d576

feat(router): Subgraph Timeout Configuration (#541)
Implementation of Timeout in #317

Ref [ROUTER-110](https://linear.app/the-guild/issue/ROUTER-110)
Ref ROUTER-151

This also adds subgraphs and all options to traffic_shaping as in Apollo Router. So subgraph specific configuration can be done with subgraphs;

Apollo Router -> https://www.apollographql.com/docs/graphos/routing/performance/traffic-shaping#configuration

```yaml
traffic_shaping:
  all:
    request_timeout: 5s
  subgraphs:
    products:
      request_timeout:
        expression: |
          if (.request.operation.kind == "mutation") {
            "15s"
          } else {
            .default
          }
```

Documentation -> graphql-hive/console#7214

---------

Co-authored-by: Kamil Kisiela <kamil.kisiela@gmail.com>
1 parent a050b8a commit cd1d576

33 files changed: +1026 -259 lines changed

.changeset/asd.md

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+---
+config: minor
+router: minor
+executor: minor
+---
+
+# Subgraph Request Timeout Feature
+
+Adds support for configurable subgraph request timeouts via the `traffic_shaping` configuration. The `request_timeout` option allows you to specify the maximum time the router will wait for a response from a subgraph before timing out the request. You can set a static timeout (e.g., `30s`) globally or per-subgraph, or use dynamic timeouts with VRL expressions to vary timeout values based on request characteristics. This helps protect your router from hanging requests and enables fine-grained control over how long requests to different subgraphs should be allowed to run.
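
The static and dynamic timeouts described in this changeset can be pictured with a small sketch. It is not the router's implementation: `OperationKind`, `RequestInfo`, `TimeoutSource`, `TrafficShaping`, and `timeout_for` are hypothetical names, and a plain Rust closure stands in for a compiled VRL expression. The second closure argument plays the role of `.default` from the commit message example.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical operation kind; the real request context carries more data.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OperationKind {
    Query,
    Mutation,
}

/// Hypothetical, minimal view of the incoming request.
pub struct RequestInfo {
    pub operation_kind: OperationKind,
}

/// A timeout is either a fixed duration or computed per request.
/// The closure is a stand-in for a compiled VRL expression (assumption);
/// it receives the request and the global default.
pub enum TimeoutSource {
    Fixed(Duration),
    Dynamic(Box<dyn Fn(&RequestInfo, Duration) -> Duration>),
}

/// Hypothetical container mirroring `traffic_shaping.all` and
/// `traffic_shaping.subgraphs` for the timeout option only.
pub struct TrafficShaping {
    pub all: Duration,
    pub subgraphs: HashMap<String, TimeoutSource>,
}

impl TrafficShaping {
    /// Picks the timeout for one subgraph request: a per-subgraph entry
    /// wins, otherwise the global value from `all` applies.
    pub fn timeout_for(&self, subgraph: &str, req: &RequestInfo) -> Duration {
        match self.subgraphs.get(subgraph) {
            Some(TimeoutSource::Fixed(duration)) => *duration,
            Some(TimeoutSource::Dynamic(expression)) => expression(req, self.all),
            None => self.all,
        }
    }
}
```

With `all` set to 5s and a dynamic entry for `products` that returns 15s for mutations, a mutation sent to `products` would wait up to 15s, while every other request falls back to 5s.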

.gemini/styleguide.md

Lines changed: 27 additions & 0 deletions
@@ -81,6 +81,33 @@ async fn handle(user: &User, req: &Request) -> Result<Response> {
     Ok(...)
 }

+---
+
+## `std::time::Duration` in `router-config` Crate
+
+When using `std::time::Duration` in the `router-config` crate **only**, you **must** add both serde and schemars attributes:
+
+```rust
+use std::time::Duration;
+
+#[derive(serde::Serialize, serde::Deserialize)]
+struct Config {
+    #[serde(
+        deserialize_with = "humantime_serde::deserialize",
+        serialize_with = "humantime_serde::serialize",
+    )]
+    #[schemars(with = "String")]
+    timeout: Duration,
+}
+```
+
+- **`#[serde(...)]`** enables human-readable time formats (e.g., `"30s"`, `"1m30s"`) in config files.
+- **`#[schemars(with = "String")]`** ensures the JSON schema correctly represents the field as a string, not as a numeric value.
+
+**Important:** This pattern applies **only** to the `router-config` crate.
+
+---
+
 ## Releasing

 We are using `knope` with changesets for declaring changes. If you detect a new file in a PR under `.changeset/` directory, please confirm the following rules:
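
The styleguide entry above relies on human-readable duration strings such as `"30s"` and `"1m30s"`. To show what such a string means, here is a deliberately simplified, std-only parser. It is not how `humantime_serde` works internally, and it accepts only a small subset of what that crate understands; `parse_duration` is a hypothetical helper written for this sketch.

```rust
use std::time::Duration;

/// Parses strings like "30s", "1m30s", "2h" or "250ms" into a `Duration`.
/// Illustrative only: the supported units (h, m, s, ms) are an assumption
/// of this sketch, not a description of humantime's grammar.
pub fn parse_duration(input: &str) -> Result<Duration, String> {
    let mut chars = input.trim().chars().peekable();
    if chars.peek().is_none() {
        return Err("empty duration".to_string());
    }

    let mut total = Duration::ZERO;
    while chars.peek().is_some() {
        // Read the numeric part of the next segment.
        let mut digits = String::new();
        while let Some(c) = chars.peek().copied().filter(|c| c.is_ascii_digit()) {
            digits.push(c);
            chars.next();
        }
        // Read the unit that follows the number.
        let mut unit = String::new();
        while let Some(c) = chars.peek().copied().filter(|c| c.is_ascii_alphabetic()) {
            unit.push(c);
            chars.next();
        }
        if digits.is_empty() || unit.is_empty() {
            return Err(format!("malformed duration `{}`", input));
        }

        let value = digits
            .parse::<u64>()
            .map_err(|e| format!("invalid number `{}`: {}", digits, e))?;
        let part = match unit.as_str() {
            "ms" => Duration::from_millis(value),
            "s" => Duration::from_secs(value),
            "m" => Duration::from_secs(value.saturating_mul(60)),
            "h" => Duration::from_secs(value.saturating_mul(3600)),
            other => return Err(format!("unknown unit `{}`", other)),
        };
        total = total
            .checked_add(part)
            .ok_or_else(|| "duration overflow".to_string())?;
    }
    Ok(total)
}
```

Because the config value is a string rather than a `{secs, nanos}` object, the JSON schema has to describe it as a string, which is the job of the `#[schemars(with = "String")]` attribute in the styleguide example.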

Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default.

bin/router/src/pipeline/progressive_override.rs

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@ use std::collections::{BTreeMap, HashMap, HashSet};

 use hive_router_config::override_labels::{LabelOverrideValue, OverrideLabelsConfig};
 use hive_router_plan_executor::{
-    execution::client_request_details::ClientRequestDetails, utils::expression::compile_expression,
+    execution::client_request_details::ClientRequestDetails, expressions::CompileExpression,
 };
 use hive_router_query_planner::{
     graph::{PlannerOverrideContext, PERCENTAGE_SCALE_FACTOR},
@@ -135,7 +135,7 @@ impl OverrideLabelsEvaluator {
                     static_enabled_labels.insert(label.clone());
                 }
                 LabelOverrideValue::Expression { expression } => {
-                    let program = compile_expression(expression, None).map_err(|err| {
+                    let program = expression.compile_expression(None).map_err(|err| {
                         OverrideLabelsCompileError {
                             label: label.clone(),
                             error: err.to_string(),
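
The change above replaces a free function call with a method call, which suggests `CompileExpression` is an extension trait implemented for string types. The sketch below shows that pattern in isolation. The trait body is an assumption: the `Program` type is a hypothetical stand-in for a compiled VRL program, the "compilation" only trims and validates the source, and the argument that the real method takes (`None` in the diff) is omitted.

```rust
/// Hypothetical stand-in for a compiled expression program.
#[derive(Debug, PartialEq)]
pub struct Program {
    pub source: String,
}

/// Extension trait: adds a `compile_expression` method to string slices.
/// The signature here is simplified compared with the one used in the diff.
pub trait CompileExpression {
    fn compile_expression(&self) -> Result<Program, String>;
}

impl CompileExpression for str {
    fn compile_expression(&self) -> Result<Program, String> {
        // Placeholder "compilation": reject empty input, keep the trimmed source.
        let trimmed = self.trim();
        if trimmed.is_empty() {
            return Err("expression is empty".to_string());
        }
        Ok(Program {
            source: trimmed.to_string(),
        })
    }
}
```

Implementing the trait for `str` also makes the method callable on a `String` through auto-deref, so a call site like `expression.compile_expression(...)` reads the same for both types once the trait is in scope.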

bin/router/src/schema_state.rs

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ pub enum SupergraphManagerError {
     PlannerBuilderError(#[from] PlannerError),
     #[error("Failed to build authorization: {0}")]
     AuthorizationMetadataError(#[from] AuthorizationMetadataError),
-    #[error("Failed to init executor: {0}")]
+    #[error(transparent)]
     ExecutorInitError(#[from] SubgraphExecutorError),
     #[error("Unexpected: failed to load initial supergraph")]
     FailedToLoadInitialSupergraph,
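
Switching to `#[error(transparent)]` changes what the wrapper error prints: instead of prefixing the inner message, it forwards `Display` and `source()` to the wrapped error. The sketch below writes both behaviours by hand with the standard library only, as an approximation of what `thiserror` derives. `ExecutorError`, `Prefixed`, and `Transparent` are hypothetical types introduced for the comparison.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical inner error, standing in for `SubgraphExecutorError`.
#[derive(Debug)]
pub struct ExecutorError(pub String);

impl fmt::Display for ExecutorError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for ExecutorError {}

/// Old behaviour: the wrapper adds its own prefix and exposes the inner
/// error as its source.
#[derive(Debug)]
pub enum Prefixed {
    ExecutorInit(ExecutorError),
}

impl fmt::Display for Prefixed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Prefixed::ExecutorInit(inner) => write!(f, "Failed to init executor: {}", inner),
        }
    }
}

impl Error for Prefixed {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Prefixed::ExecutorInit(inner) => Some(inner),
        }
    }
}

/// New behaviour: the wrapper is invisible; message and source both come
/// straight from the inner error.
#[derive(Debug)]
pub enum Transparent {
    ExecutorInit(ExecutorError),
}

impl fmt::Display for Transparent {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Transparent::ExecutorInit(inner) => fmt::Display::fmt(inner, f),
        }
    }
}

impl Error for Transparent {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Transparent::ExecutorInit(inner) => inner.source(),
        }
    }
}
```

A likely motivation, though the commit does not state it, is to avoid a doubled prefix when the inner error message already explains what failed.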

docs/README.md

Lines changed: 47 additions & 78 deletions
@@ -16,7 +16,7 @@
 |[**override\_subgraph\_urls**](#override_subgraph_urls)|`object`|Configuration for overriding subgraph URLs.<br/>Default: `{}`<br/>||
 |[**query\_planner**](#query_planner)|`object`|Query planning configuration.<br/>Default: `{"allow_expose":false,"timeout":"10s"}`<br/>||
 |[**supergraph**](#supergraph)|`object`|Configuration for the Federation supergraph source. By default, the router will use a local file-based supergraph source (`./supergraph.graphql`).<br/>||
-|[**traffic\_shaping**](#traffic_shaping)|`object`|Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.<br/>Default: `{"dedupe_enabled":true,"max_connections_per_host":100,"pool_idle_timeout":"50s"}`<br/>||
+|[**traffic\_shaping**](#traffic_shaping)|`object`|Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.<br/>Default: `{"all":{"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"},"max_connections_per_host":100}`<br/>||

 **Additional Properties:** not allowed
 **Example**
@@ -113,9 +113,11 @@ query_planner:
   timeout: 10s
 supergraph: {}
 traffic_shaping:
-  dedupe_enabled: true
+  all:
+    dedupe_enabled: true
+    pool_idle_timeout: 50s
+    request_timeout: 30s
   max_connections_per_host: 100
-  pool_idle_timeout: 50s

 ```

@@ -1744,7 +1746,7 @@ The path can be either absolute or relative to the router's working directory.
 |Name|Type|Description|Required|
 |----|----|-----------|--------|
 |**path**|`string`|The path to the supergraph file.<br/><br/>Can also be set using the `SUPERGRAPH_FILE_PATH` environment variable.<br/>Format: `"path"`<br/>|yes|
-|[**poll\_interval**](#option1poll_interval)|`object`, `null`|Optional interval at which the file should be polled for changes.<br/>|yes|
+|**poll\_interval**|`string`|Optional interval at which the file should be polled for changes.<br/>If not provided, the file will only be loaded once when the router starts.<br/>|no|
 |**source**|`string`|Constant Value: `"file"`<br/>|yes|

 **Additional Properties:** not allowed
@@ -1766,11 +1768,11 @@ Loads a supergraph from Hive Console CDN.
 |Name|Type|Description|Required|
 |----|----|-----------|--------|
 |**accept\_invalid\_certs**|`boolean`|Whether to accept invalid TLS certificates when connecting to the Hive Console CDN.<br/>Default: `false`<br/>|no|
-|[**connect\_timeout**](#option2connect_timeout)|`object`|Connect timeout for the Hive Console CDN requests.<br/>Default: `"10s"`<br/>|yes|
+|**connect\_timeout**|`string`|Connect timeout for the Hive Console CDN requests.<br/>Default: `"10s"`<br/>|no|
 |**endpoint**|`string`|The CDN endpoint from Hive Console target.<br/><br/>Can also be set using the `HIVE_CDN_ENDPOINT` environment variable.<br/>|yes|
 |**key**|`string`|The CDN Access Token with from the Hive Console target.<br/><br/>Can also be set using the `HIVE_CDN_KEY` environment variable.<br/>|yes|
-|[**poll\_interval**](#option2poll_interval)|`object`|Interval at which the Hive Console should be polled for changes.<br/>Default: `"10s"`<br/>|yes|
-|[**request\_timeout**](#option2request_timeout)|`object`|Request timeout for the Hive Console CDN requests.<br/>Default: `"1m"`<br/>|yes|
+|**poll\_interval**|`string`|Interval at which the Hive Console should be polled for changes.<br/><br/>Can also be set using the `HIVE_CDN_POLL_INTERVAL` environment variable.<br/>Default: `"10s"`<br/>|no|
+|**request\_timeout**|`string`|Request timeout for the Hive Console CDN requests.<br/>Default: `"1m"`<br/>|no|
 |[**retry\_policy**](#option2retry_policy)|`object`|Interval at which the Hive Console should be polled for changes.<br/>Default: `{"max_retries":10}`<br/>|yes|
 |**source**|`string`|Constant Value: `"hive"`<br/>|yes|

@@ -1788,132 +1790,99 @@ retry_policy:
 ```


-<a name="option1poll_interval"></a>
-## Option 1: poll\_interval: object,null
-
-Optional interval at which the file should be polled for changes.
-If not provided, the file will only be loaded once when the router starts.
-
-
-**Properties**
-
-|Name|Type|Description|Required|
-|----|----|-----------|--------|
-|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
-|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|
-
-**Example**
-
-```yaml
-{}
-
-```
+<a name="option2retry_policy"></a>
+## Option 2: retry\_policy: object

-<a name="option2connect_timeout"></a>
-## Option 2: connect\_timeout: object
+Interval at which the Hive Console should be polled for changes.

-Connect timeout for the Hive Console CDN requests.
+By default, an exponential backoff retry policy is used, with 10 attempts.


 **Properties**

 |Name|Type|Description|Required|
 |----|----|-----------|--------|
-|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
-|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|
+|**max\_retries**|`integer`|The maximum number of retries to attempt.<br/><br/>Retry mechanism is based on exponential backoff, see https://docs.rs/retry-policies/latest/retry_policies/policies/struct.ExponentialBackoff.html for additional details.<br/>Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|

 **Example**

 ```yaml
-10s
+max_retries: 10

 ```

-<a name="option2poll_interval"></a>
-## Option 2: poll\_interval: object
-
-Interval at which the Hive Console should be polled for changes.
+<a name="traffic_shaping"></a>
+## traffic\_shaping: object

-Can also be set using the `HIVE_CDN_POLL_INTERVAL` environment variable.
+Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.


 **Properties**

 |Name|Type|Description|Required|
 |----|----|-----------|--------|
-|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
-|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|
+|[**all**](#traffic_shapingall)|`object`|The default configuration that will be applied to all subgraphs, unless overridden by a specific subgraph configuration.<br/>Default: `{"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"}`<br/>||
+|**max\_connections\_per\_host**|`integer`|Limits the concurrent amount of requests/connections per host/subgraph.<br/>Default: `100`<br/>Format: `"uint"`<br/>Minimum: `0`<br/>||
+|[**subgraphs**](#traffic_shapingsubgraphs)|`object`|Optional per-subgraph configurations that will override the default configuration for specific subgraphs.<br/>||

+**Additional Properties:** not allowed
 **Example**

 ```yaml
-10s
+all:
+  dedupe_enabled: true
+  pool_idle_timeout: 50s
+  request_timeout: 30s
+max_connections_per_host: 100

 ```

-<a name="option2request_timeout"></a>
-## Option 2: request\_timeout: object
+<a name="traffic_shapingall"></a>
+### traffic\_shaping\.all: object

-Request timeout for the Hive Console CDN requests.
+The default configuration that will be applied to all subgraphs, unless overridden by a specific subgraph configuration.


 **Properties**

 |Name|Type|Description|Required|
 |----|----|-----------|--------|
-|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
-|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|
+|**dedupe\_enabled**|`boolean`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly matches the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>Default: `true`<br/>||
+|**pool\_idle\_timeout**|`string`|Timeout for idle sockets being kept-alive.<br/>Default: `"50s"`<br/>||
+|**request\_timeout**||Optional timeout configuration for requests to subgraphs.<br/><br/>Example with a fixed duration:<br/>```yaml<br/> timeout:<br/> duration: 5s<br/>```<br/><br/>Or with a VRL expression that can return a duration based on the operation kind:<br/>```yaml<br/> timeout:<br/> expression: \|<br/> if (.request.operation.type == "mutation") {<br/> "10s"<br/> } else {<br/> "15s"<br/> }<br/>```<br/>Default: `"30s"`<br/>||

+**Additional Properties:** not allowed
 **Example**

 ```yaml
-1m
+dedupe_enabled: true
+pool_idle_timeout: 50s
+request_timeout: 30s

 ```

-<a name="option2retry_policy"></a>
-## Option 2: retry\_policy: object
+<a name="traffic_shapingsubgraphs"></a>
+### traffic\_shaping\.subgraphs: object

-Interval at which the Hive Console should be polled for changes.
-
-By default, an exponential backoff retry policy is used, with 10 attempts.
+Optional per-subgraph configurations that will override the default configuration for specific subgraphs.


-**Properties**
+**Additional Properties**

 |Name|Type|Description|Required|
 |----|----|-----------|--------|
-|**max\_retries**|`integer`|The maximum number of retries to attempt.<br/><br/>Retry mechanism is based on exponential backoff, see https://docs.rs/retry-policies/latest/retry_policies/policies/struct.ExponentialBackoff.html for additional details.<br/>Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
-
-**Example**
-
-```yaml
-max_retries: 10
-
-```
-
-<a name="traffic_shaping"></a>
-## traffic\_shaping: object
-
-Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.
+|[**Additional Properties**](#traffic_shapingsubgraphsadditionalproperties)|`object`|||

+<a name="traffic_shapingsubgraphsadditionalproperties"></a>
+#### traffic\_shaping\.subgraphs\.additionalProperties: object

 **Properties**

 |Name|Type|Description|Required|
 |----|----|-----------|--------|
-|**dedupe\_enabled**|`boolean`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly matches the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>Default: `true`<br/>||
-|**max\_connections\_per\_host**|`integer`|Limits the concurrent amount of requests/connections per host/subgraph.<br/>Default: `100`<br/>Format: `"uint"`<br/>Minimum: `0`<br/>||
-|**pool\_idle\_timeout**|`string`|Timeout for idle sockets being kept-alive.<br/>Default: `"50s"`<br/>||
+|**dedupe\_enabled**|`boolean`, `null`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly matches the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>||
+|**pool\_idle\_timeout**|`string`, `null`|Timeout for idle sockets being kept-alive.<br/>||
+|**request\_timeout**||Optional timeout configuration for requests to subgraphs.<br/><br/>Example with a fixed duration:<br/>```yaml<br/> timeout:<br/> duration: 5s<br/>```<br/><br/>Or with a VRL expression that can return a duration based on the operation kind:<br/>```yaml<br/> timeout:<br/> expression: \|<br/> if (.request.operation.type == "mutation") {<br/> "10s"<br/> } else {<br/> "15s"<br/> }<br/>```<br/>||

 **Additional Properties:** not allowed
-**Example**
-
-```yaml
-dedupe_enabled: true
-max_connections_per_host: 100
-pool_idle_timeout: 50s
-
-```
-


Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# yaml-language-server: $schema=../../router-config.schema.json
+supergraph:
+  source: file
+  path: ../supergraph.graphql
+traffic_shaping:
+  all:
+    request_timeout: 2s
+    # Disable deduplication to better hunt for deadlocks in tests
+    dedupe_enabled: false
+  subgraphs:
+    accounts:
+      request_timeout:
+        expression: |
+          if (.request.headers."x-timeout" == "short") {
+            "10s"
+          } else {
+            .default
+          }

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# yaml-language-server: $schema=../../router-config.schema.json
+supergraph:
+  source: file
+  path: ../supergraph.graphql
+traffic_shaping:
+  all:
+    request_timeout: 2s
+    # Disable deduplication to better hunt for deadlocks in tests
+    dedupe_enabled: false
+  subgraphs:
+    accounts:
+      request_timeout: 5s
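
In the test config above, `accounts` overrides only `request_timeout`, and the schema marks the per-subgraph fields as nullable. One plausible reading is that each unset field falls back to the value in `all`. The sketch below models that reading. The field-by-field merge is an assumption, not something the diff confirms, and `Defaults`, `SubgraphOverride`, and `effective` are hypothetical names.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Fully resolved settings, mirroring the fields of `traffic_shaping.all`.
#[derive(Debug, Clone, PartialEq)]
pub struct Defaults {
    pub dedupe_enabled: bool,
    pub pool_idle_timeout: Duration,
    pub request_timeout: Duration,
}

/// Per-subgraph settings: `None` models a field left out (or `null`).
#[derive(Debug, Default)]
pub struct SubgraphOverride {
    pub dedupe_enabled: Option<bool>,
    pub pool_idle_timeout: Option<Duration>,
    pub request_timeout: Option<Duration>,
}

/// Resolves the settings for one subgraph by overlaying its override on
/// top of the global defaults, field by field (assumed merge strategy).
pub fn effective(
    all: &Defaults,
    subgraphs: &HashMap<String, SubgraphOverride>,
    name: &str,
) -> Defaults {
    match subgraphs.get(name) {
        None => all.clone(),
        Some(o) => Defaults {
            dedupe_enabled: o.dedupe_enabled.unwrap_or(all.dedupe_enabled),
            pool_idle_timeout: o.pool_idle_timeout.unwrap_or(all.pool_idle_timeout),
            request_timeout: o.request_timeout.unwrap_or(all.request_timeout),
        },
    }
}
```

Under this reading, `accounts` would get a 5s request timeout while still inheriting `dedupe_enabled: false` from `all`, and any other subgraph would keep the 2s default.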

e2e/src/lib.rs

Lines changed: 2 additions & 0 deletions
@@ -16,3 +16,5 @@ mod probes;
 mod supergraph;
 #[cfg(test)]
 mod testkit;
+#[cfg(test)]
+mod timeout_per_subgraph;
