Commit e7e5c93

elevran authored and rishi-jat committed
Add explanation of inference-scheduler relation to IGW/GIE (#393)
* elaborate relation to IGW/GIE

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* coalesce sections on relation to GIE

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

---------

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
1 parent 4dc254d commit e7e5c93

1 file changed: +20 -9 lines changed

README.md

Lines changed: 20 additions & 9 deletions
@@ -12,11 +12,30 @@ the llm-d inference framework.
 
 This provides an "Endpoint Picker (EPP)" component to the llm-d inference
 framework which schedules incoming inference requests to the platform via a
-[Kubernetes] Gateway according to scheduler plugins. For more details on the llm-d inference scheduler architecture, routing logic, and different plugins (filters and scorers), including plugin configuration, see the [Architecture Documentation]).
+[Kubernetes] Gateway according to scheduler plugins. For more details on the
+llm-d inference scheduler architecture, routing logic, and different plugins
+(filters and scorers), including plugin configuration, see the [Architecture Documentation]).
+
+### Relation to GIE (IGW)
 
 The EPP extends the [Gateway API Inference Extension (GIE)] project,
 which provides the API resources and machinery for scheduling. We add some
 custom features that are specific to llm-d here, such as [P/D Disaggregation].
+The two projects collaborate closely as often a feature in llm-d might require
+enablement and extensions in the GIE code base.
+Unique and experimental features may start in llm-d and migrate, over time, to
+GIE. As a project goal, we prefer to upstream functionality to GIE when
+- it has matured sufficiently and has proven wide applicability and usefulness; and
+- it can be implemented in EPP alone (i.e., llm-d provides a full inference framework,
+beyond scheduling).
+
+Note that in general features should go to the upstream [Gateway API Inference
+Extension (GIE)] project _first_ if applicable. The GIE is a major dependency of
+ours, and where most _general purpose_ inference features live. If you have
+something that you feel is general purpose or use, it probably should go to the
+GIE. If you have something that's _llm-d specific_ then it should go here. If
+you're not sure whether your feature belongs here or in the GIE, feel free to
+create a [discussion] or ask on [Slack].
 
 A compatible [Gateway API] implementation is used as the Gateway. The Gateway
 API implementation must utilize [Envoy] and support [ext-proc], as this is the
@@ -41,14 +60,6 @@ For large changes please [create an issue] first describing the change so the
 maintainers can do an assessment, and work on the details with you. See
 [DEVELOPMENT.md](DEVELOPMENT.md) for details on how to work with the codebase.
 
-Note that in general features should go to the upstream [Gateway API Inference
-Extension (GIE)] project _first_ if applicable. The GIE is a major dependency of
-ours, and where most _general purpose_ inference features live. If you have
-something that you feel is general purpose or use, it probably should go to the
-GIE. If you have something that's _llm-d specific_ then it should go here. If
-you're not sure whether your feature belongs here or in the GIE, feel free to
-create a [discussion] or ask on [Slack].
-
 Contributions are welcome!
 
 [create an issue]:https://github.com/llm-d/llm-d-inference-scheduler/issues/new
