
Commit b61286e

Merge pull request #98450 from bergerhoffer/OSDOCS-15493
OSDOCS#15493: New AI workloads book and LWS docs
2 parents c20cb5b + fe0bb93 commit b61286e

25 files changed, +484 -0 lines changed

_attributes/common-attributes.adoc

Lines changed: 2 additions & 0 deletions
@@ -78,6 +78,8 @@ endif::[]
 :secondary-scheduler-operator: Secondary Scheduler Operator
 :descheduler-operator: Kube Descheduler Operator
 :cli-manager: CLI Manager Operator
+:lws-operator: Leader Worker Set Operator
+:kueue-prod-name: Red{nbsp}Hat build of Kueue
 // Backup and restore
 :launch: image:app-launcher.png[title="Application Launcher"]
 :mtc-first: Migration Toolkit for Containers (MTC)

_topic_maps/_topic_map.yml

Lines changed: 19 additions & 0 deletions
@@ -3424,6 +3424,25 @@ Topics:
   File: node-observability-operator
   Distros: openshift-origin,openshift-enterprise
 ---
+Name: AI workloads
+Dir: ai_workloads
+Distros: openshift-enterprise
+Topics:
+- Name: Overview of AI workloads on OpenShift Container Platform
+  File: index
+- Name: Leader Worker Set Operator
+  Dir: leader_worker_set
+  Distros: openshift-enterprise
+  Topics:
+  - Name: Leader Worker Set Operator overview
+    File: index
+  - Name: Leader Worker Set Operator release notes
+    File: lws-release-notes
+  - Name: Managing distributed workloads with the Leader Worker Set Operator
+    File: lws-managing
+  - Name: Uninstalling the Leader Worker Set Operator
+    File: lws-uninstalling
+---
 Name: Edge computing
 Dir: edge_computing
 Distros: openshift-origin,openshift-enterprise

ai_workloads/_attributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../_attributes/

ai_workloads/images

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../images/

ai_workloads/index.adoc

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+:_mod-docs-content-type: ASSEMBLY
+include::_attributes/common-attributes.adoc[]
+[id="ai-workloads-about"]
+= Overview of AI workloads on {product-title}
+
+:context: ai-workloads-about
+
+toc::[]
+
+{product-title} provides a secure, scalable foundation for running artificial intelligence (AI) workloads across training, inference, and data science workflows.
+
+// Operators for running AI workloads
+include::modules/ai-operators.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* xref:../ai_workloads/leader_worker_set/index.adoc#lws-about[{lws-operator} overview]
+
+// Exclude this for now until we can get it reviewed by the RHOAI team
+// {rhoai-full}
+// include::modules/ai-rhoai.adoc[leveloffset=+1]
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../_attributes/
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../images/
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+:_mod-docs-content-type: ASSEMBLY
+include::_attributes/common-attributes.adoc[]
+[id="lws-about"]
+= {lws-operator} overview
+
+:context: lws-about
+
+toc::[]
+
+Using large language models (LLMs) for AI/ML inference often requires significant compute resources, and workloads typically must be sharded across multiple nodes. This can make deployments complex, creating challenges around scaling, recovery from failures, and efficient pod placement.
+
+The {lws-operator} simplifies these multi-node deployments by treating a group of pods as a single, coordinated unit. It manages the lifecycle of each pod in the group, scales the entire group together, and performs updates and failure recovery at the group level to ensure consistency.
+
+// About the {lws-operator}
+include::modules/lws-about.adoc[leveloffset=+1]
+
+// LeaderWorkerSet architecture
+include::modules/lws-arch.adoc[leveloffset=+2]
+
+[role="_additional-resources"]
+[id="lws-about_additional-resources"]
+== Additional resources
+
+* link:https://lws.sigs.k8s.io/docs/overview/[LeaderWorkerSet documentation (Kubernetes)]
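
Note for readers of this diff: the overview above describes a group of pods managed as one unit, but the modules that carry the actual configuration (`modules/lws-config.adoc` and others) are not shown on this page. The manifest below is only a minimal sketch of what such a resource might look like, assuming the upstream `leaderworkerset.x-k8s.io/v1` API. The resource name, container names, and image are hypothetical placeholders and are not taken from this commit.

```yaml
# Illustrative sketch only; not part of commit b61286e.
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: example-lws                  # hypothetical name
spec:
  replicas: 2                        # number of leader-worker groups; scaling adds or removes whole groups
  leaderWorkerTemplate:
    size: 3                          # pods per group: 1 leader plus 2 workers
    restartPolicy: RecreateGroupOnPodRestart   # recover at the group level when a pod restarts
    leaderTemplate:
      spec:
        containers:
        - name: leader               # hypothetical container name
          image: registry.example.com/inference:latest   # placeholder image
    workerTemplate:
      spec:
        containers:
        - name: worker               # hypothetical container name
          image: registry.example.com/inference:latest   # placeholder image
```

Under these assumptions, `replicas` and `size` together yield six pods in two groups of three, which is one way to read "scales the entire group together" in the overview text.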
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+:_mod-docs-content-type: ASSEMBLY
+include::_attributes/common-attributes.adoc[]
+[id="lws-managing"]
+= Managing distributed workloads with the {lws-operator}
+
+:context: lws-managing
+
+toc::[]
+
+You can use the {lws-operator} to manage distributed inference workloads and process large-scale inference requests efficiently.
+
+// Installing the {lws-operator}
+include::modules/lws-install-operator.adoc[leveloffset=+1]
+
+// Deploying a leader worker set
+include::modules/lws-config.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+[id="lws-managing_additional-resources"]
+== Additional resources
+
+* link:https://lws.sigs.k8s.io/docs/reference/leaderworkerset.v1/[LeaderWorkerSet API (Kubernetes)]
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+:_mod-docs-content-type: ASSEMBLY
+include::_attributes/common-attributes.adoc[]
+[id="lws-release-notes"]
+= {lws-operator} release notes
+
+:context: lws-release-notes
+
+toc::[]
+
+You can use the {lws-operator} to manage distributed inference workloads and process large-scale inference requests efficiently.
+
+These release notes track the development of the {lws-operator}.
+
+For more information, see xref:../../ai_workloads/leader_worker_set/index.adoc#lws-about_lws-about[About the {lws-operator}].
+
+// Release notes for Leader Worker Set Operator 1.0.0
+include::modules/lws-rn-1.0.0.adoc[leveloffset=+1]
