Add support to validate canonical jobspec and walk resources #21
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -41,3 +41,83 @@ Validation failed at directives: | |
| --noodles=2: 2 | ||
| Sep 09 06:48:51.615419 UTC 2025 broker.err[0]: rc2.0: python3 /code/docker/flux-validator/validate.py validate /data/docker/flux-validator/batch-invalid.sh Exited (rc=1) 0.1s | ||
| ``` | ||
|
|
||
| #### Canonical jobspecs in YAML or JSON format | ||
|
|
||
| ##### Valid | ||
| ```bash | ||
| $ docker run -it -v $(pwd):/data ghcr.io/compspec/fractale:flux-validator /data/docker/flux-validator/implicit-slot.yaml | ||
|
Member
Please add
||
| $ echo $? | ||
| ``` | ||
|
|
||
| ##### Invalid | ||
| ```bash | ||
| $ docker run -it -v $(pwd):/data ghcr.io/compspec/fractale:flux-validator /data/docker/flux-validator/implicit-slot-invalid.yaml | ||
|
Member
Let's add
||
| Traceback (most recent call last): | ||
| File "/code/docker/flux-validator/validate.py", line 113, in <module> | ||
| run_command() | ||
| File "/code/docker/flux-validator/validate.py", line 75, in run_command | ||
| return validate(args.path) | ||
| File "/code/docker/flux-validator/validate.py", line 99, in validate | ||
| jobspec = validate_jobspec(json_content) | ||
| File "/usr/lib/python3.10/site-packages/flux/job/Jobspec.py", line 131, in validate_jobspec | ||
| jobspec = Jobspec(**jobspec_obj) | ||
| File "/usr/lib/python3.10/site-packages/flux/job/Jobspec.py", line 198, in __init__ | ||
| self._validate_resource(res) | ||
| File "/usr/lib/python3.10/site-packages/flux/job/Jobspec.py", line 306, in _validate_resource | ||
| raise ValueError("slots must have labels") | ||
| ValueError: slots must have labels | ||
|
Member
If this is the output going to an agent, a few thoughts to consider:
I am also getting the exit of the broker for the output: Nov 03 07:44:12.177820 UTC 2025 broker.err[0]: rc2.0: python3 /code/docker/flux-validator/validate.py validate /data/docker/flux-validator/implicit-slot.yaml Exited (rc=1) 0.1s
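If the raw traceback turns out to be too noisy for an agent, one possible approach is to catch the exception and emit a single structured line plus an exit code. The following is only a minimal sketch: `check_slots` is a hypothetical stand-in for the real `validate_jobspec` call, and the JSON shape of the message is an assumption, not something this PR defines.

```python
import json


def check_slots(resources):
    """Hypothetical stand-in validator that raises the same error as the traceback."""
    for res in resources:
        if res.get("type") == "slot" and "label" not in res:
            raise ValueError("slots must have labels")
        check_slots(res.get("with", []))


def validate_for_agent(resources):
    """Return (exit_code, message) instead of letting a traceback escape."""
    try:
        check_slots(resources)
    except ValueError as err:
        # One machine-readable line; the keys here are illustrative only.
        return 1, json.dumps({"valid": False, "error": str(err)})
    return 0, json.dumps({"valid": True})


code, message = validate_for_agent([{"type": "slot", "count": 2}])
print(code, message)
```

The caller can then pass `code` to `sys.exit` so the container exit status still reflects the failure.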
Member
Note that it validates when I have a label but I change the name (e.g., default is defined, but then in the resources I called it something else). I don't know if flux checks for that.
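A check for the mismatch described above could be sketched as follows. It assumes the jobspec has already been loaded into a plain dictionary (for example with `yaml.safe_load`). The helper names `collect_slot_labels` and `undefined_task_slots` are hypothetical and not part of flux or fractale. Whether flux performs an equivalent check itself is not established here.

```python
def collect_slot_labels(resources):
    """Recursively gather the labels of all slot vertices."""
    labels = set()
    for res in resources:
        if res.get("type") == "slot" and "label" in res:
            labels.add(res["label"])
        labels |= collect_slot_labels(res.get("with", []))
    return labels


def undefined_task_slots(jobspec):
    """Return task slot names that match no slot label in the resources."""
    labels = collect_slot_labels(jobspec.get("resources", []))
    return [
        task.get("slot")
        for task in jobspec.get("tasks", [])
        if task.get("slot") not in labels
    ]


# The slot is labelled "renamed" but the task still refers to "default".
jobspec = {
    "resources": [
        {"type": "node", "count": 1, "with": [
            {"type": "slot", "count": 2, "label": "renamed", "with": [
                {"type": "core", "count": 4},
            ]},
        ]},
    ],
    "tasks": [{"command": ["app"], "slot": "default", "count": {"per_slot": 1}}],
}
print(undefined_task_slots(jobspec))
```

An empty list would mean every task refers to a slot label that exists in the resource tree.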
||
| ``` | ||
|
|
||
| ##### Validate counts | ||
| Note: need to override the entrypoint. | ||
|
|
||
| ```bash | ||
| $ docker run --entrypoint flux -it -v $(pwd):/data ghcr.io/compspec/fractale:flux-validator start python3 /code/docker/flux-validator/validate.py count /data/docker/flux-validator/implicit-slot.yaml | ||
|
Member
This one is cool! One, two, three, core... ah ah ah. I am the count, I love to count! 🦇
||
| Type: node, count: 1 | ||
| Type: memory, count: 256 | ||
| Type: socket, count: 2 | ||
| Type: gpu, count: 8 | ||
| Type: slot, count: 4 | ||
| Type: L3cache, count: 4 | ||
| Type: core, count: 16 | ||
| Type: pu, count: 16 | ||
| ``` | ||
|
|
||
| Where `implicit-slot.yaml` has the following content: | ||
| ```yaml | ||
| version: 9999 | ||
| resources: | ||
| - type: node | ||
| count: 1 | ||
| with: | ||
| - type: memory | ||
| count: 256 | ||
| - type: socket | ||
| count: 2 | ||
| with: | ||
| - type: gpu | ||
| count: 4 | ||
| - type: slot | ||
| count: 2 | ||
| label: default | ||
| with: | ||
| - type: L3cache | ||
| count: 1 | ||
| with: | ||
| - type: core | ||
| count: 4 | ||
| with: | ||
| - type: pu | ||
| count: 1 | ||
|
|
||
| # a comment | ||
| attributes: | ||
| system: | ||
| duration: 3600 | ||
| tasks: | ||
| - command: [ "app" ] | ||
| slot: default | ||
| count: | ||
| per_slot: 1 | ||
| ``` | ||
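The counts printed by the `count` command are consistent with multiplying each vertex's `count` by the counts of all its ancestors (for example, 2 sockets × 4 GPUs = 8). The sketch below reproduces that arithmetic in plain Python for the resource tree of `implicit-slot.yaml`. It is not flux's `resource_walk` implementation, the function name `walk_counts` is hypothetical, and it assumes every `count` is a plain integer.

```python
def walk_counts(resources, multiplier=1):
    """Yield (type, cumulative count) for every vertex, depth first."""
    for res in resources:
        total = multiplier * res["count"]
        yield res["type"], total
        # Children inherit the cumulative count of their parent.
        yield from walk_counts(res.get("with", []), total)


# Resource tree transcribed from implicit-slot.yaml above.
resources = [
    {"type": "node", "count": 1, "with": [
        {"type": "memory", "count": 256},
        {"type": "socket", "count": 2, "with": [
            {"type": "gpu", "count": 4},
            {"type": "slot", "count": 2, "label": "default", "with": [
                {"type": "L3cache", "count": 1, "with": [
                    {"type": "core", "count": 4, "with": [
                        {"type": "pu", "count": 1},
                    ]},
                ]},
            ]},
        ]},
    ]},
]

for rtype, total in walk_counts(resources):
    print(f"Type: {rtype}, count: {total}")
```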
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,12 +2,16 @@ | |
|
|
||
| import argparse | ||
| import sys | ||
| import yaml | ||
| import json | ||
|
|
||
| from rich import box | ||
| from rich.console import Console | ||
| from rich.padding import Padding | ||
| from rich.panel import Panel | ||
|
|
||
| from flux.job.Jobspec import validate_jobspec | ||
|
|
||
| import fractale.utils as utils | ||
|
|
||
| # This will pretty print all exceptions in rich | ||
|
|
@@ -48,10 +52,17 @@ def get_parser(): | |
| description="validate flux batch script", | ||
| ) | ||
| validate.add_argument("path", help="path to batch.sh to validate") | ||
|
|
||
| count = subparsers.add_parser( | ||
| "count", | ||
| formatter_class=argparse.RawTextHelpFormatter, | ||
| description="count resources in flux batch script", | ||
| ) | ||
| count.add_argument("path", help="path to batch.yaml to count resources") | ||
| return parser | ||
|
|
||
|
|
||
| -def run_validate(): | ||
| +def run_command(): | ||
| parser = get_parser() | ||
| if len(sys.argv) == 1: | ||
| help() | ||
|
|
@@ -62,22 +73,41 @@ def run_validate(): | |
| # Here we can assume instantiated to get args | ||
| if args.command == "validate": | ||
| return validate(args.path) | ||
| elif args.command == "count": | ||
| return count_resources(args.path) | ||
| raise ValueError(f"The command {args.command} is not known") | ||
|
|
||
|
|
||
| def validate(path): | ||
|     """ | ||
|     Validate the path to a batch.sh or similar. | ||
|     """ | ||
| -    validator = Validator("batch") | ||
| +    jobspec = None | ||
|     content = utils.read_file(path) | ||
|     try: | ||
| -        # Setting fail fast to False means we will get ALL errors at once | ||
| -        validator.validate(path, fail_fast=False) | ||
| -    except Exception as e: | ||
| -        display_error(content, str(e)) | ||
| -        sys.exit(1) | ||
| +        yaml_content = yaml.safe_load(content) | ||
|
Member
nit: run
||
| +        json_content = json.dumps(yaml_content) | ||
| +    except Exception: | ||
| +        validator = Validator("batch") | ||
| +        try: | ||
| +            # Setting fail fast to False means we will get ALL errors at once | ||
| +            validator.validate(path, fail_fast=False) | ||
| +        except Exception as e: | ||
| +            display_error(content, str(e)) | ||
| +            sys.exit(1) | ||
| +    else: | ||
| +        jobspec = validate_jobspec(json_content) | ||
| +    return jobspec | ||
|
|
||
|
|
||
| def count_resources(path): | ||
| """ | ||
| Count the resources in the path to a batch.yaml or similar. | ||
| """ | ||
| jobspec = validate(path) | ||
| for res in jobspec[1].resource_walk(): | ||
| print(f"Type: {res[1]['type']}, count: {res[2]}") | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| -    run_validate() | ||
| +    run_command() | ||
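The subcommand dispatch added in this diff can be tried in isolation with the standard library alone. The sketch below mirrors the `validate` and `count` subparsers, but the handler bodies are placeholders that only return a string. Passing `dest="command"` to `add_subparsers` is an assumption inferred from the use of `args.command`, since that part of `get_parser` is not shown in the diff.

```python
import argparse


def get_parser():
    parser = argparse.ArgumentParser(description="jobspec helper (illustrative)")
    # Assumption: the real parser stores the chosen subcommand in args.command.
    subparsers = parser.add_subparsers(dest="command")
    validate = subparsers.add_parser("validate", description="validate a file")
    validate.add_argument("path", help="path to the file to validate")
    count = subparsers.add_parser("count", description="count resources")
    count.add_argument("path", help="path to the jobspec to count resources in")
    return parser


def run_command(argv):
    args = get_parser().parse_args(argv)
    # Placeholder handlers; the real ones validate or walk the jobspec.
    handlers = {
        "validate": lambda path: f"validate {path}",
        "count": lambda path: f"count {path}",
    }
    if args.command not in handlers:
        raise ValueError(f"The command {args.command} is not known")
    return handlers[args.command](args.path)


print(run_command(["count", "implicit-slot.yaml"]))
```

A dictionary of handlers is used here only to keep the sketch short; the `if`/`elif` chain in the diff behaves the same way for two commands.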
To follow the structure above, let's put this directly as another example under Valid. A comment that it is for a canonical jobspec in json/yaml will suffice to categorize it.