@yuejiaointel
Contributor

Description

The dev doc build is triggered when a PR is merged into main.
The release doc job just copies the existing dev doc content into the release version's doc folder.


Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
  • I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@codecov

codecov bot commented Oct 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag     Coverage Δ
azure    80.47% <ø> (ø)
github   82.07% <ø> (-0.02%) ⬇️

see 1 file with indirect coverage changes

@david-cortes-intel
Contributor

A couple questions here:

  • In the case of the job that releases from a branch, I see it tries to fetch a 'dev' version from $TEMP_DOC_FOLDER, but since those dev versions are not committed to the storage branch (and they aren't supposed to be, otherwise they'd take up too much space), where would they be pulled from?
  • I see that in this case version 'dev' would be placed before version 'latest' in the switcher menu, which is desirable, but will the unversioned link (https://uxlfoundation.github.io/scikit-learn-intelex) still redirect to 'latest'?

@yuejiaointel
Contributor Author

yuejiaointel commented Oct 29, 2025

Really good questions!

  1. The dev docs will be archived to, and pulled from, the doc_archive branch. Every merge to main overwrites the dev folder, so it should not take up too much space.
  2. Currently, the home link redirects to the newest version. I changed build-doc.sh so that it redirects to latest instead of doc_version.
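
To illustrate point 2, here is a minimal sketch of a root page that forwards the unversioned link to latest. It is written in Python for illustration only and is not the code in build-doc.sh; the function name `write_root_redirect` and the meta-refresh approach are assumptions.

```python
from pathlib import Path


def write_root_redirect(site_root: Path, target: str = "latest") -> Path:
    """Write an index.html at the site root that forwards to <target>/index.html.

    Hypothetical helper: the real redirect logic lives in build-doc.sh.
    """
    html = (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "  <head>\n"
        f'    <meta http-equiv="refresh" content="0; url=./{target}/index.html">\n'
        "  </head>\n"
        "</html>\n"
    )
    index = site_root / "index.html"
    index.write_text(html, encoding="utf-8")
    return index
```

Pointing the redirect at the fixed name latest, rather than at the version being built, means a dev build cannot change where the home link lands.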

Entire planned workflow (any suggestions are welcome!)
dev mode: triggered by a merge to the main branch

  1. Set environment variables: IS_DEV_BUILD=true, SHORT_DOC_VERSION=dev
  2. Build documentation in doc/_build/scikit-learn-intelex/dev/
  3. Sync existing versions from doc_archive and gh-pages to temp folder
  4. Update dev folder with newly built docs
  5. Generate versions.json in order: dev, latest, 2024.9, etc.
  6. Archive dev to doc_archive branch
  7. Deploy to GitHub Pages
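
The ordering in step 5 can be sketched as follows. This is only an illustration: the function names are hypothetical, and the flat-list layout of versions.json is an assumption, since the real schema is not shown here. The part worth noting is the numeric comparison, because a plain string sort would place 2024.9 above 2024.10.

```python
import json


def order_versions(names):
    """Order doc folder names as: dev, latest, then releases newest first."""
    pinned = [n for n in ("dev", "latest") if n in names]
    releases = [n for n in names if n not in ("dev", "latest")]
    # Compare components as integers so that 2024.10 ranks above 2024.9;
    # a plain string sort would get this wrong.
    releases.sort(key=lambda v: tuple(int(p) for p in v.split(".")), reverse=True)
    return pinned + releases


def versions_json(names):
    """Serialize the ordered names; the flat-list layout is hypothetical."""
    return json.dumps(order_versions(names), indent=2)


print(order_versions(["2024.9", "latest", "2024.10", "dev"]))
# prints ['dev', 'latest', '2024.10', '2024.9']
```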

release mode: triggered by a tag push

  1. Set environment variables: IS_DEV_BUILD=false, SHORT_DOC_VERSION=2024.10
  2. Checkout the tag's code
  3. Skip building (no code build, no doc build)
  4. Sync existing versions from doc_archive and gh-pages to temp folder (gets /dev/)
  5. Copy /dev/ → /2024.10/ and /dev/ → /latest/
  6. Generate versions.json in order: dev, latest, 2024.10, 2024.9, etc.
  7. Archive /2024.10/ to doc_archive branch
  8. Deploy to GitHub Pages
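
The copy in step 5 of release mode could look roughly like the sketch below. The function name `publish_release_from_dev` is hypothetical, and it assumes the synced temp folder already contains a dev/ directory; it fails loudly when that is not the case, since the release job builds nothing itself.

```python
import shutil
from pathlib import Path


def publish_release_from_dev(site_root: Path, version: str) -> None:
    """Copy <site_root>/dev to <site_root>/<version> and <site_root>/latest."""
    dev = site_root / "dev"
    if not dev.is_dir():
        # Release mode skips the doc build, so there is nothing to publish.
        raise FileNotFoundError(f"no dev docs found at {dev}")
    for name in (version, "latest"):
        target = site_root / name
        if target.exists():
            # Replace any previous copy instead of merging into it.
            shutil.rmtree(target)
        shutil.copytree(dev, target)
```

Removing the old latest/ before copying avoids leaving behind pages that no longer exist in the new release.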

@david-cortes-intel
Contributor

Got it. But for point 6) in the 1st part, since these docs take ~60 MB each and we can expect a few thousand commits per year, it sounds like the branch will quickly grow too large, which will make further downloads slower and might hit GitHub's size limits (not sure which tier we have).
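
As a rough upper bound, the arithmetic looks like this. The figure of 2000 commits is an assumed stand-in for "a few thousand", and the estimate treats every commit as adding a full copy of the docs, so real growth would be lower wherever pages are unchanged or compress well.

```python
DOC_SIZE_MB = 60         # "~60 MB each", from the comment above
COMMITS_PER_YEAR = 2000  # assumed value for "a few thousand"


def worst_case_growth_gb(doc_size_mb: float, commits: int) -> float:
    """Upper bound on yearly growth if every commit stored a full copy."""
    return doc_size_mb * commits / 1000  # using 1 GB = 1000 MB


print(worst_case_growth_gb(DOC_SIZE_MB, COMMITS_PER_YEAR))  # prints 120.0
```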

Is there some other way to handle the development docs that wouldn't consume as much space? For example, could they be deployed directly to GitHub Pages without being stored when a merge to main triggers the job, and then downloaded from the current GitHub Pages site (not the git branch) when a release triggers it?

@david-cortes-intel
Contributor

Or alternatively, since there would be regular pushes to the main branch either way, could the release doc job just store the docs in the archive branch without deploying, and then have the merge-to-main job pull from that branch and add the current development version?

@yuejiaointel
Contributor Author

I still don't understand why it would consume a lot of space, since on every merge to main we are just replacing the current dev docs. We could have a lot of commits, but we are not storing the dev docs for each commit, just the latest dev docs from main. We are also limiting the trigger to doc-related changes: if a PR merges to main but does not change any docs, doc-release.yml will not be triggered either.

@david-cortes-intel
Contributor

"We could have a lot of commits but we are not storing each dev doc commit, just the latest main dev doc commit"

Yes, but each of those would be on a separate commit in the storage branch. And since git stores snapshots rather than file differences, the branch would quickly grow a lot in size.
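
To make that concrete, here is a small sketch of how git names a file's content, assuming the default SHA-1 object format. The helper name `git_blob_id` is made up for illustration. A page that differs in any byte gets a new object ID, while a byte-identical page maps to the same one. The old objects stay reachable through the branch history, so overwriting the dev folder on each commit does not free the earlier versions; packing can shrink them on disk but does not drop them.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a file with this content (SHA-1 repos)."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


old_page = b"<html>docs built from commit A</html>\n"
new_page = b"<html>docs built from commit B</html>\n"

# Any difference in content yields a different object that must be stored.
print(git_blob_id(old_page) != git_blob_id(new_page))  # prints True
# Identical content yields the same ID, so unchanged pages are not stored twice.
print(git_blob_id(old_page) == git_blob_id(bytes(old_page)))  # prints True
```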
