@aryasoni98
Contributor

Description

This PR implements automated PDF generation for OpenSearch documentation guides, addressing user requests for downloadable PDF versions of developer guides and other documentation sections.

Key Features:

  • Automated PDF Generation: Uses Puppeteer to convert HTML documentation pages to PDF format
  • Collection-Based Organization: Generates separate PDFs for major documentation sections (Developer Guide, Getting Started, API Reference, etc.)
  • CI/CD Integration: Separate GitHub Actions workflow that runs weekly and can be triggered manually, without impacting main build time
  • Optimized Implementation: Reads directly from the built _site/ directory using the file:// protocol (no server required)
  • Configurable: Easy to add/remove collections via pdf-config.json
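
As an illustration of the file:// approach described above, here is a minimal sketch of how a built page might be rendered with Puppeteer without starting a server. It is not the PR's actual generate-pdfs.js: the helper names `toFileUrl` and `renderPdf`, the example page path, and the PDF options are assumptions chosen for illustration.

```javascript
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Map a site-relative page path to a file:// URL inside the built site
// directory, so the page can be opened without a web server.
// (Hypothetical helper; not taken from the PR.)
function toFileUrl(siteDir, pagePath) {
  const absolute = path.resolve(siteDir, pagePath);
  return pathToFileURL(absolute).href;
}

// Render one built page to a PDF file. Puppeteer is required lazily so that
// toFileUrl can be used on machines without Puppeteer installed.
// (Hypothetical helper; the options below are illustrative defaults.)
async function renderPdf(siteDir, pagePath, outFile) {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so styles and images are loaded.
    await page.goto(toFileUrl(siteDir, pagePath), { waitUntil: 'networkidle0' });
    await page.pdf({ path: outFile, format: 'A4', printBackground: true });
  } finally {
    // Always release the browser, even if rendering fails.
    await browser.close();
  }
}
```

Assuming the site has already been built into _site/, a call such as `renderPdf('_site', 'developer-guide/index.html', 'developer-guide.pdf')` would write a single PDF. The page path here is an assumed example, not a path confirmed by the PR.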

Implementation Details:

  • Added generate-pdfs.js script (root level, following existing script patterns)
  • Added package.json with Puppeteer dependency
  • Added pdf-config.json configuration file
  • Created .github/workflows/generate-pdfs.yml CI workflow
  • Updated DEVELOPER_GUIDE.md with PDF generation documentation
  • Updated .gitignore to exclude generated PDFs and node_modules
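
The schema of pdf-config.json is not shown in this conversation, so the following is only a sketch of what a collection-based config and a basic validation step could look like. The field names (`outputDir`, `collections`, `name`, `title`, `startPath`), the example paths, and the `validateConfig` helper are all hypothetical.

```javascript
// Hypothetical config shape; the real pdf-config.json in the PR may differ.
const exampleConfig = {
  outputDir: 'pdfs',
  collections: [
    { name: 'developer-guide', title: 'Developer Guide', startPath: 'developer-guide/index.html' },
    { name: 'getting-started', title: 'Getting Started', startPath: 'getting-started/index.html' },
  ],
};

// Return a list of human-readable problems; an empty list means the config
// passed these basic checks. (Hypothetical helper; not taken from the PR.)
function validateConfig(config) {
  if (!config || !Array.isArray(config.collections)) {
    return ['"collections" must be an array'];
  }
  const problems = [];
  const seen = new Set();
  config.collections.forEach((entry, i) => {
    // Each collection needs a name and a starting page.
    for (const key of ['name', 'startPath']) {
      if (!entry || typeof entry[key] !== 'string' || entry[key] === '') {
        problems.push(`collections[${i}].${key} must be a non-empty string`);
      }
    }
    // Names are assumed to become output file names, so they must be unique.
    if (entry && typeof entry.name === 'string') {
      if (seen.has(entry.name)) {
        problems.push(`duplicate collection name: ${entry.name}`);
      }
      seen.add(entry.name);
    }
  });
  return problems;
}
```

With a check like this, adding or removing a collection would mean editing one array entry, and a malformed entry could be reported before any browser is launched.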

Benefits:

  1. User-Friendly Search: Users can download complete guides and search across all content using Ctrl+F in a single document
  2. AI Integration: PDFs can be uploaded to AI tools (ChatGPT, NotebookLM, etc.) for knowledge summarization and question-answering
  3. Offline Access: Complete documentation available for offline use
  4. Non-Blocking: PDF generation runs separately from main build, ensuring no impact on documentation deployment time

Issues Resolved

Closes #11192

Version

all

Frontend features

N/A - This is a backend/infrastructure change for PDF generation. No frontend UI changes are included in this PR.

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developer's Certificate of Origin.

For more information on following the Developer Certificate of Origin and signing off your commits, please check here.

@github-actions

Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

aryasoni98 and others added 21 commits November 11, 2025 17:01
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
)

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
* updating the logstash migration example

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* removing the migration from logstash page

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

---------

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…earch-project#11459)

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…earch-project#11465)

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
)

* Add Polish and Ukranian analyzer documentation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
* Add missing PPL settings

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…1404)

* Bump docs to 3.3.2 version with OS updates only

Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>

* Update 3.3.2 releaseinfo

Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>

* Update _about/version-history.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>

* Update plugin entries

Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>

---------

Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
)

* adding examples to http source of data prepper

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* Update http.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Apply suggestions from code review

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
* adding file source page

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* fixing valke errors

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* Update file.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/sources/file.md

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…11462)

* Add explain filtering functionality for ISM docs

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _im-plugin/ism/api.md

Co-authored-by: bowenlan-amzn <bowenlan23@gmail.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Make response hidden

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _im-plugin/ism/api.md

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: bowenlan-amzn <bowenlan23@gmail.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…opensearch-project#11482)

* Update documentation for arrays that semantic field cannot support it

Signed-off-by: Bo Zhang <bzhangam@amazon.com>

* Update _mappings/supported-field-types/index.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Bo Zhang <bzhangam@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
* expanding example for split string processor

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* Update split-string.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

---------

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
* expanding on routes example

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* expanding on routes example

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* Update pipelines.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

* Update pipelines.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
* Add field masking search limitation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Remove redundancy

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _security/access-control/field-masking.md

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…ct#11282)

* adding example to write json processor data prepper

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* Update write-json.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

* Update _data-prepper/pipelines/configuration/processors/write-json.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/write-json.md

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
* Correct kubectl commands

Signed-off-by: Luke Cousins <luke@cou.si>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _install-and-configure/install-opensearch/operator.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Luke Cousins <luke@cou.si>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
kolchfa-aws and others added 7 commits November 11, 2025 17:01
* Add 2.19.4 to version history

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _about/version-history.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…ct#11232)

* adding examples to key value processor data prepper

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>

* Update key-value.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

* Update key-value.md

Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Anton Rubin <anton.rubin@eliatra.com>
Signed-off-by: AntonEliatra <anton.rubin@eliatra.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
opensearch-project#11496)

* Updating the Cross Cluster Replication documentation for the index level ops batch size setting

Signed-off-by: Sagar Darji <darsaga@amazon.com>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _tuning-your-cluster/replication-plugin/settings.md

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Sagar Darji <darsaga@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: Sagar Darji <darsaga@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…opensearch-project#11476)

* Update delete_entries processor to add new features

Signed-off-by: Kennedy Onyia <kennedy.onyia@gmail.com>

* update select_entries processor documentation to account for new include_keys_regex feature

Signed-off-by: Kennedy Onyia <kennedy.onyia@gmail.com>

* fix style check errors and include additional pipeline configurations to clarify new features.

Signed-off-by: Kennedy Onyia <kennedy.onyia@gmail.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/select-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/select-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/select-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/select-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/select-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/delete-entries.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Apply suggestions from code review

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Kennedy Onyia <kennedy.onyia@gmail.com>
Signed-off-by: Kennedy Onyia <145404406+kennedy-onyia@users.noreply.github.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
…ensearch-project#11516)

Signed-off-by: Kennedy Onyia <kennedy.onyia@gmail.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
Signed-off-by: Arya Soni <aryasoni98@gmail.com>
@kolchfa-aws
Collaborator

@peterzhuamazon Could you review this PR?

@kolchfa-aws
Collaborator

@aryasoni98 Thanks for this addition! Looks like you've picked up some extraneous commits - could you rebase against main so this PR contains only the relevant commits?

@kolchfa-aws added the Tech review (PR: Tech review in progress) and backport 3.3 labels on Nov 11, 2025
@peterzhuamazon
Member

Hi, thanks for contributing,

I have a few comments at first glance:

  1. Can we put all the new PDF code in a single folder instead of mixing it in with the existing files at the repository root?
  2. It would be better if the PDFs could be generated during the website build and published to the same S3 bucket (which I can help with), so they can be accessed at https://docs.opensearch.org/<>.pdf
  3. Is there any way to use an existing gem as part of the Jekyll build to generate the PDFs, instead of adding a separate Node.js toolchain for this purpose?

Thanks.

Labels

backport 3.3 Tech review PR: Tech review in progress

Development

Successfully merging this pull request may close these issues.

[DOC] Downloadable PDF Developer Guides

8 participants