
Conversation

Member
@jiridanek jiridanek commented Nov 14, 2025

https://issues.redhat.com/browse/RHOAIENG-32541
https://issues.redhat.com/browse/RHAIENG-948

Description

How Has This Been Tested?

Self checklist (all need to be checked):

  • Ensure that you have run make test (gmake on macOS) before asking for review
  • Changes to everything except Dockerfile.konflux files should be made in odh/notebooks; they are automatically synced to rhds/notebooks. For Konflux-specific changes, modify the Dockerfile.konflux files directly in rhds/notebooks, as these require special attention in the downstream repository and flow into the upcoming RHOAI release.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that they work.

Summary by CodeRabbit

  • Chores
    • Enhanced Docker build configuration for CPU-based images to optimize dependency resolution and support for multiple processor architectures.

add `libcurl-devel` to build dependencies for `ppc64le` and `s390x` in pyarrow and onnx builds

```
  -- Building AWS C++ SDK from source
  CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
    Could NOT find CURL (missing: CURL_LIBRARY CURL_INCLUDE_DIR)
```
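The missing `CURL_LIBRARY`/`CURL_INCLUDE_DIR` means the AWS C++ SDK (pulled in by pyarrow's S3 feature) cannot find the curl development headers on the AIPCC base image. A minimal sketch of the fix, assuming the builder stage uses `dnf` on UBI9 (exact stage and placement are illustrative, not the PR's literal diff):

```
# Install curl headers so CMake's FindCURL succeeds when the
# AWS C++ SDK is built from source during the pyarrow build.
RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "s390x" ]; then \
        dnf install -y libcurl-devel && dnf clean all; \
    fi
```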
add `libcurl-devel` to build dependencies for `ppc64le` and `s390x` in pyarrow and onnx builds

```
  -- Configuring done (1.8s)
  -- Generating done (0.1s)
  CMake Error at cmake_modules/ThirdpartyToolchain.cmake:5282 (set_property):
    The link interface of target "AWS::aws-c-cal" contains:

      OpenSSL::Crypto

    but the target was not found.  Possible reasons include:

      * There is a typo in the target name.
      * A find_package call is missing for an IMPORTED target.
      * An ALIAS target is missing.
```
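The unresolved `OpenSSL::Crypto` imported target indicates that CMake's `find_package(OpenSSL)` found no OpenSSL development files to import. On UBI9 the `openssl-devel` package provides them; a hedged sketch in the same style (stage placement assumed):

```
# Provide OpenSSL headers so the imported OpenSSL::Crypto
# target referenced by AWS::aws-c-cal resolves.
RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "s390x" ]; then \
        dnf install -y openssl-devel && dnf clean all; \
    fi
```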
…heel index

```
+ pip install --no-cache-dir -r requirements-build.txt

Looking in indexes: https://console.redhat.com/api/pypi/public-rhai/rhoai/3.0/cpu-ubi9/simple/

Ignoring oldest-supported-numpy: markers 'python_version < "3.9"' don't match your environment

ERROR: Could not find a version that satisfies the requirement cython>=0.29.31 (from versions: none)

ERROR: No matching distribution found for cython>=0.29.31
```
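The `from versions: none` errors mean the AIPCC index at console.redhat.com simply does not carry these build prerequisites (cython here, numpy and protobuf in the later failures) for ppc64le/s390x. The workaround is to keep the AIPCC index primary and fall back to PyPI; an illustrative fragment (exact flag placement is an assumption):

```
# Keep the AIPCC index primary; fall back to pypi.org for
# packages it does not carry on ppc64le/s390x.
RUN pip install --no-cache-dir \
        --extra-index-url https://pypi.org/simple \
        -r requirements-build.txt
```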
…eel index

```
+ pip install --no-cache-dir -r requirements.txt
  Looking in indexes: https://console.redhat.com/api/pypi/public-rhai/rhoai/3.0/cpu-ubi9/simple/
  ERROR: Could not find a version that satisfies the requirement numpy>=1.22 (from versions: none)
  ERROR: No matching distribution found for numpy>=1.22
  subprocess exited with status 1
```
…in aipcc wheel index

```
+ pip wheel . -w /root/onnx_wheel
...
        ERROR: Could not find a version that satisfies the requirement protobuf>=4.25.1 (from versions: none)
        ERROR: No matching distribution found for protobuf>=4.25.1
        [end of output]
```
…aipcc wheel index

```
+ pip install --no-cache-dir /tmp/wheels/pyarrow-17.0.0-cp312-cp312-linux_ppc64le.whl
Looking in indexes: https://console.redhat.com/api/pypi/public-rhai/rhoai/3.0/cpu-ubi9/simple/
Processing /tmp/wheels/pyarrow-17.0.0-cp312-cp312-linux_ppc64le.whl
INFO: pip is looking at multiple versions of pyarrow to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement numpy>=1.16.6 (from pyarrow) (from versions: none)
ERROR: No matching distribution found for numpy>=1.16.6
```
…d dependencies in aipcc wheel index

```
  × Failed to download and build `cryptography==43.0.3`
  ├─▶ Failed to resolve requirements from `build-system.requires`
  ├─▶ No solution found when resolving: `maturin>=1, <2`, `cffi>=1.12
  │   ; platform_python_implementation != 'PyPy'`, `setuptools!=74.0.0,
  │   !=74.1.0, !=74.1.1, !=74.1.2, !=74.1.3, !=75.0.0, !=75.1.0, !=75.2.0`
  ╰─▶ Because only maturin==1.9.6 is available and maturin==1.9.6 has no
      wheels with a matching platform tag (e.g., `manylinux_2_34_ppc64le`), we
      can conclude that maturin==1.9.6 cannot be used.
      And because you require maturin==1.9.6, we can conclude that your
      requirements are unsatisfiable.

      hint: Wheels are available for `maturin` (v1.9.6) on the following
      platforms: `linux_aarch64`, `linux_x86_64`
```
…ld pillow

```

      line 20, in run_setup
          return super().run_setup(setup_script)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File
      "/tmp/.tmpoAXecA/builds-v0/.tmpeN0QF3/lib64/python3.12/site-packages/setuptools/build_meta.py",
      line 317, in run_setup
          exec(code, locals())
        File "<string>", line 1108, in <module>
      RequiredDependencyException:
      The headers or library files could not be found for zlib,
      a required dependency when compiling Pillow from source.
```
…el to build pillow

```
        The headers or library files could not be found for jpeg,
        a required dependency when compiling Pillow from source.

        Please see the install instructions at:

        https://pillow.readthedocs.io/en/latest/installation/basic-installation.html
```
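Pillow's setup probes for the zlib and libjpeg headers at build time and aborts with the `RequiredDependencyException` shown above when either is missing, so both `-devel` packages must be present before pip compiles it from source. A sketch of the corresponding preparation step (placement assumed):

```
# Headers required to compile Pillow from source.
RUN if [ "$(uname -m)" = "ppc64le" ]; then \
        dnf install -y zlib-devel libjpeg-turbo-devel && dnf clean all; \
    fi
```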
…build maturin

```
        Could not find openssl via pkg-config:
        pkg-config exited with status code 1
        > PKG_CONFIG_PATH=/usr/local/lib/pkgconfig/
      PKG_CONFIG_ALLOW_SYSTEM_CFLAGS=1 pkg-config --libs --cflags openssl
        The system library `openssl` required by crate `openssl-sys` was not
      found.
        The file `openssl.pc` needs to be installed and the PKG_CONFIG_PATH
      environment variable must contain its parent directory.
        PKG_CONFIG_PATH contains the following:
            - /usr/local/lib/pkgconfig/
        HINT: you may need to install a package such as openssl, openssl-dev
      or openssl-devel.
...
        Could not find directory of OpenSSL installation, and this `-sys`
      crate cannot
        proceed without this knowledge. If OpenSSL is installed and this crate
      had
        trouble finding it,  you can set the `OPENSSL_DIR` environment
      variable for the
        compilation process.
        Make sure you also have the development packages of openssl installed.
        For example, `libssl-dev` on Ubuntu or `openssl-devel` on Fedora.
...
      warning: build failed, waiting for other jobs to finish...
      💥 maturin failed
        Caused by: Failed to build a native library through cargo
        Caused by: Cargo build finished with "exit status:
```
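maturin links the Rust `openssl-sys` crate, which locates OpenSSL via pkg-config (`openssl.pc`); on UBI9 that file comes from `openssl-devel`. If OpenSSL were installed in a non-default prefix, the `OPENSSL_DIR` environment variable mentioned in the log could point the crate at it. An illustrative fragment:

```
# openssl-devel ships the openssl.pc pkg-config file that
# the openssl-sys crate needs to find the system OpenSSL.
RUN dnf install -y openssl-devel && dnf clean all
# Only needed if OpenSSL lives outside the default prefix:
# ENV OPENSSL_DIR=/usr/local/ssl
```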
…scikit-learn

```
  × Failed to download and build `scikit-learn==1.7.2`
  ├─▶ Failed to install requirements from `build-system.requires`
  ├─▶ Failed to build `scipy==1.16.3`
  ├─▶ The build backend returned an error
  ╰─▶ Call to `mesonpy.build_wheel` failed (exit status: 1)
      [stdout]
      + meson setup
...
      ../meson.build:88:0: ERROR: Unknown compiler(s): [['gfortran'],
      ['flang-new'], ['flang'], ['nvfortran'], ['pgfortran'], ['ifort'],
      ['ifx'], ['g95']]
      The following exception(s) were encountered:
      Running `gfortran --help` gave "[Errno 2] No such file or directory:
      'gfortran'"
```
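scipy's meson build requires a Fortran compiler, and the `Unknown compiler(s)` error shows the AIPCC base image ships none. On UBI9 the package is `gcc-gfortran`; a sketch (placement assumed):

```
# Fortran compiler needed to build scipy (a scikit-learn
# build dependency) from source.
RUN if [ "$(uname -m)" = "ppc64le" ]; then \
        dnf install -y gcc-gfortran && dnf clean all; \
    fi
```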
@openshift-ci openshift-ci bot requested review from atheo89 and daniellutz November 14, 2025 10:37
@github-actions github-actions bot added the review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel label Nov 14, 2025
@openshift-ci openshift-ci bot added the size/m label Nov 14, 2025
Contributor
coderabbitai bot commented Nov 14, 2025

Walkthrough

Updated Dockerfile to enhance PowerPC/s390x architecture support by adding pre-run subscription-manager setup, expanding build dependencies (libcurl-devel, openssl-devel, zlib-devel, libjpeg-turbo-devel, gcc-gfortran), configuring extra PyPI index access for wheel installation and ONNX builds, and introducing architecture-specific preparation blocks.
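Taken together, the architecture-specific preparation described above could be consolidated into a single conditional block; a sketch using the package set from the walkthrough (stage name, ARG usage, and exact conditions are assumptions, not the PR's literal diff):

```
FROM cpu-base AS builder   # illustrative stage name
ARG TARGETARCH
# Build prerequisites missing from the AIPCC base image on
# IBM Power and Z: curl/OpenSSL headers for the AWS C++ SDK,
# zlib/libjpeg for Pillow, gfortran for scipy.
RUN if [ "$TARGETARCH" = "ppc64le" ] || [ "$TARGETARCH" = "s390x" ]; then \
        dnf install -y libcurl-devel openssl-devel zlib-devel \
                       libjpeg-turbo-devel gcc-gfortran && \
        dnf clean all; \
    fi
```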

Changes

Cohort / File(s) Summary
Subscription Manager & Root Setup
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu
Added pre-run subscription-manager refresh on cpu-base stage
PowerPC/s390x Build Dependencies
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu
Expanded pyarrow-builder dependencies with libcurl-devel, openssl-devel; added architecture-specific preparation block (ppc64le) with zlib-devel, libjpeg-turbo-devel, openssl-devel, gcc-gfortran
PyPI Index & Wheel Installation
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu
Configured extra PyPI index (pypi.org/simple) for ONNX-related builds, wheel installation steps, and pylock.toml-based installations on Power architectures
ONNX/OpenBLAS Architecture Support
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu
Extended ONNX/OpenBLAS installation paths for Power architectures to utilize extra PyPI index for prebuilt wheels

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Specific areas requiring attention:
    • Verify subscription-manager command availability and error handling on all base images
    • Validate that additional build dependencies (zlib-devel, libjpeg-turbo-devel, gcc-gfortran) don't introduce unexpected conflicts or bloat
    • Confirm the extra PyPI index (pypi.org/simple) is reliable and contains all required packages for ppc64le/s390x architectures
    • Review the architecture-specific conditional logic to ensure correct stage application across cpu-base multi-stage build
    • Check that wheel installation commands with extra index work correctly when prebuilt wheels are unavailable or network access is restricted

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Description check: ⚠️ Warning. The PR description is incomplete and lacks essential information required by the template. Resolution: complete the 'How Has This Been Tested?' section with detailed testing steps, environment details, and test results, and ensure all merge criteria checkboxes are verified and checked appropriately.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The title clearly and specifically describes the main change: fixing build failures for jupyter datascience on IBM/Power architectures due to missing curl and openssl-devel packages and AIPCC wheel prerequisites.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment


@openshift-ci openshift-ci bot added size/m and removed size/m labels Nov 14, 2025
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (1)

189-190: Consistent use of extra PyPI index is pragmatic but warrants a comment on sustainability.

The --extra-index-url https://pypi.org/simple pattern appears consistently across Power-specific builds to compensate for gaps in the AIPCC index (missing cython, numpy, protobuf, etc.). While this is an appropriate workaround, consider adding a top-level comment in the file (near the first occurrence or in the file header) documenting that this is a temporary mitigation for AIPCC index limitations and should be revisited if/when the AIPCC index coverage improves.

Also applies to: 241-246, 347-348, 361-363, 428-428

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9c915d and 75199d2.

📒 Files selected for processing (1)
  • jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (8 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/ppc64le, false) / build
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request
🔇 Additional comments (4)
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (4)

43-49: Subscription-manager pre-flight setup is a solid defensive addition.

The pre-run check gracefully handles both registered and unregistered systems, ensuring repository refresh when available without failing the build. This is appropriate for systems managed by Red Hat's subscription service.


154-160: Build dependencies are well-organized and properly justified.

The array syntax improves maintainability, and the inline comments clearly explain why libcurl-devel and openssl-devel are required (AWS C++ SDK pulled in by pyarrow's S3 flag). This is a good fix for the AIPCC base image which lacks these packages.


416-429: Power-specific pip install flags and environment variables are properly configured.

The use of GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=1, strict flags (--strict --verify-hashes --compile-bytecode), and the extra PyPI index for Power architectures demonstrates careful handling of build-time environment variability. The index-strategy and conditional branching are well-structured.


389-407: Verification was inconclusive. The script output confirms that (1) zlib and libjpeg are required by default when building Pillow from source, (2) lines 152, 346, and 421 are shared installation blocks covering both ppc64le and s390x, and (3) the new block (lines 389-407) installs zlib-devel, libjpeg-turbo-devel, gcc-gfortran, and openssl-devel for ppc64le only. Whether s390x already installs these packages at lines 78-79 could not be confirmed from the available output.

Structural inconsistency: the new ppc64le-only package installation block (lines 389–407) for pillow and scikit-learn compilation conflicts with the pattern elsewhere in the Dockerfile, where s390x is included alongside ppc64le in shared installation blocks (lines 152, 346, 421).

Confirm whether s390x already installs zlib-devel, libjpeg-turbo-devel, and gcc-gfortran at lines 78–79, or whether these packages should also be added to an s390x-specific block to match the ppc64le treatment in lines 389–407.

@openshift-ci

openshift-ci bot commented Nov 14, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ide-developer
Once this PR has been reviewed and has the lgtm label, please assign caponetto for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member Author

@coderabbitai create github issue to capture

jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (1)
189-190: Consistent use of extra PyPI index is pragmatic but warrants a comment on sustainability.

The --extra-index-url https://pypi.org/simple pattern appears consistently across Power-specific builds to compensate for gaps in the AIPCC index (missing cython, numpy, protobuf, etc.). While this is an appropriate workaround, consider adding a top-level comment in the file (near the first occurrence or in the file header) documenting that this is a temporary mitigation for AIPCC index limitations and should be revisited if/when the AIPCC index coverage improves.

Also applies to: 241-246, 347-348, 361-363, 428-428

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a GitHub issue to capture the review comment about documenting the --extra-index-url workaround

gh issue create \
  --repo opendatahub-io/notebooks \
  --title "Document temporary AIPCC index workaround in Dockerfile" \
  --body "## Context

This issue tracks the suggestion from PR #2678 to add documentation about the use of \`--extra-index-url https://pypi.org/simple\` as a temporary workaround for gaps in the AIPCC index.

## Description

The \`--extra-index-url https://pypi.org/simple\` pattern appears consistently across Power-specific builds (ppc64le/s390x) in \`jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu\` to compensate for missing packages in the AIPCC index (e.g., cython, numpy>=1.22, protobuf>=4.25.1, cryptography dependencies).

## Requested Action

Add a top-level comment in the Dockerfile (near the first occurrence or in the file header) documenting that:
- This is a temporary mitigation for AIPCC index limitations
- It should be revisited if/when the AIPCC index coverage improves
- List the known missing packages that necessitate this workaround

## Affected Lines

The pattern appears at multiple locations:
- Lines 189-190 (pyarrow builder)
- Lines 241-246 (onnx builder)
- Lines 347-348 (pyarrow wheel installation)
- Lines 361-363 (onnx wheel installation)
- Line 428 (pylock.toml installation)

## References

- PR: https://github.com/opendatahub-io/notebooks/pull/2678
- Review comment: https://github.com/opendatahub-io/notebooks/pull/2678#discussion_r2527010335
- Requested by: @jiridanek" \
  --assignee jiridanek

Length of output: 55


@jiridanek I've created GitHub issue #2679 to capture this documentation improvement: #2679

The issue documents the need to add a top-level comment in the Dockerfile explaining that the --extra-index-url https://pypi.org/simple pattern is a temporary workaround for AIPCC index limitations, along with references to the affected lines and known missing packages (cython, numpy>=1.22, protobuf>=4.25.1, cryptography dependencies).

@jiridanek jiridanek added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 14, 2025
@openshift-ci

openshift-ci bot commented Nov 14, 2025

@jiridanek: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

  • ci/prow/notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror (commit 75199d2, required): /test notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror
  • ci/prow/images (commit 75199d2, required): /test images
  • ci/prow/notebooks-py312-ubi9-e2e-tests (commit 75199d2, required): /test notebooks-py312-ubi9-e2e-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jiridanek jiridanek merged commit 72a592a into opendatahub-io:main Nov 14, 2025
15 of 20 checks passed
@jiridanek jiridanek deleted the jd_curl_devel_odh branch November 14, 2025 11:46
@openshift-ci openshift-ci bot added size/m and removed size/m labels Nov 14, 2025

Labels

  • lgtm
  • review-requested (GitHub Bot creates notification on #pr-review-ai-ide-team slack channel)
  • size/m
  • tide/merge-method-squash (Denotes a PR that should be squashed by tide when it merges.)
