@emekanwaoma emekanwaoma commented Oct 21, 2025

User description

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.

Added docs pages

Please also include the path for the added docs

  • Quickstart (/)
  • Blueprint (/platform-overview/port-components/blueprint)
  • ...

Updated docs pages

Please also include the path for the updated docs

  • Quickstart (/)
  • Blueprint (/platform-overview/port-components/blueprint)
  • ...

PR Type

Documentation


Description

  • Update import paths in examples file to use ./examples/ subdirectory

  • Add clarification on classic PAT organization scoping behavior

  • Document entity deletion threshold for cleanup after restricting organizations

  • Add detailed guidance for first-time installs with multiple organizations


Diagram Walkthrough

flowchart LR
  A["GitHub Ocean Docs"] --> B["Import Path Updates"]
  A --> C["Organization Scoping Clarification"]
  A --> D["Entity Deletion Threshold Guide"]
  C --> E["First-time Install Behavior"]
  D --> F["Cleanup Instructions"]

File Walkthrough

Relevant files
Documentation
examples.md
Reorganize import paths to examples subdirectory                 

docs/build-your-software-catalog/sync-data-to-catalog/git/github-ocean/examples.md

  • Updated all import paths to use ./examples/ subdirectory prefix
  • Reorganized import statements to reflect new directory structure
  • Maintains all existing example blueprints and configurations
+50/-50 
github-ocean.md
Clarify organization scoping and first-time install behavior

docs/build-your-software-catalog/sync-data-to-catalog/git/github-ocean/github-ocean.md

  • Added clarification that classic PAT without organizations specified
    syncs all scoped organizations
  • Added note about first-time installs potentially syncing more than
    intended
  • Added reference to installation guide for cleanup procedures
  • Emphasized performance considerations for multi-organization syncing
+3/-0     
installation.mdx
Add entity deletion threshold guidance for organization cleanup

docs/build-your-software-catalog/sync-data-to-catalog/git/github-ocean/installation/installation.mdx

  • Added new section "Cleanup after restricting organizations" with
    detailed instructions
  • Documented entityDeletionThreshold: 1 configuration for entity cleanup
  • Provided YAML example showing how to configure threshold temporarily
  • Clarified classic PAT behavior when organizations are not specified
  • Added collapsible details section with complete configuration example
+29/-1   


qodo-merge-pro bot commented Oct 21, 2025

PR Compliance Guide 🔍

(Compliance updated until commit 55bb1e4)

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified: no security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed


Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Documentation Only: The PR adds documentation content and import path changes without introducing executable
code that could or should implement audit logging, so compliance cannot be assessed from
the diff alone.

Referred Code
- **With classic PAT**:
  - Specify organizations in port mapping: `organizations: ["org1", "org2", "org3"]`
  - If `organizations` are not specified, the integration will sync all organizations the classic PAT is scoped to.
- **With GitHub App or Fine-grained PAT**: Specify exactly one organization by setting the `githubOrganization` in the environment variables: `githubOrganization: "my-org"`

**Precedence:** If `githubOrganization` is set in the environment variables or config and `organizations` are also listed in port mapping, the integration prioritizes single‑organization behavior and syncs only the `githubOrganization`.

**Performance consideration:** Syncing multiple organizations will increase the number of API calls to GitHub and may slow down the integration. The more organizations you sync, the longer the resync time and the higher the API rate limit consumption. Consider syncing only the organizations you need.

**Default mapping behavior:** First‑time installs may sync more than intended, since organizations aren’t scoped yet. Refer to [installation guide](./installation) on how to ensure a clean catalogue after you scope out required organization.
:::
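
The scoping and precedence rules quoted above can be read as a small decision function. The following is a minimal sketch, not the integration's actual code: `resolve_organizations` and its parameter names are hypothetical, and the list of organizations a classic PAT is scoped to is passed in directly rather than fetched from GitHub.

```python
from typing import List, Optional, Sequence


def resolve_organizations(
    github_organization: Optional[str],
    mapping_organizations: Optional[Sequence[str]],
    pat_scoped_organizations: Sequence[str],
) -> List[str]:
    """Return the organizations a resync would cover under the quoted rules."""
    # Precedence: a configured githubOrganization forces single-organization behavior.
    if github_organization:
        return [github_organization]
    # Classic PAT with an explicit `organizations` list in the port mapping.
    if mapping_organizations:
        return list(mapping_organizations)
    # Classic PAT with no list: every organization the token is scoped to.
    return list(pat_scoped_organizations)


scoped = ["org1", "org2", "org3"]
print(resolve_organizations(None, None, scoped))          # first-time install: all scoped orgs
print(resolve_organizations(None, ["org1"], scoped))      # restricted by the mapping
print(resolve_organizations("my-org", ["org1"], scoped))  # githubOrganization wins
```

The first call illustrates why a first-time install may sync more than intended: with nothing scoped yet, the fallback branch returns everything the token can see.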


Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status:
No Runtime Code: The changes are MDX import path adjustments and documentation text; there are no new
identifiers or executable code to evaluate for naming quality.

Referred Code
import RepositoryBlueprint from './examples/\_github_exporter_example_repository_blueprint.mdx'
import PRBlueprint from './examples/\_github_exporter_example_pull_request_blueprint.mdx'
import PortAppConfig from './examples/\_github_exporter_example_port_app_config.mdx'
import GitHubResources from './\_github_exporter_supported_resources.mdx'

import UsersBlueprint from './examples/example-repository-admins/\_github_exporter_example_users_blueprint.mdx'
import GithubUsersBlueprint from './examples/example-repository-admins/\_github_exporter_example_github_users_blueprint.mdx'
import RepositoryAdminBlueprint from './examples/example-repository-admins/\_github_export_example_repository_with_admins_relation_blueprint.mdx'
import RepositoryAdminAppConfig from './examples/example-repository-admins/\_github_exporter_example_admins_users_port_app_config.mdx'

import IssueBlueprint from './examples/example-issue/\_git_exporter_example_issue_blueprint.mdx'
import PortIssueAppConfig from './examples/example-issue/\_github_exporter_example_issue_port_app_config.mdx'
import RepoEnvironmentBlueprint from './examples/example-deployments-environments/\_github_exporter_example_environment_blueprint.mdx'
import DeploymentBlueprint from './examples/example-deployments-environments/\_github_exporter_example_deployment_blueprint.mdx'
import PortRepoDeploymentAndEnvironmentAppConfig from './examples/example-deployments-environments/\_github_exporter_example_deployments_and_environments_port_app_config.mdx'

import TagBlueprint from './examples/example-repository-release-tag/\_github_exporter_example_tag_blueprint.mdx'
import ReleaseBlueprint from './examples/example-repository-release-tag/\_github_exporter_example_release_blueprint.mdx'
import RepositoryTagReleaseAppConfig from './examples/example-repository-release-tag/\_github_exporter_example_release_tag_port_app_config.mdx'

import PackageBlueprint from './examples/example-file-kind/\_example_package_blueprint.mdx'


 ... (clipped 30 lines)


Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Docs Only Change: The diff introduces documentation and example YAML snippets without executable logic, so
error handling and edge case management cannot be evaluated from the provided changes.

Referred Code
Starting from **version 3.0.0-beta**, the GitHub integration supports syncing data from multiple GitHub organizations.

:::info Multi-organization configuration
- GitHub App and fine-grained PAT: use `githubOrganization` (single organization).
- Classic PAT:
  - To sync multiple organizations, list them in your port mapping under `organizations`.

    ```yaml showLineNumbers
    deleteDependentEntities: true
    createMissingRelatedEntities: true
    enableMergeEntity: true
    organizations:
      - org1
      - org2
    # ... rest of your mapping (repositoryType, resources, etc.) ...
    ```

  - If `organizations` are not specified, the integration will sync all organizations the classic PAT is scoped to.
- Precedence: if `githubOrganization` is set in the environment variables or config and `organizations` are listed in port mapping, the integration syncs only the `githubOrganization` (single‑org behavior).
- Performance: syncing multiple organizations increases API calls and may slow down the integration.
:::


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Config Guidance Only: The PR adds configuration guidance and notes about organization scoping without changing
input handling code, so validation and secure data handling cannot be confirmed from this
diff.

Referred Code
Starting from **version 3.0.0-beta**, the GitHub integration supports syncing data from multiple GitHub organizations.

:::info Multi-organization configuration
- GitHub App and fine-grained PAT: use `githubOrganization` (single organization).
- Classic PAT:
  - To sync multiple organizations, list them in your port mapping under `organizations`.

    ```yaml showLineNumbers
    deleteDependentEntities: true
    createMissingRelatedEntities: true
    enableMergeEntity: true
    organizations:
      - org1
      - org2
    # ... rest of your mapping (repositoryType, resources, etc.) ...
    ```

  - If `organizations` are not specified, the integration will sync all organizations the classic PAT is scoped to.
- Precedence: if `githubOrganization` is set in the environment variables or config and `organizations` are listed in port mapping, the integration syncs only the `githubOrganization` (single‑org behavior).
- Performance: syncing multiple organizations increases API calls and may slow down the integration.
:::


Compliance status legend
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Previous compliance checks

Compliance check up to commit a6b14f7
Security Compliance
🟢
No security concerns identified: no security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
No custom compliance provided

Follow the guide to enable custom compliance check.


qodo-merge-pro bot commented Oct 21, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
High-level
Improve product for safer initial sync

The suggestion highlights that the documented manual cleanup process, which
requires disabling the entityDeletionThreshold safety feature, is risky and
indicates a product gap. It recommends enhancing the integration to provide a
safer, automated way to manage initial syncs and cleanups.

Examples:

docs/build-your-software-catalog/sync-data-to-catalog/git/github-ocean/installation/installation.mdx [42-66]
#### Cleanup after restricting organizations

If you initially synced multiple organizations (e.g., on a first‑time install before your mapping was ready) and later restrict organizations in your mapping, set `entityDeletionThreshold: 1` temporarily and resync to ensure previously ingested, now‑unwanted entities are deleted. Revert the threshold afterwards (default ≈ 0.9) or remove completely.

<details>
<summary><b>Entity deletion threshold (click to expand)</b></summary>

```yaml showLineNumbers
entityDeletionThreshold: 1
```

 ... (clipped 15 lines)

Solution Walkthrough:

Before:

# docs/.../installation.mdx (User instructions)

#### Cleanup after restricting organizations

If you synced too many organizations, you must:
1. Restrict organizations in your mapping.
2. Temporarily set `entityDeletionThreshold: 1`.
3. Resync to delete unwanted entities.
4. Revert the `entityDeletionThreshold` back to its default.

Example config:
```yaml
entityDeletionThreshold: 1
resources:
  ...
```

After:

```python
# Ideal integration logic (conceptual)

class GithubIntegration:
    def on_config_change(old_config, new_config):
        removed_orgs = old_config.orgs - new_config.orgs
        # Automatically and safely clean up entities
        # from the removed organizations without requiring
        # manual threshold changes.
        if is_initial_setup_cleanup(removed_orgs):
            cleanup_entities_for_orgs(removed_orgs)

    def resync():
        # ... regular sync logic ...
        # The safety threshold is always active for normal operations.
        if count_deletions() / total_entities > entityDeletionThreshold:
            raise Error("Too many entities to delete, aborting.")
        ...
```

Suggestion importance[1-10]: 9


Why: The suggestion correctly identifies that the PR documents a risky manual workaround for a product limitation, and proposes a more robust, user-friendly product enhancement instead of relying on users to temporarily disable a safety feature.
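
To make the threshold arithmetic concrete, here is a minimal sketch of the check implied by the conceptual walkthrough above. It assumes the threshold is compared against the share of catalog entities a resync would delete, as in that pseudocode; `deletion_allowed` is a hypothetical helper, not part of the integration, and the 0.9 default is the approximate value quoted in the PR.

```python
def deletion_allowed(to_delete: int, total: int, threshold: float = 0.9) -> bool:
    """Allow deletion unless the deleted share of entities exceeds the threshold."""
    if total == 0:
        return True  # empty catalog, nothing to protect
    return to_delete / total <= threshold


# Restricting a broad first-time sync to one organization could remove
# most of the catalog, e.g. 950 of 1000 entities (a share of 0.95).
print(deletion_allowed(950, 1000))     # False: 0.95 exceeds the 0.9 default
print(deletion_allowed(950, 1000, 1))  # True: a threshold of 1 never blocks
```

Under this reading, a threshold of 1 disables the safety check entirely, which is why the documentation asks for it to be reverted after the cleanup resync.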

High
Possible issue
Clarify YAML example to prevent data loss
Suggestion Impact:The commit removed the previous standalone "Entity deletion threshold" YAML example that could be misread as a full config, aligning with the suggestion’s intent to prevent misleading replacement of users' configs.

code diff:

-#### Cleanup after restricting organizations
-
-If you initially synced multiple organizations (e.g., on a first‑time install before your mapping was ready) and later restrict organizations in your mapping, set `entityDeletionThreshold: 1` temporarily and resync to ensure previously ingested, now‑unwanted entities are deleted. Revert the threshold afterwards (default ≈ 0.9) or remove completely.
-
-<details>
-<summary><b>Entity deletion threshold (click to expand)</b></summary>
-
-```yaml showLineNumbers
-
-entityDeletionThreshold: 1
-resources:
-  - kind: organization
-    selector:
-      query: 'true'
-    port:
-      entity:
-        mappings:
-          identifier: .login
-          title: .login
-          blueprint: '''githubOrganization'''
-          properties:
-            login: .login
-            id: .id
-```
-</details>

Update the YAML example for entityDeletionThreshold to clarify that it should be
added to the user's existing configuration, not replace it, to prevent
accidental removal of other resource mappings.

docs/build-your-software-catalog/sync-data-to-catalog/git/github-ocean/installation/installation.mdx [46-66]

 <details>
-<summary><b>Entity deletion threshold (click to expand)</b></summary>
+<summary><b>Example: Adding the entity deletion threshold to your configuration (click to expand)</b></summary>
 
 ```yaml showLineNumbers
+# Add this line at the top of your existing configuration file
+entityDeletionThreshold: 1
 
-entityDeletionThreshold: 1
 resources:
-  - kind: organization
+  # ... your existing resource mappings remain here ...
+  - kind: repository
     selector:
-      query: 'true'
+      query: "true"
     port:
       entity:
         mappings:
-          identifier: .login
-          title: .login
-          blueprint: '''githubOrganization'''
-          properties:
-            login: .login
-            id: .id
+          # ...
+  - kind: pull-request
+    selector:
+      query: "true"
+    port:
+      entity:
+        mappings:
+          # ...
+  # ... etc.
```

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies that the documentation example is misleading and could cause users to misconfigure the integration by replacing their entire resource mapping, which would lead to an incomplete sync.
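
The point of this suggestion, adding the key to an existing configuration rather than replacing the configuration, can be sketched with plain dictionaries. This is an illustration under assumptions: `with_deletion_threshold` is a hypothetical helper, and the mapping is represented as an already parsed dictionary rather than read from a YAML file.

```python
import copy
from typing import Any, Dict, Optional


def with_deletion_threshold(
    config: Dict[str, Any], threshold: Optional[float]
) -> Dict[str, Any]:
    """Return a copy of config with entityDeletionThreshold set, or removed when None."""
    updated = copy.deepcopy(config)  # leave the caller's mapping untouched
    if threshold is None:
        updated.pop("entityDeletionThreshold", None)
    else:
        updated["entityDeletionThreshold"] = threshold
    return updated


existing = {
    "organizations": ["org1"],
    "resources": [{"kind": "repository"}, {"kind": "pull-request"}],
}
cleanup = with_deletion_threshold(existing, 1)     # temporary config for the cleanup resync
restored = with_deletion_threshold(cleanup, None)  # revert once the resync has finished
print(cleanup["resources"] == existing["resources"], restored == existing)
```

Only the one top-level key changes between the two configurations; every resource mapping is carried over unchanged.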

Medium

@aws-amplify-eu-west-1

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-2932.d2ngvl90zqbob8.amplifyapp.com


@mk-armah mk-armah left a comment


LGTM

@emekanwaoma emekanwaoma changed the title from "Add Documentation Note for Deleting Unwanted Entities After First Time Installs of GitHub Ocean Multi Orgs" to "Add Documentation Note on Ingesting GitHub Ocean Multi Orgs for First Time Installations" on Nov 10, 2025
@hadar-co hadar-co merged commit eb988bf into main Nov 10, 2025
5 checks passed
@hadar-co hadar-co deleted the PORT-16636 branch November 10, 2025 10:50
4 participants