diff --git a/docs/DeploymentGuideDatabricks.md b/docs/DeploymentGuideDatabricks.md index f3cd6b1..70fa7c5 100644 --- a/docs/DeploymentGuideDatabricks.md +++ b/docs/DeploymentGuideDatabricks.md @@ -214,11 +214,15 @@ Check the terminal for success or error messages. See Troubleshooting if issues After Databricks deployment, you can mirror the Unity Catalog in Microsoft Fabric: 1. Open the Fabric / Power BI portal (https://powerbi.com) and sign in to the target Fabric workspace. -2. Click **+ New** → **Mirrored Azure Databricks catalog**. +2. Create a new folder named **databricks** in your workspace: + - Click **+ New** → **Folder**. + - Name the folder `databricks`. + - Navigate into the newly created folder. +3. Click **+ New** → **Mirrored Azure Databricks catalog**. ![mirror Catalog](./images/deployment/mirrorcatlogimage.png) -3. Authenticate using the previously created connection, then select the catalog, schemas, and tables you want to mirror. +4. Authenticate using the previously created connection, then select the catalog, schemas, and tables you want to mirror. ![mirrorCatalog Connection setup](./images/deployment/mirrorcatlog-connectionsetup.png) - 4. Review and create the mirrored item. Monitor sync status. +5. Review and create the mirrored item. Monitor sync status. --- diff --git a/docs/DeploymentGuideFabric.md b/docs/DeploymentGuideFabric.md index 2a8fcf4..e458cb8 100644 --- a/docs/DeploymentGuideFabric.md +++ b/docs/DeploymentGuideFabric.md @@ -2,9 +2,199 @@ Deploy the **Unified Data Foundation with Fabric** solution accelerator using Azure Developer CLI - get a complete data platform with medallion architecture in minutes. -## 🚀 Quick Start +--- + +## Key Sections + +| Section | Description | +|---------|-------------| +| [Prerequisites](#1-prerequisites) | Required permissions, tools, and setup | +| [Deployment Overview](#2-deployment-overview) | Overview of deployed resources and architecture | +| [Deployment Options](#3-deployment-options) | Local, cloud, and CI/CD deployment methods | +| [Deployment Commands](#4-deployment-commands) | One-command deployment instructions | +| [Deployment Results](#5-deployment-results) | Expected outcomes and verification steps | +| [Advanced Configuration Options](#6-advanced-configuration-options) | Optional customization parameters | +| [Known Limitations](#7-known-limitations) | Important constraints to review | +| [Environment Cleanup](#8-environment-cleanup) | How to remove deployed resources | +| [Additional Resources](#9-additional-resources) | Support and further reading | + +### Alternative Deployment Methods + +This guide focuses on automated deployment using Azure Developer CLI. For +manual deployment or existing Fabric capacity integration, refer to the +[Manual Deployment Guide](./DeploymentGuideFabricManual.md). + +--- + +## 1. Prerequisites + +To deploy this solution, ensure you have the following tools and permissions. + +### Software Requirements + +You need these tools installed to run the deployment commands. 
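+
+If you are starting from a clean machine, the commands below are one way to install them (a sketch assuming a Debian/Ubuntu-style shell; on other platforms, use the download links in the table that follows):
+
+```bash
+# Azure Developer CLI (azd)
+curl -fsSL https://aka.ms/install-azd.sh | bash
+
+# Azure CLI (Debian/Ubuntu one-line installer)
+curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
+
+# Python 3 (3.9+ is required; verify what your distribution installs)
+sudo apt-get update && sudo apt-get install -y python3 python3-venv
+python3 --version
+```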
+ +| Tool | Version | Purpose | Download | +|------|---------|---------|----------| +| **Azure Developer CLI** | Latest | Orchestrates deployment | [Install azd](https://learn.microsoft.com/azure/developer/azure-developer-cli/install-azd) | +| **Azure CLI** | Latest | Authentication | [Install az](https://learn.microsoft.com/cli/azure/install-azure-cli) | +| **Python** | 3.9+ | Fabric configuration scripts | [Install Python](https://www.python.org/downloads/) | + +> **💡 Tip**: You can skip installing tools by using [Azure Cloud Shell](https://shell.azure.com) or GitHub Codespaces. + +### Permissions + +Your deployment identity (User or Service Principal) requires the following permissions. + +#### 🔐 Azure Permissions + +- **Resource Group Access**: Ensure your deployment identity has permissions on target Resource Group to deploy Bicep templates and create Azure resources using appropriate [Azure RBAC built-in roles](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles) (e.g. has [Contributor](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#contributor) or [Owner](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#owner)) or appropriate [Azure RBAC custom role](https://learn.microsoft.com/azure/role-based-access-control/custom-roles) with necessary permissions +- **`Microsoft.Fabric` Resource Provider Access**: Verify your Azure Subscription has [Microsoft.Fabric resource provider](https://learn.microsoft.com/azure/azure-resource-manager/management/azure-services-resource-providers) enabled and your deployment identity has permissions on Resource Group to create [Microsoft Fabric capacity resource](https://learn.microsoft.com/azure/templates/microsoft.fabric/capacities?pivots=deployment-language-bicep) + +#### 🔗 API Permissions + +- **Microsoft Graph API - `User.Read`**: Delegated permission to read signed-in user profile information using [Microsoft Graph User permissions](https://learn.microsoft.com/graph/permissions-reference#user-permissions) +- **Microsoft Graph API - `openid`**: Delegated permission for sign in and user profile authentication using [OpenID Connect scopes](https://learn.microsoft.com/entra/identity-platform/scopes-oidc) +- **Fabric REST API - Workspace Management**: Access to create and manage Fabric workspaces for workspace structure deployment using [Fabric workspace APIs](https://learn.microsoft.com/rest/api/fabric/core/workspaces) +- **Fabric REST API - Item Creation**: Access to create lakehouses, notebooks, and reports for Fabric content deployment using [Fabric item APIs](https://learn.microsoft.com/rest/api/fabric/core/items) +- **Fabric REST API - Content Upload**: Access to upload files and manage workspace content for sample data and notebook deployment using [Fabric REST API scopes](https://learn.microsoft.com/rest/api/fabric/articles/scopes) +- **Power BI API - `Tenant.Read.All`**: Delegated permission to read organization's Power BI tenant information using [Power BI REST API permissions](https://learn.microsoft.com/rest/api/power-bi/#scopes) + +#### ✅ Quick Check + +Run this command to verify your tools are ready: + +```bash +# Check Azure CLI +az --version +az account show + +# Check Azure Developer CLI +azd version + +# Check Python +python --version +``` + +## 2. Deployment Overview + +This solution accelerator uses a **two-phase deployment approach** to provision a complete data platform. The process is fully automated, idempotent, and safe to re-run. 
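+
+Because Phase 1 provisions a Microsoft Fabric capacity, the `Microsoft.Fabric` resource provider must be registered in your subscription (see [Prerequisites](#1-prerequisites)). An optional pre-flight check with Azure CLI, assuming you are signed in to the target subscription:
+
+```bash
+# Check whether the Microsoft.Fabric resource provider is registered
+az provider show --namespace Microsoft.Fabric --query registrationState -o tsv
+
+# Register it if the command above does not print "Registered"
+az provider register --namespace Microsoft.Fabric --wait
+```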
+ +### 1️⃣ Phase 1: Infrastructure (Azure) + +*Powered by Bicep & Azure Resource Manager* +This phase creates the physical resources in your Azure subscription. + +- **Resource Group**: A container for your resources. +- **Fabric Capacity**: The compute engine (F SKU) that powers your data workloads. +- **Managed Identity**: The identity used for secure automation. + +### 2️⃣ Phase 2: Data Platform (Fabric) + +*Powered by Python & Fabric REST APIs* +This phase configures the logical architecture inside Microsoft Fabric. + +- **Workspace**: Creates or configures the workspace on your Capacity. +- **Lakehouses**: Deploys the Medallion Architecture (`Bronze` → `Silver` → `Gold`). +- **Notebooks**: Uploads 40+ notebooks for data processing and orchestration. +- **Sample Data**: Ingests sample datasets (Finance, Sales) into the Bronze layer. +- **Power BI**: Deploys pre-built reports and dashboards. + +### 🔄 Idempotency & Re-runs + +The deployment is designed to be **safe to re-run**. If you run `azd up` again: + +- **Infrastructure**: Only updates settings if they have changed (e.g., resizing Capacity). +- **Workspace**: Detects existing workspace and skips creation. +- **Content**: + - *Notebooks/Reports*: Updated to the latest version (overwrites changes). + - *Data*: Preserved (sample data is re-uploaded if missing). + - *Admins*: New admins are added; existing ones remain. + +The deployment orchestration coordinates both phases, passing deployment parameters and ensuring proper sequencing. See [deployment options](#3-deployment-options) for different ways to run this deployment based on your preferred environment. + +--- + +## 3. Deployment Options + +Choose your deployment environment based on your workflow and requirements. All options use the same [Deployment commands](#4-deployment-commands) with environment-specific setup. + +| Environment | Best For | Setup Required | Notes | +|-------------|----------|----------------|-------| +| **[Local Machine](#1-local-machine)** | Full development control | Install [software requirements](#software-requirements) | Most flexible, requires local setup | +| **[Azure Cloud Shell](#2-azure-cloud-shell)** | Zero setup | Just a web browser | Pre-configured tools, session timeouts | +| **[GitHub Codespaces](#3-github-codespaces)** | Team consistency | GitHub account | Cloud development environment | +| **[Dev Container](#4-vs-code-dev-container)** | Standardized tooling | Docker Desktop + VS Code | Containerized consistency | +| **[GitHub Actions](#5-github-actions-cicd)** | Automated CI/CD | Service principal setup | Production deployments | + +### 1. Local Machine + +Deploy with full control over your development environment. + +**Setup requirements**: Install the [software requirements](#software-requirements) + +**Deployment**: Use the standard [Deployment commands](#4-deployment-commands) + +### 2. Azure Cloud Shell + +Deploy from Azure's browser-based terminal with zero local installation. + +**Setup**: Open [Azure Cloud Shell](https://shell.azure.com) and install Azure Developer CLI: + +```bash + +curl -fsSL https://aka.ms/install-azd.sh | bash && exec bash +``` + +**Deployment**: Run the [Deployment commands](#4-deployment-commands) (Azure CLI pre-authenticated) + +### 3. GitHub Codespaces + +Deploy from a cloud development environment with pre-configured tools. + +**Setup**: + +1. Go to the [repository](https://github.com/microsoft/unified-data-foundation-with-fabric-solution-accelerator) +2. 
Click **Code** → **Codespaces** → **Create codespace** + +**Deployment**: Install azd and run [Deployment commands](#4-deployment-commands) with device authentication: + +```bash +# Install azd if needed +curl -fsSL https://aka.ms/install-azd.sh | bash && exec bash + +# Use device code authentication +az login --use-device-code +azd auth login --use-device-code + +# Continue with deployment commands +``` + +### 4. VS Code Dev Container + +Deploy from a containerized environment for team consistency. + +**Setup**: + +1. Install [Docker Desktop](https://www.docker.com/products/docker-desktop) and [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) +2. Clone repository and open in VS Code +3. Reopen in container when prompted + +**Deployment**: All tools pre-installed - run [Deployment commands](#4-deployment-commands) directly + +### 5. GitHub Actions (CI/CD) + +Automated deployment using the included [workflow](../.github/workflows/azure-dev.yml). -**One-command deployment** - Deploy everything with Azure Developer CLI ([prerequisites required](#prerequisites)): +**Setup**: Configure [repository variables](https://docs.github.com/en/actions/learn-github-actions/variables) and set up [service principal with federated credentials](https://learn.microsoft.com/azure/developer/github/connect-from-azure) + +**Triggers**: Push to main branch or manual workflow dispatch + +--- + +## 4. Deployment Commands + +**One-command deployment** - Deploy everything with Azure Developer CLI ([prerequisites required](#1-prerequisites)): ```bash # Clone and navigate to repository @@ -23,81 +213,29 @@ azd up ``` During deployment, you'll specify: + - **Environment name** (e.g., "udfwf-dev"). This will be used to build the name of the deployed Azure resources. - **Azure subscription**. - **Azure resource group**. -**What you get**: Complete medallion architecture with Fabric capacity, lakhouses (Bronze/Silver/Gold), notebooks, sample data, and Power BI reports. +**What you get**: Complete medallion architecture with Fabric capacity, lakehouses (Bronze/Silver/Gold), notebooks, sample data, and Power BI reports. > **💡 Alternative Deployment Option** -> > This guide uses Azure Developer CLI for automated deployment. If you prefer more granular control or have an existing Fabric capacity, see the [Manual Deployment Guide](./DeploymentGuideFabricManual.md). ### Next Steps -- **First deployment**: Follow the commands above - they work in [multiple environments](#deployment-options) -- **Need different setup**: See [deployment environment options](#deployment-options) (Cloud Shell, Codespaces, etc.) -- **Understand the process**: Review [deployment overview](#deployment-overview) for technical details -- **See what's created**: Check [deployment results](#deployment-results) for detailed component overview with screenshots -- **Want to customize**: Explore [configuration options](#advanced-configuration-options) for naming, capacity sizing, and admin setup -- **Limitations**: Review [known limitations](#known-limitations) for common issues and workarounds -- **Remove environment**: Use [environment cleanup](#environment-cleanup) to completely remove your deployment - ---- - -## Prerequisites -Before starting, ensure your deployment identity has the following requirements. 
+- **First deployment**: Follow the commands above - they work in [multiple environments](#3-deployment-options) +- **Need different setup**: See [deployment environment options](#3-deployment-options) (Cloud Shell, Codespaces, etc.) +- **Understand the process**: Review [deployment overview](#2-deployment-overview) for technical details +- **See what's created**: Check [deployment results](#5-deployment-results) for detailed component overview with screenshots +- **Want to customize**: Explore [configuration options](#6-advanced-configuration-options) for naming, capacity sizing, and admin setup +- **Limitations**: Review [known limitations](#7-known-limitations) for common issues and workarounds +- **Remove environment**: Use [environment cleanup](#8-environment-cleanup) to completely remove your deployment -> **📋 Deployment Identity Types** -> -> The deployment can be executed using different identity types: -> - **User Account**: Interactive deployment using your Azure AD credentials -> - **Service Principal**: Application identity for automated/CI-CD scenarios -> - **Managed Identity**: Azure-managed identity for secure automated deployments -> -> For more details, see [Fabric Identity Support](https://learn.microsoft.com/rest/api/fabric/articles/identity-support) - -### 🔐 Azure Permissions -- [ ] **Resource Group Access**: Ensure your deployment identity has permissions on target Resource Group to deploy Bicep templates and create Azure resources using appropriate [Azure RBAC built-in roles](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles) (e.g. has [Contributor](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#contributor) or [Owner](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#owner)) or appropriate [Azure RBAC custom role](https://learn.microsoft.com/azure/role-based-access-control/custom-roles) with necessary permissions -- [ ] **`Microsoft.Fabric` Resource Provider Access**: Verify your Azure Subscription has [Microsoft.Fabric resource provider](https://learn.microsoft.com/azure/azure-resource-manager/management/azure-services-resource-providers) enabled and your deployment identity has permissions on Resource Group to create [Microsoft Fabric capacity resource](https://learn.microsoft.com/azure/templates/microsoft.fabric/capacities?pivots=deployment-language-bicep) - -### 🔗 API Permissions -- [ ] **Microsoft Graph API - `User.Read`**: Delegated permission to read signed-in user profile information using [Microsoft Graph User permissions](https://learn.microsoft.com/graph/permissions-reference#user-permissions) -- [ ] **Microsoft Graph API - `openid`**: Delegated permission for sign in and user profile authentication using [OpenID Connect scopes](https://learn.microsoft.com/entra/identity-platform/scopes-oidc) -- [ ] **Fabric REST API - Workspace Management**: Access to create and manage Fabric workspaces for workspace structure deployment using [Fabric workspace APIs](https://learn.microsoft.com/rest/api/fabric/core/workspaces) -- [ ] **Fabric REST API - Item Creation**: Access to create lakehouses, notebooks, and reports for Fabric content deployment using [Fabric item APIs](https://learn.microsoft.com/rest/api/fabric/core/items) -- [ ] **Fabric REST API - Content Upload**: Access to upload files and manage workspace content for sample data and notebook deployment using [Fabric REST API scopes](https://learn.microsoft.com/rest/api/fabric/articles/scopes) -- [ ] **Power BI API - 
`Tenant.Read.All`**: Delegated permission to read organization's Power BI tenant information using [Power BI REST API permissions](https://learn.microsoft.com/rest/api/power-bi/#scopes) - -### 💻 Software Requirements -- [ ] **Python**: Install version 3.9+ as runtime environment for deployment scripts from [Download Python](https://www.python.org/downloads/) -- [ ] **Azure CLI**: Install latest version for Azure authentication and resource management from [Install Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli) -- [ ] **Azure Developer CLI**: Install latest version for simplified deployment orchestration from [Install Azure Developer CLI](https://learn.microsoft.com/azure/developer/azure-developer-cli/install-azd) --- -## Deployment Overview - -This solution accelerator uses a two-phase deployment approach that creates a complete data foundation solution with medallion architecture (Bronze-Silver-Gold). The deployment is designed to be **idempotent** and **safe to re-run**, intelligently detecting existing resources and only creating what's missing. - -The deployment executes in two coordinated phases using dedicated scripts: - -1. **Infrastructure Provisioning** - Executes [`main.bicep`](../infra/main.bicep) to create Azure resources using [ARM idempotency](https://learn.microsoft.com/azure/azure-resource-manager/templates/deployment-tutorial-local-template?tabs=azure-powershell#deploy-template): - - **Microsoft Fabric Capacity**: Dedicated compute resources with configured admin permissions (updates configuration if parameters change) - - **Resource Group**: Container for all Azure resources - -2. **Fabric Workspace Setup** - Runs [`run_python_script_fabric.ps1`](../infra/scripts/utils/run_python_script_fabric.ps1) orchestrator and [`create_fabric_items.py`](../infra/scripts/fabric/create_fabric_items.py) deployment script to intelligently manage Fabric resources: - - **Workspace**: Detects existing workspace by name or creates new one, assigns to specified capacity - - **Lakehouses**: Creates missing 3-tier medallion architecture (`maag_bronze`, `maag_silver`, `maag_gold`) while preserving existing data - - **Notebooks**: Updates existing notebooks with latest content or creates missing ones with proper lakehouse references ⚠️ *overwrites customizations* - - **Sample Data**: Uploads CSV files to bronze lakehouse ⚠️ *overwrites existing files with same names* - - **Power BI Reports**: Creates or overwrites dashboard components for data visualization ⚠️ *replaces existing reports with same names* - - **Administrators**: Adds new workspace administrators without removing existing ones - -The deployment orchestration coordinates both phases, passing deployment parameters and ensuring proper sequencing. See [deployment options](#deployment-options) for different ways to run this deployment based on your preferred environment. - ---- - -## Deployment Results +## 5. Deployment Results After successful deployment, you'll have a complete data platform implementing medallion architecture. @@ -120,7 +258,7 @@ Workspace created with the specified or default name. 
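+
+To confirm the deployed workspace and its items without opening the portal, the sketch below shows one optional way to query the Fabric REST API from the command line. It is illustrative only (not part of the deployment scripts) and assumes the Azure CLI identity used for deployment has access to the workspace; `<workspace-id>` is a placeholder for the ID shown in the workspace URL.
+
+```bash
+# Acquire a token for the Fabric REST API with the current Azure CLI login
+TOKEN=$(az account get-access-token --resource https://api.fabric.microsoft.com --query accessToken -o tsv)
+
+# List the workspaces visible to this identity
+curl -s -H "Authorization: Bearer $TOKEN" "https://api.fabric.microsoft.com/v1/workspaces"
+
+# List the items (lakehouses, notebooks, reports) in a specific workspace
+curl -s -H "Authorization: Bearer $TOKEN" "https://api.fabric.microsoft.com/v1/workspaces/<workspace-id>/items"
+```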
#### Folder Structure -``` +```text your-workspace/ ├── lakehouses/ # Bronze, Silver, Gold lakehouses ├── notebooks/ # Data transformation pipelines @@ -146,6 +284,7 @@ your-workspace/ #### Sample Data The solution includes sample data for: + - **Finance data**: accounts, invoices, payments - **Sales data**: orders, order lines, payments from multiple sources - **Shared reference data**: customers, products, locations, categories @@ -155,6 +294,7 @@ The solution includes sample data for: #### Notebooks **Automation Components**: + - **Orchestration notebooks**: `run_bronze_to_silver`, `run_silver_to_gold` - **Transformation notebooks**: Domain-specific data processing for each entity - **Management utilities**: Table operations, schema definitions, troubleshooting tools @@ -164,128 +304,28 @@ The solution includes sample data for: #### Power BI Reports Any `.pbix` files found in the `reports/` directory will be automatically deployed to the workspace's reports folder. The deployment process: + - Scans recursively through the reports directory - Uploads each Power BI report with conflict resolution (Create or Overwrite) - Assigns reports to the appropriate folder within the workspace - Provides deployment tracking and verification -**PowerBI files** +##### PowerBI files ![Screenshot of resulting PowerBI reports](./images/deployment/fabric/fabric_powerbi_reports.png) -**PowerBI Dashboard** +##### PowerBI Dashboard ![Screenshot of resulting PowerBI dashboard](./images/deployment/fabric/fabric_powerbi_dashboard.png) --- -## Deployment Options - -Choose your deployment environment based on your workflow and requirements. All options use the same [Quick Start commands](#quick-start) with environment-specific setup. - -| Environment | Best For | Setup Required | Notes | -|-------------|----------|----------------|-------| -| **[Local Machine](#local-machine)** | Full development control | Install [software requirements](#-software-requirements) | Most flexible, requires local setup | -| **[Azure Cloud Shell](#azure-cloud-shell)** | Zero setup | Just a web browser | Pre-configured tools, session timeouts | -| **[GitHub Codespaces](#github-codespaces)** | Team consistency | GitHub account | Cloud development environment | -| **[Dev Container](#vs-code-dev-container)** | Standardized tooling | Docker Desktop + VS Code | Containerized consistency | -| **[Visual Studio Code (WEB)](#visual-studio-code-web)** | Zero setup| Just a web browser | Web based VS Code, session timeouts | -| **[GitHub Actions](#github-actions-cicd)** | Automated CI/CD | Service principal setup | Production deployments | - -### Local Machine -Deploy with full control over your development environment. - -**Setup requirements**: Install the [software requirements](#-software-requirements) - -**Deployment**: Use the standard [Quick Start commands](#quick-start) - -### Azure Cloud Shell -Deploy from Azure's browser-based terminal with zero local installation. - -**Setup**: Open [Azure Cloud Shell](https://shell.azure.com) and install Azure Developer CLI: -```bash -curl -fsSL https://aka.ms/install-azd.sh | bash && exec bash -``` - -**Deployment**: Run the [Quick Start commands](#quick-start) (Azure CLI pre-authenticated) - -### GitHub Codespaces -Deploy from a cloud development environment with pre-configured tools. - -**Setup**: -1. Go to the [repository](https://github.com/microsoft/unified-data-foundation-with-fabric-solution-accelerator) -2. 
Click **Code** → **Codespaces** → **Create codespace** - -**Deployment**: Install azd and run [Quick Start commands](#quick-start) with device authentication: -```bash -# Install azd if needed -curl -fsSL https://aka.ms/install-azd.sh | bash && exec bash - -# Use device code authentication -az login --use-device-code -azd auth login --use-device-code - -# Continue with Quick Start deployment commands -``` - -### VS Code Dev Container -Deploy from a containerized environment for team consistency. - -**Setup**: -1. Install [Docker Desktop](https://www.docker.com/products/docker-desktop) and [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) -2. Clone repository and open in VS Code -3. Reopen in container when prompted - -**Deployment**: All tools pre-installed - run [Quick Start commands](#quick-start) directly - -### Visual Studio Code (WEB) -Deploy from VS Code in the browser with zero local installation. - -1. Open the following link to launch VS Code Web: - - [![Open in Visual Studio Code Web](https://img.shields.io/static/v1?style=for-the-badge&label=Visual%20Studio%20Code%20(Web)&message=Open&color=blue&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/azure/?vscode-azure-exp=foundry&agentPayload=eyJiYXNlVXJsIjogImh0dHBzOi8vcmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbS9taWNyb3NvZnQvdW5pZmllZC1kYXRhLWZvdW5kYXRpb24td2l0aC1mYWJyaWMtc29sdXRpb24tYWNjZWxlcmF0b3IvcmVmcy9oZWFkcy9tYWluL2luZnJhL3ZzY29kZV93ZWIiLCAiaW5kZXhVcmwiOiAiL2luZGV4Lmpzb24iLCAidmFyaWFibGVzIjogeyJhZ2VudElkIjogIiIsICJjb25uZWN0aW9uU3RyaW5nIjogIiIsICJ0aHJlYWRJZCI6ICIiLCAidXNlck1lc3NhZ2UiOiAiIiwgInBsYXlncm91bmROYW1lIjogIiIsICJsb2NhdGlvbiI6ICIiLCAic3Vic2NyaXB0aW9uSWQiOiAiIiwgInJlc291cmNlSWQiOiAiIiwgInByb2plY3RSZXNvdXJjZUlkIjogIiIsICJlbmRwb2ludCI6ICIifSwgImNvZGVSb3V0ZSI6IFsiYWktcHJvamVjdHMtc2RrIiwgInB5dGhvbiIsICJkZWZhdWx0LWF6dXJlLWF1dGgiLCAiZW5kcG9pbnQiXX0=) -2. When prompted, sign in using your Microsoft account linked to your Azure subscription. - - Select the appropriate subscription to continue. -3. Once the solution opens, the AI Foundry terminal will automatically start running the following command to install the required dependencies: - ```bash - sh install.sh - ``` - During this process, you’ll be prompted with the message: - ``` - What would you like to do with these files? - - Overwrite with versions from template - - Keep my existing files unchanged - ``` - Choose “**Overwrite with versions from template**” and provide a unique environment name when prompted. - -4. Deployment: - ```bash - # Use device code authentication - az login --use-device-code - azd auth login --use-device-code - - # Optional: Customize workspace name - azd env set AZURE_FABRIC_WORKSPACE_NAME "My Analytics Platform" - - # Deploy everything - azd up - ``` - -### GitHub Actions (CI/CD) -Automated deployment using the included [workflow](../.github/workflows/azure-dev.yml). - -**Setup**: Configure [repository variables](https://docs.github.com/en/actions/learn-github-actions/variables) and set up [service principal with federated credentials](https://learn.microsoft.com/azure/developer/github/connect-from-azure) - -**Triggers**: Push to main branch or manual workflow dispatch - ---- - -## Advanced Configuration Options +## 6. Advanced Configuration Options The solution accelerator provides flexible configuration options to customize your deployment. 
Parameters can be configured through **Azure Developer CLI environment variables** (`azd env set`) for local deployments or through **GitHub repository variables** for CI/CD deployments. > **📁 Configuration Files Reference:** +> > - Infrastructure: [`infra/main.bicep`](../infra/main.bicep) - Azure resource definitions > - Deployment orchestration: [`azure.yaml`](../azure.yaml) - AZD project configuration > - CI/CD workflow: [`.github/workflows/azure-dev.yml`](../.github/workflows/azure-dev.yml) - GitHub Actions pipeline @@ -305,7 +345,7 @@ Configure the Azure infrastructure components through Bicep template parameters | **Fabric Capacity SKU** | `skuName` | Not directly supported* | Fabric capacity tier and performance level | `F2` | `F4`, `F8`, `F16`, `F32`, `F64`, `F128`, `F256`, `F512`, `F1024`, `F2048` | | **Enable Telemetry** | `enableTelemetry` | Not directly supported* | Enable/disable usage telemetry collection | `true` | `false` | -*_GitHub Actions can use additional parameters through Bicep parameter files or workflow modifications._ +*GitHub Actions can use additional parameters through Bicep parameter files or workflow modifications.* **Configuration Examples:** @@ -342,6 +382,7 @@ Modify [`azure-dev.yml`](../.github/workflows/azure-dev.yml) Deploy Infrastructu **Fabric Capacity SKU Selection Guide:** + - **F2-F4**: Development and testing environments - **F8-F32**: Small to medium production workloads - **F64-F256**: Large enterprise production workloads @@ -388,6 +429,7 @@ env: **Workspace Naming Best Practices:** + - Use descriptive names that indicate purpose and environment - Consider organizational naming conventions - Include environment indicators for multi-environment deployments (Dev, Test, Prod) @@ -407,9 +449,10 @@ Manage workspace administrators and security permissions for the Fabric workspac | **Fabric Admins** | `AZURE_FABRIC_ADMIN_MEMBERS` | Bicep output | List of administrators (UPNs and Service Principal IDs) | JSON array | `["user1@contoso.com", "12345678-1234-1234-1234-123456789012"]` | | **Admins by Object ID** | `AZURE_FABRIC_ADMIN_MEMBERS_BY_OBJECT_ID` | Not directly supported* | List of object IDs with fallback user/service principal detection | JSON array | `["87654321-4321-4321-4321-210987654321"]` | -*_GitHub Actions workflow uses Bicep output for admin configuration. See examples below for customization._ +*GitHub Actions workflow uses Bicep output for admin configuration. 
See examples below for customization.* **Administrator Types Supported:** + - **User Principal Names (UPNs)**: `user@domain.com` format for individual users - **Service Principal IDs**: GUID format for application registrations - **Object IDs**: Direct Azure AD object identifiers with automatic type detection @@ -456,6 +499,7 @@ azd up **Administrator Assignment Behavior:** + - **Automatic Default Admin**: The deployment identity (user or service principal) is automatically added as a Fabric capacity admin - **Duplicate Detection**: Prevents adding the same principal multiple times - **Fallback Logic**: Object ID method tries both User and ServicePrincipal types automatically @@ -463,6 +507,7 @@ azd up **Permission Requirements:** Administrators configured through these parameters will have **Admin** role on the Fabric workspace, providing: + - Full workspace management capabilities - Ability to manage workspace items (lakehouses, notebooks, reports) - User and permission management within the workspace @@ -540,7 +585,7 @@ These parameters are automatically optimized in [`azure-dev.yml`](../.github/wor --- -## Known Limitations +## 7. Known Limitations This section documents known limitations in the deployment process and their workarounds. @@ -548,7 +593,8 @@ This section documents known limitations in the deployment process and their wor **Issue**: Service Principals cannot update Power BI dataset parameters via API, resulting in HTTP 403 errors. -**Impact**: +**Impact**: + - During automated deployment, if deployment identity is a Service Principal or a Managed Identity, Power BI reports are deployed but dataset parameters (SQL endpoint connection strings) may not be automatically configured - Reports may show connection errors until manually configured @@ -570,7 +616,8 @@ except Exception as param_error: print(f"📋 Continuing deployment without dataset parameter updates...") ``` -**Workaround**: +**Workaround**: + - The deployment continues successfully despite this limitation - Follow the manual configuration steps in the [Power BI Deployment Guide](./DeploymentGuidePowerBI.md) to complete the report setup - This typically involves updating the `sqlEndpoint` and `database` parameters in the Power BI service @@ -582,6 +629,7 @@ except Exception as param_error: **Issue**: The deployment identity may lack permissions to query user object IDs from Azure Active Directory via Microsoft Graph API. **Impact**: + - When using `--fabricAdmins` with user principal names (UPNs), the script may fail to resolve user identities - Service Principals may successfully create workspaces but fail to add human users as administrators - This can result in workspaces that are only accessible to the deployment service principal @@ -605,12 +653,14 @@ def detect_principal_type(admin_identifier, graph_client=None): **Workarounds**: -1. **Use Object IDs Instead**: Configure administrators using the `--fabricAdminsByObjectId` parameter or `AZURE_FABRIC_ADMIN_MEMBERS_BY_OBJECT_ID` environment variable as described in the [advanced configuration options](#advanced-configuration-options): +1. 
**Use Object IDs Instead**: Configure administrators using the `--fabricAdminsByObjectId` parameter or `AZURE_FABRIC_ADMIN_MEMBERS_BY_OBJECT_ID` environment variable as described in the [advanced configuration options](#6-advanced-configuration-options): + ```bash azd env set AZURE_FABRIC_ADMIN_MEMBERS_BY_OBJECT_ID '["87654321-4321-4321-4321-210987654321"]' ``` - + The script automatically tries both User and ServicePrincipal types for object IDs: + ```python for principal_type in ["User", "ServicePrincipal"]: # Try both User and ServicePrincipal types @@ -619,7 +669,6 @@ def detect_principal_type(admin_identifier, graph_client=None): 2. **Post-Deployment Admin Assignment**: Use the dedicated admin management scripts: - [`add_fabric_workspace_admins.py`](../infra/scripts/fabric/add_fabric_workspace_admins.py) - Direct Python script for admin assignment - [`run_python_script_fabric_admins.ps1`](../infra/scripts/utils/run_python_script_fabric_admins.ps1) - PowerShell orchestrator script - These scripts can add administrators to all available Fabric workspaces after initial deployment. --- @@ -628,7 +677,8 @@ def detect_principal_type(admin_identifier, graph_client=None): **Issue**: Service Principals may lack sufficient permissions to access Microsoft Fabric REST APIs. -**Impact**: +**Impact**: + - Deployment fails during workspace creation or management operations - Graceful exit with clear guidance on permission requirements @@ -638,7 +688,7 @@ The [`create_fabric_items.py`](../infra/scripts/fabric/create_fabric_items.py) s ```python except FabricApiError as e: if e.status_code == 401: - print(f"⚠️ WARNING: Unauthorized access to Fabric APIs. Please review your Fabric permissions and Ensure you have proper Fabric licensing and permissions.") + print(f"⚠️ WARNING: Unauthorized access to Fabric APIs. Please review your Fabric permissions and ensure you have proper Fabric licensing and permissions.") print(" 📋 Check the following resources:") print(" • Fabric licenses: https://learn.microsoft.com/fabric/enterprise/licenses") print(" • Identity support: https://learn.microsoft.com/rest/api/fabric/articles/identity-support") @@ -647,16 +697,17 @@ except FabricApiError as e: ``` **Resolution**: + 1. **Verify Fabric Licensing**: Ensure your organization has appropriate [Microsoft Fabric licenses](https://learn.microsoft.com/fabric/enterprise/licenses) 2. **Review Identity Configuration**: Follow the [Fabric Identity Support](https://learn.microsoft.com/rest/api/fabric/articles/identity-support) documentation 3. **Configure Service Principal**: If using a service principal, ensure it's properly configured following [Create Entra App](https://learn.microsoft.com/rest/api/fabric/articles/get-started/create-entra-app) guidance -4. **Check API Permissions**: Verify the deployment identity has the required Fabric REST API permissions as listed in the [prerequisites](#prerequisites) +4. **Check API Permissions**: Verify the deployment identity has the required Fabric REST API permissions as listed in the [prerequisites](#1-prerequisites) The script performs a graceful exit (`sys.exit(0)`) rather than failing abruptly, allowing you to resolve permissions and retry the deployment. --- -## Environment Cleanup +## 8. Environment Cleanup When you no longer need your deployed environment, Azure Developer CLI provides a streamlined approach to completely remove all resources and clean up your Microsoft Fabric workspace. 
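+
+Before removing anything, you can optionally review what the environment currently contains so you know what will be deleted. A minimal sketch, assuming the resource group name is the one recorded in your `azd` environment (`<resource-group>` is a placeholder):
+
+```bash
+# Show the values recorded for the current azd environment (subscription, resource group, workspace name, ...)
+azd env get-values
+
+# List the Azure resources in the target resource group
+az resource list --resource-group <resource-group> --output table
+```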
@@ -683,30 +734,36 @@ azd down Based on the [`azure.yaml`](../azure.yaml) configuration, the cleanup process follows these orchestrated steps: #### Phase 1: Fabric Workspace Cleanup (predown hook) + Before removing Azure infrastructure, the cleanup process first handles the Microsoft Fabric workspace: **Windows (PowerShell):** + ```powershell ./infra/scripts/utils/run_python_script_fabric_remove.ps1 ``` **Unix/Linux (PowerShell Core):** + ```bash ./infra/scripts/utils/run_python_script_fabric_remove.ps1 -SkipPythonVirtualEnvironment ``` This orchestration script ([`run_python_script_fabric_remove.ps1`](../infra/scripts/utils/run_python_script_fabric_remove.ps1)) manages: + - **Python Environment Setup**: Creates or reuses Python virtual environment with required dependencies - **Workspace Identification**: Locates the target workspace using environment variables or defaults - **Safe Deletion**: Executes the Python removal script with proper error handling and user guidance The core removal logic is handled by [`remove_fabric_workspace.py`](../infra/scripts/fabric/remove_fabric_workspace.py), which: + - **Workspace Lookup**: Finds the workspace by name or ID (defaults to "Unified Data Foundation with Fabric workspace") - **Comprehensive Removal**: Deletes all workspace items including notebooks, lakehouses, and datasets - **Confirmation Prompts**: Provides interactive confirmation to prevent accidental deletions - **Error Handling**: Gracefully handles missing workspaces or permission issues #### Phase 2: Azure Infrastructure Cleanup + After successful Fabric workspace removal, `azd down` proceeds to deprovision all Azure resources that were created through the [`main.bicep`](../infra/main.bicep) template, including: - **Microsoft Fabric Capacity**: Dedicated compute resources @@ -724,12 +781,11 @@ The cleanup process includes several safety mechanisms: --- -## Additional Resources +## 9. Additional Resources - **Documentation**: [Microsoft Fabric](https://learn.microsoft.com/fabric/) | [Azure Developer CLI](https://learn.microsoft.com/azure/developer/azure-developer-cli/) -- **Guides**: [Power BI Deployment](./DeploymentGuidePowerBI.md) | [FAQs](./FAQs.md) +- **Guides**: [Power BI Deployment](./DeploymentGuidePowerBI.md) | [FAQs](./FAQs.md) - **Repository**: [Solution Accelerator](https://github.com/microsoft/unified-data-foundation-with-fabric-solution-accelerator) For support, visit the [project repository](https://github.com/microsoft/unified-data-foundation-with-fabric-solution-accelerator) or engage with the Microsoft Fabric community. ---- \ No newline at end of file diff --git a/docs/DeploymentGuideFabricManual.md b/docs/DeploymentGuideFabricManual.md index b9b88b3..f0390f2 100644 --- a/docs/DeploymentGuideFabricManual.md +++ b/docs/DeploymentGuideFabricManual.md @@ -25,51 +25,13 @@ This guide describes how to deploy the **Unified Data Foundation with Fabric** s ### Optional Variables -- `AZURE_FABRIC_WORKSPACE_NAME`: Custom workspace name (defaults to generated name if not specified) +- `AZURE_FABRIC_WORKSPACE_NAME`: Custom workspace name if already exists (defaults to generated name, if not specified) -## Quick Manual Deployment -### 1. 
Set Environment Variables - -**Linux/macOS/Cloud Shell:** -```bash -export AZURE_FABRIC_CAPACITY_NAME="your-existing-capacity-name" -export AZURE_FABRIC_WORKSPACE_NAME="Custom Workspace Name" # Optional -``` - -**Windows PowerShell:** -```powershell -$env:AZURE_FABRIC_CAPACITY_NAME="your-existing-capacity-name" -$env:AZURE_FABRIC_WORKSPACE_NAME="Custom Workspace Name" # Optional -``` - -### 2. Clone Repository and Navigate - -```bash -git clone https://github.com/microsoft/unified-data-foundation-with-fabric-solution-accelerator.git -cd unified-data-foundation-with-fabric-solution-accelerator/infra/scripts/utils -``` - -### 3. Run Deployment Script - -**Linux/macOS/Cloud Shell:** -```bash -chmod +x run_python_script_fabric.ps1 -pwsh ./run_python_script_fabric.ps1 -``` - -**Windows PowerShell:** -```powershell -.\run_python_script_fabric.ps1 -``` - -> **Note**: Manual scripts do **not** create the Fabric capacity or Azure infrastructure. These must exist beforehand. For complete infrastructure deployment, use `azd up` instead. - ---- - -## Detailed Manual Deployment Steps +## Deployment Steps ### Step 1: Verify Prerequisites +Open a terminal and run the following commands to verify your environment: 1. **Check Azure CLI authentication:** ```bash diff --git a/docs/SetupDatabricks.md b/docs/SetupDatabricks.md index 2fdf162..b15307a 100644 --- a/docs/SetupDatabricks.md +++ b/docs/SetupDatabricks.md @@ -114,6 +114,8 @@ You will need the following values for deployment process later. Be sure to reco ![Catalog Location](./images/deployment/6-DatabricksCatalogLocation.png) +> **Note:** If you cannot find an external location, refer to the [Troubleshooting](#troubleshooting) section below. + --- ## Step 5: Connect Databricks to Fabric @@ -129,6 +131,51 @@ After setup, you can reuse this connection by choosing **Existing connection**. --- +## Troubleshooting + +### 1. If a managed location is not created by default in your Azure Databricks workspace + +Follow these steps to manually create an external location: + +1. **Navigate to External Locations in Databricks:** + - In your Databricks workspace, click **Catalog** (left menu). + - Click the **gear** icon at the top and select **External Locations**. + - Click **Create Location**. + +2. **Find the storage path URL:** + - Go to the [Azure Portal](https://portal.azure.com/). + - Navigate to your Azure Databricks resource group. + - Locate and click on the **Managed Resource Group**. + - In the managed resource group, find the **Storage Account** (usually named `dbstorage`). + - Click on the storage account and select **Containers** from the left menu. + - Note the **Container Name** (commonly `unity-catalog` or similar). + - Construct the storage path URL using this format: + ``` + abfss://@.dfs.core.windows.net/ + ``` + - Example: `abfss://unity-catalog@dbstorage123abc.dfs.core.windows.net/managed-location` + +3. **Create or select a Storage Credential:** + - In the Databricks **Create Location** dialog, you'll need to select or create a **Storage Credential** that has access to the storage account. + - If you need to create a new storage credential, you'll need the **Access Connector ID**: + - In the [Azure Portal](https://portal.azure.com/), go to your Azure Databricks resource group. + - Open the **Managed Resource Group**. + - Look for the **Access Connector for Azure Databricks** resource (named something like `-accessconnector`). + - Click on the Access Connector resource. 
+ - Copy the **Resource ID** from the Overview page or Properties section. + - Use this Access Connector ID when creating the storage credential in Databricks. + +4. **Complete the external location setup:** + - Enter a **Location Name** for your external location. + - Paste the storage path URL you constructed in step 2. + - Select the storage credential from step 3. + - Click **Create** to finalize the external location. + +5. **Verify the managed location:** + - Once created, return to **Catalogs → Settings** to verify and copy the managed location URL. + + + ## Next Steps