update bicep and readme to remove deployment name (#9)

sophia-ramsey · web-flow · commit 477544d2acb6 · 2025-02-26T16:23:55.000-08:00
* change bicep files and readme to use AZURE_AI_AGENT_MODEL_NAME instead of AZURE_AI_CHAT_MODEL_NAME
* update bicep files to remove references to AZURE_AI_CHAT_DEPLOYMENT_NAME and have the deployment names default to the model name
* add Quota Recommendations section to the readme
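
The second bullet is expressed in `infra/main.parameters.json` as a nested default: an explicit deployment name wins, otherwise the model name is used, otherwise `gpt-4o-mini`. A minimal Python sketch of that resolution order follows. The helper `resolve_deployment_name` is hypothetical and not part of the repository, and it treats only an unset variable as missing; how `azd` handles empty values is not modeled here.

```python
from typing import Mapping


def resolve_deployment_name(
    env: Mapping[str, str],
    deployment_var: str = "AZURE_AI_DEPLOYMENT_NAME",  # as spelled in main.parameters.json in this diff
    model_var: str = "AZURE_AI_AGENT_MODEL_NAME",
    fallback_model: str = "gpt-4o-mini",
) -> str:
    """Return the explicit deployment name, else the model name, else the fallback."""
    if deployment_var in env:
        # An explicit deployment name always takes precedence.
        return env[deployment_var]
    # Otherwise the deployment is named after the model (or the built-in default).
    return env.get(model_var, fallback_model)


if __name__ == "__main__":
    print(resolve_deployment_name({}))
    print(resolve_deployment_name({"AZURE_AI_AGENT_MODEL_NAME": "gpt-4o"}))
```

With this ordering, setting only `AZURE_AI_AGENT_MODEL_NAME` is enough to rename both the model and its deployment, which matches the intent stated in the commit message.
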
diff --git a/README.md b/README.md
@@ -30,6 +30,15 @@ Make sure the following tools are installed:
 2. [Python 3.9+](https://www.python.org/downloads/)
 3. [Git](https://git-scm.com/downloads)
 
+#### Quota Recommendations (Optional)
+
+The default model deployment capacity is 50k tokens per minute. For optimal performance, it is recommended to increase it to 100k tokens per minute. You can change the capacity by following the steps in [setting capacity and deployment SKU](docs/deploy_customization.md#customizing-model-deployments).
+
+* Navigate to the [Azure AI Foundry Portal](https://ai.azure.com/).
+* Select the AI Project you are using for this template if you are not already in the project.
+* Select Management center from the bottom left navigation menu.
+* Select Quota, click the GlobalStandard dropdown, and select the model and region you are using for this accelerator to see your available quota. Note that GPT-4o mini and text-embedding-ada-002 are used by default.
+* Request more quota or delete any unused model deployments as needed.
 
 #### Bringing an existing AI project resource
 
@@ -86,14 +95,12 @@ At this point you could make changes to the code if required. However, no change
 
 #### Configure your Agent (optional)
 <!-- TODO where do we want this? probably after downloading the code -->
-For options on customizing the deployment to disable resources, change resource names, or customize the models, you can follow these steps in [deployment customizations](docs/deploy_customization.md) now.
-
-If you want to personalize your agent, you can change the default configuration for your agent. This can include changing the model, adding tools, and uploading files to the agent. More information can be found in [Customizing Model Deployments](docs/deploy_customization.md#customizing-model-deployments).
+If you want to personalize your agent, you can change the default configuration for your agent. This can include changing the model, adding tools, and uploading files to the agent. For more information on the Azure OpenAI models and non-Microsoft models that can be used in your deployment, view the [list of models supported by Azure AI Agent Service](https://learn.microsoft.com/azure/ai-services/agents/concepts/model-region-support).
 
-To change the model, set the following environment variables:
+To specify the model (e.g. gpt-4o-mini, gpt-4o) that is deployed for the agent when `azd up` is called, set the following environment variables:
 ```shell
-azd env set AZURE_AI_CHAT_DEPLOYMENT_NAME <MODEL_DEPLOYMENT_NAME>
-azd env set AZURE_AI_CHAT_MODEL_NAME <MODEL_DEPLOYMENT_NAME>
+azd env set AZURE_AI_AGENT_MODEL_NAME <MODEL_NAME>
+azd env set AZURE_AI_AGENT_MODEL_VERSION <MODEL_VERSION>
 ```
 To add tools, update the `agents.yaml` file located in the repository.
 ```python
diff --git a/azure.yaml b/azure.yaml
@@ -42,12 +42,12 @@ pipeline:
     - USE_CONTAINER_REGISTRY
     - USE_APPLICATION_INSIGHTS
     - USE_SEARCH_SERVICE
-    - AZURE_AI_CHAT_DEPLOYMENT_NAME
-    - AZURE_AI_CHAT_DEPLOYMENT_SKU
-    - AZURE_AI_CHAT_DEPLOYMENT_CAPACITY
-    - AZURE_AI_CHAT_MODEL_NAME
-    - AZURE_AI_CHAT_MODEL_FORMAT
-    - AZURE_AI_CHAT_MODEL_VERSION
+    - AZURE_AI_AGENT_DEPLOYMENT_NAME
+    - AZURE_AI_AGENT_DEPLOYMENT_SKU
+    - AZURE_AI_AGENT_DEPLOYMENT_CAPACITY
+    - AZURE_AI_AGENT_MODEL_NAME
+    - AZURE_AI_AGENT_MODEL_FORMAT
+    - AZURE_AI_AGENT_MODEL_VERSION
     - AZURE_AI_EMBED_DEPLOYMENT_NAME
     - AZURE_AI_EMBED_DEPLOYMENT_SKU
     - AZURE_AI_EMBED_DEPLOYMENT_CAPACITY
diff --git a/docs/deploy_customization.md b/docs/deploy_customization.md
@@ -43,44 +43,38 @@ To customize the model deployments, you can set the following environment variab
 
 ### Using a different chat model
 
-Change the chat deployment name:
+Change the agent model format (either OpenAI or Microsoft):
 
 ```shell
-azd env set AZURE_AI_CHAT_DEPLOYMENT_NAME Phi-3.5-MoE-instruct
+azd env set AZURE_AI_AGENT_MODEL_FORMAT Microsoft
 ```
 
-Change the chat model format (either OpenAI or Microsoft):
+Change the agent model name:
 
 ```shell
-azd env set AZURE_AI_CHAT_MODEL_FORMAT Microsoft
+azd env set AZURE_AI_AGENT_MODEL_NAME gpt-4o-mini
 ```
 
-Change the chat model name:
+Set the version of the agent model:
 
 ```shell
-azd env set AZURE_AI_CHAT_MODEL_NAME Phi-3.5-MoE-instruct
-```
-
-Set the version of the chat model:
-
-```shell
-azd env set AZURE_AI_CHAT_MODEL_VERSION 2
+azd env set AZURE_AI_AGENT_MODEL_VERSION 2024-07-18
 ```
 
 ### Setting capacity and deployment SKU
 
 For quota regions, you may find yourself needing to modify the default capacity and deployment SKU. The default tokens per minute deployed in this template is 50,000. 
 
-Change the capacity (in thousands of tokens per minute) of the chat deployment:
+Change the capacity (in thousands of tokens per minute) of the agent deployment:
 
 ```shell
-azd env set AZURE_AI_CHAT_DEPLOYMENT_CAPACITY 50
+azd env set AZURE_AI_AGENT_DEPLOYMENT_CAPACITY 50
 ```
 
-Change the SKU of the chat deployment:
+Change the SKU of the agent deployment:
 
 ```shell
-azd env set AZURE_AI_CHAT_DEPLOYMENT_SKU Standard
+azd env set AZURE_AI_AGENT_DEPLOYMENT_SKU Standard
 ```
 
 Change the capacity (in thousands of tokens per minute) of the embeddings deployment:
diff --git a/infra/api.bicep b/infra/api.bicep
@@ -8,7 +8,7 @@ param containerRegistryName string
 param serviceName string = 'api'
 param exists bool
 param projectConnectionString string
-param chatDeploymentName string
+param agentDeploymentName string
 
 resource apiIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
   name: identityName
@@ -25,8 +25,8 @@ var env = [
     value: projectConnectionString
   }
   {
-    name: 'AZURE_AI_CHAT_DEPLOYMENT_NAME'
-    value: chatDeploymentName
+    name: 'AZURE_AI_AGENT_DEPLOYMENT_NAME'
+    value: agentDeploymentName
   }
   {
     name: 'RUNNING_IN_PRODUCTION'
diff --git a/infra/core/ai/hub.bicep b/infra/core/ai/hub.bicep
@@ -14,8 +14,6 @@ param containerRegistryId string = ''
 param aiServicesName string
 @description('The AI Services connection name to use for the AI Foundry Hub Resource')
 param aiServicesConnectionName string
-// @description('The AI Services Content Safety connection name to use for the AI Foundry Hub Resource')
-// param aiServicesContentSafetyConnectionName string
 @description('The Azure Cognitive Search service name to use for the AI Foundry Hub Resource')
 param aiSearchName string = ''
 @description('The Azure Cognitive Search service connection name to use for the AI Foundry Hub Resource')
@@ -78,24 +76,6 @@ resource hub 'Microsoft.MachineLearningServices/workspaces@2024-07-01-preview' =
     }
   }
 
-  // resource contentSafetyConnection 'connections' = {
-  //   name: aiServicesContentSafetyConnectionName
-  //   properties: {
-  //     category: 'AzureOpenAI'
-  //     authType: 'ApiKey'
-  //     isSharedToAll: true
-  //     target: aiService.properties.endpoints['Content Safety']
-  //     metadata: {
-  //       ApiVersion: '2023-07-01-preview'
-  //       ApiType: 'azure'
-  //       ResourceId: aiService.id
-  //     }
-  //     credentials: {
-  //       key: aiService.listKeys().key1
-  //     }
-  //   }
-  // }
-
   resource searchConnection 'connections' =
     if (!empty(aiSearchName)) {
       name: aiSearchConnectionName
diff --git a/infra/core/host/ai-environment.bicep b/infra/core/host/ai-environment.bicep
@@ -16,8 +16,6 @@ param aiServicesName string
 param aiServicesConnectionName string
 @description('The AI Services model deployments.')
 param aiServiceModelDeployments array = []
-// @description('The AI Services content safety connection name.')
-// param aiServicesContentSafetyConnectionName string
 @description('The Log Analytics resource name.')
 param logAnalyticsName string = ''
 @description('The Application Insights resource name.')
@@ -59,7 +57,6 @@ module hub '../ai/hub.bicep' = {
     applicationInsightsId: hubDependencies.outputs.applicationInsightsId
     aiServicesName: hubDependencies.outputs.aiServicesName
     aiServicesConnectionName: aiServicesConnectionName
-  //   aiServicesContentSafetyConnectionName: aiServicesContentSafetyConnectionName
     aiSearchName: hubDependencies.outputs.searchServiceName
     aiSearchConnectionName: searchConnectionName
   }
diff --git a/infra/main.bicep b/infra/main.bicep
@@ -54,8 +54,6 @@ param applicationInsightsName string = ''
 param aiServicesName string = ''
 @description('The AI Services connection name. If ommited will use a default value')
 param aiServicesConnectionName string = ''
-// @description('The AI Services content safety connection name. If ommited will use a default value')
-// param aiServicesContentSafetyConnectionName string = ''
 @description('The Azure Container Registry resource name. If ommited will be generated')
 param containerRegistryName string = ''
 @description('The Azure Key Vault resource name. If ommited will be generated')
@@ -74,25 +72,25 @@ param principalId string = ''
 // Chat completion model
 @description('Format of the chat model to deploy')
 @allowed(['Microsoft', 'OpenAI'])
-param chatModelFormat string
+param agentModelFormat string
 
 @description('Name of the chat model to deploy')
-param chatModelName string
+param agentModelName string
 @description('Name of the model deployment')
-param chatDeploymentName string
+param agentDeploymentName string
 
 @description('Version of the chat model to deploy')
 // See version availability in this table:
 // https://learn.microsoft.com/azure/ai-services/openai/concepts/models#global-standard-model-availability
-param chatModelVersion string
+param agentModelVersion string
 
 @description('Sku of the chat deployment')
-param chatDeploymentSku string
+param agentDeploymentSku string
 
 @description('Capacity of the chat deployment')
 // You can increase this, but capacity is limited per model/region, so you will get errors if you go over
 // https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits
-param chatDeploymentCapacity int
+param agentDeploymentCapacity int
 
 // Embedding model
 @description('Format of the embedding model to deploy')
@@ -129,15 +127,15 @@ var tags = { 'azd-env-name': environmentName }
 
 var aiDeployments = [
   {
-    name: chatDeploymentName
+    name: agentDeploymentName
     model: {
-      format: chatModelFormat
-      name: chatModelName
-      version: chatModelVersion
+      format: agentModelFormat
+      name: agentModelName
+      version: agentModelVersion
     }
     sku: {
-      name: chatDeploymentSku
-      capacity: chatDeploymentCapacity
+      name: agentDeploymentSku
+      capacity: agentDeploymentCapacity
     }
   }
   {
@@ -188,9 +186,6 @@ module ai 'core/host/ai-environment.bicep' = if (empty(aiExistingProjectConnecti
       : '${abbrs.storageStorageAccounts}${resourceToken}'
     aiServicesName: !empty(aiServicesName) ? aiServicesName : 'aoai-${resourceToken}'
     aiServicesConnectionName: !empty(aiServicesConnectionName) ? aiServicesConnectionName : 'aoai-${resourceToken}'
-    // aiServicesContentSafetyConnectionName: !empty(aiServicesContentSafetyConnectionName)
-    //   ? aiServicesContentSafetyConnectionName
-    //   : 'aoai-content-safety-connection'
     aiServiceModelDeployments: aiDeployments
     logAnalyticsName: logAnalyticsWorkspaceResolvedName
     applicationInsightsName: !useApplicationInsights
@@ -333,7 +328,7 @@ module api 'api.bicep' = {
     containerAppsEnvironmentName: containerApps.outputs.environmentName
     containerRegistryName: containerApps.outputs.registryName
     projectConnectionString: projectConnectionString
-    chatDeploymentName: chatDeploymentName
+    agentDeploymentName: agentDeploymentName
     exists: apiAppExists
   }
 }
@@ -343,7 +338,7 @@ output AZURE_RESOURCE_GROUP string = rg.name
 // Outputs required for local development server
 output AZURE_TENANT_ID string = tenant().tenantId
 output AZURE_AIPROJECT_CONNECTION_STRING string = projectConnectionString
-output AZURE_AI_CHAT_DEPLOYMENT_NAME string = chatDeploymentName
+output AZURE_AI_AGENT_DEPLOYMENT_NAME string = agentDeploymentName
 
 // Outputs required by azd for ACA
 output AZURE_CONTAINER_ENVIRONMENT_NAME string = containerApps.outputs.environmentName
diff --git a/infra/main.parameters.json b/infra/main.parameters.json
@@ -50,23 +50,23 @@
     "useSearchService": {
       "value": "${USE_SEARCH_SERVICE=true}"
     },
-    "chatDeploymentName": {
-      "value": "${AZURE_AI_CHAT_DEPLOYMENT_NAME=gpt-4o-mini}"
+    "agentDeploymentName": {
+      "value": "${AZURE_AI_DEPLOYMENT_NAME=${AZURE_AI_AGENT_MODEL_NAME=gpt-4o-mini}}"
     },
-    "chatModelFormat": {
-      "value": "${AZURE_AI_CHAT_MODEL_FORMAT=OpenAI}"
+    "agentModelFormat": {
+      "value": "${AZURE_AI_AGENT_MODEL_FORMAT=OpenAI}"
     },
-    "chatModelName": {
-      "value": "${AZURE_AI_CHAT_MODEL_NAME=gpt-4o-mini}"
+    "agentModelName": {
+      "value": "${AZURE_AI_AGENT_MODEL_NAME=gpt-4o-mini}"
     },
-    "chatModelVersion": {
-      "value": "${AZURE_AI_CHAT_MODEL_VERSION=2024-07-18}"
+    "agentModelVersion": {
+      "value": "${AZURE_AI_AGENT_MODEL_VERSION=2024-07-18}"
     },
-    "chatDeploymentSku": {
-      "value": "${AZURE_AI_CHAT_DEPLOYMENT_SKU=GlobalStandard}"
+    "agentDeploymentSku": {
+      "value": "${AZURE_AI_AGENT_DEPLOYMENT_SKU=GlobalStandard}"
     },
-    "chatDeploymentCapacity": {
-      "value": "${AZURE_AI_CHAT_DEPLOYMENT_CAPACITY=50}"
+    "agentDeploymentCapacity": {
+      "value": "${AZURE_AI_AGENT_DEPLOYMENT_CAPACITY=50}"
     },
     "embedDeploymentName": {
       "value": "${AZURE_AI_EMBED_DEPLOYMENT_NAME=text-embedding-ada-002}"
diff --git a/scripts/write_env.ps1 b/scripts/write_env.ps1
@@ -6,11 +6,11 @@ Set-Content -Path $envFilePath -Value ""
 
 # Append new values to the .env file
 $azureAiProjectConnectionString = azd env get-value AZURE_AIPROJECT_CONNECTION_STRING
-$azureAiChatDeploymentName = azd env get-value AZURE_AI_CHAT_DEPLOYMENT_NAME
+$azureAiagentDeploymentName = azd env get-value AZURE_AI_AGENT_DEPLOYMENT_NAME
 $azureAiAgentId = azd env get-value AZURE_AI_AGENT_ID
 $azureTenantId = azd env get-value AZURE_TENANT_ID
 
 Add-Content -Path $envFilePath -Value "AZURE_AIPROJECT_CONNECTION_STRING=$azureAiProjectConnectionString"
-Add-Content -Path $envFilePath -Value "AZURE_AI_CHAT_DEPLOYMENT_NAME=$azureAiChatDeploymentName"
+Add-Content -Path $envFilePath -Value "AZURE_AI_AGENT_DEPLOYMENT_NAME=$azureAiagentDeploymentName"
 Add-Content -Path $envFilePath -Value "AZURE_AI_AGENT_ID=$azureAiAgentId"
 Add-Content -Path $envFilePath -Value "AZURE_TENANT_ID=$azureTenantId"
diff --git a/scripts/write_env.sh b/scripts/write_env.sh
@@ -7,6 +7,6 @@ ENV_FILE_PATH="src/.env"
 > $ENV_FILE_PATH
 
 echo "AZURE_AIPROJECT_CONNECTION_STRING=$(azd env get-value AZURE_AIPROJECT_CONNECTION_STRING)" >> $ENV_FILE_PATH
-echo "AZURE_AI_CHAT_DEPLOYMENT_NAME=$(azd env get-value AZURE_AI_CHAT_DEPLOYMENT_NAME)" >> $ENV_FILE_PATH
+echo "AZURE_AI_AGENT_DEPLOYMENT_NAME=$(azd env get-value AZURE_AI_AGENT_DEPLOYMENT_NAME)" >> $ENV_FILE_PATH
 echo "AZURE_AI_AGENT_ID=$(azd env get-value AZURE_AI_AGENT_ID)" >> $ENV_FILE_PATH
 echo "AZURE_TENANT_ID=$(azd env get-value AZURE_TENANT_ID)" >> $ENV_FILE_PATH
diff --git a/src/api/main.py b/src/api/main.py
@@ -117,7 +117,7 @@ async def lifespan(app: fastapi.FastAPI):
             toolset.add(file_search_tool)
 
             agent = await ai_client.agents.create_agent(
-                model=os.environ["AZURE_AI_CHAT_DEPLOYMENT_NAME"],
+                model=os.environ["AZURE_AI_AGENT_DEPLOYMENT_NAME"],
                 name="my-assistant", 
                 instructions="You are helpful assistant",
                 toolset=toolset
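
Because `src/api/main.py` now indexes `os.environ` with the new name, a local `src/.env` written before this change would raise a bare `KeyError` at startup. The sketch below shows one way a lookup could tolerate the old name during migration. The helper `get_agent_deployment_name` and the fallback behavior are assumptions for illustration and are not part of this commit.

```python
import os
from typing import Mapping, Optional

NEW_VAR = "AZURE_AI_AGENT_DEPLOYMENT_NAME"
OLD_VAR = "AZURE_AI_CHAT_DEPLOYMENT_NAME"  # name used before this commit


def get_agent_deployment_name(env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer the new variable, fall back to the old one, else fail with a clear message."""
    source = os.environ if env is None else env
    for var in (NEW_VAR, OLD_VAR):
        value = source.get(var)
        if value:  # skip unset and empty values
            return value
    raise RuntimeError(f"{NEW_VAR} is not set")


if __name__ == "__main__":
    # Example with an explicit mapping instead of the real process environment.
    print(get_agent_deployment_name({OLD_VAR: "gpt-4o-mini"}))
```

The result could then be passed as `model=` to `create_agent` in place of the direct `os.environ[...]` lookup, if such a transition period is wanted.
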
