diff --git a/ai-quick-actions/model-deployment-tips.md b/ai-quick-actions/model-deployment-tips.md
index 90547d3e..05a6b875 100644
--- a/ai-quick-actions/model-deployment-tips.md
+++ b/ai-quick-actions/model-deployment-tips.md
@@ -9,6 +9,8 @@ Table of Contents:
 - [Model Evaluation](evaluation-tips.md)
 - [Model Registration](register-tips.md)
 - [Multi Modal Inferencing](multimodal-models-tips.md)
+- [Multi Model Inferencing](multimodel-deployment-tips.md)
+- [Stacked Model Inferencing](stacked-deployment-tips.md)
 - [Private_Endpoints](model-deployment-private-endpoint-tips.md)
 - [Tool Calling](model-deployment-tool-calling-tips.md)
@@ -918,4 +920,4 @@ Table of Contents:
 - [Model Registration](register-tips.md)
 - [Multi Modal Inferencing](multimodal-models-tips.md)
 - [Private_Endpoints](model-deployment-private-endpoint-tips.md)
-- [Tool Calling](model-deployment-tool-calling-tips.md)
\ No newline at end of file
+- [Tool Calling](model-deployment-tool-calling-tips.md)
diff --git a/ai-quick-actions/multimodel-deployment-tips.md b/ai-quick-actions/multimodel-deployment-tips.md
index 183afc03..38e415ad 100644
--- a/ai-quick-actions/multimodel-deployment-tips.md
+++ b/ai-quick-actions/multimodel-deployment-tips.md
@@ -63,6 +63,8 @@ For fine-tuned models, requests specifying the base model name (ex. model: meta-
   - [CLI Output](#cli-output-3)
   - [Create Multi-Model (1 Embedding Model, 1 LLM) deployment with `/v1/completions`](#create-multi-model-1-embedding-model-1-llm-deployment-with-v1completions)
 - [Manage Multi-Model Deployments](#manage-multi-model-deployments)
+  - [List Multi-Model Deployments](#list-multi-model-deployments)
+  - [Edit Multi-Model Deployments](#edit-multi-model-deployments)
 - [Multi-Model Inferencing](#multi-model-inferencing)
   - [Using oci-cli](#using-oci-cli)
   - [Using Python SDK (without streaming)](#using-python-sdk-without-streaming)
@@ -101,16 +103,22 @@ Only Multi-Model Deployments with **base service LLM models (text-generation)**

 ### Select 'Deploy Multi Model'

 - Based on the 'models' field, a Compute Shape will be recommended to accommodate both models.
+- Select the 'Fine Tuned Weights'.
+  - Only fine-tuned models with version `V2` can be deployed as weights in a Multi-Model Deployment. To deploy an older fine-tuned model weight, run the following command to convert it to version `V2`, then use the new fine-tuned model when creating the deployment. By default this command deletes the old fine-tuned model after conversion; add ``--delete_model False`` to keep it instead.
+
+    ```bash
+    ads aqua model convert_fine_tune --model_id [FT_OCID]
+    ```
 - Select logging and endpoints (/v1/completions | /v1/chat/completions).
 - Submit form via 'Deploy Button' at bottom.

-![mmd-form](web_assets/deploy-mmd.png)
+![mmd-form](web_assets/deploy-multi.png)

 ### Inferencing with Multi-Model Deployment

 There are two ways to send inference requests to models within a Multi-Model Deployment

 1. Python SDK (recommended) - see [here](#Multi-Model-Inferencing)
-2. Using AQUA UI (see below, ok for testing)
+2. Using AQUA UI - see [here](#using-aqua-ui-interface-for-multi-model-deployment)

 Once the Deployment is Active, view the model deployment details and inferencing form by clicking on the 'Deployments' Tab and selecting the model within the Model Deployment list.

@@ -472,8 +480,13 @@ ads aqua deployment get_multimodel_deployment_config --model_ids '["ocid1.datasc

 ## 3. Create Multi-Model Deployment

-Only **base service LLM models** are supported for MultiModel Deployment.
-All selected models will run on the same **GPU shape**, sharing the available compute resources. Make sure to choose a shape that meets the needs of all models in your deployment using [MultiModel Configuration command](#get-multimodel-configuration)
+All selected models will run on the same **GPU shape**, sharing the available compute resources. Make sure to choose a shape that meets the needs of all models in your deployment using the [MultiModel Configuration command](#get-multimodel-configuration).
+
+Only fine-tuned models with version `V2` can be deployed as weights in a Multi-Model Deployment. To deploy an older fine-tuned model weight, run the following command to convert it to version `V2`, then use the new fine-tuned model OCID when creating the deployment. By default this command deletes the old fine-tuned model after conversion; add ``--delete_model False`` to keep it instead.
+
+```bash
+ads aqua model convert_fine_tune --model_id [FT_OCID]
+```

 ### Description

@@ -750,6 +763,144 @@ To list all AQUA deployments (both Multi-Model and single-model) within a specif

 Note: Multi-Model deployments are identified by the tag `"aqua_multimodel": "true",` associated with them.

+### Edit Multi-Model Deployments
+
+An AQUA deployment must be in the `ACTIVE` state to be updated, and only one of the following option groups can be updated at a time. There are two ways to update a model deployment: `ZDT` (zero-downtime) and `LIVE`. The default update type for an AQUA deployment is `ZDT`, but `LIVE` is adopted if `models` are changed in a multi-model deployment.
+
+  - `Name or description`: Change the name or description.
+  - `Default configuration`: Change or add freeform and defined tags.
+  - `Models`: Change the models.
+  - `Compute`: Change the number of OCPUs or the amount of memory in gigabytes.
+  - `Logging`: Change the logging configuration for access and predict logs.
+  - `Load Balancer`: Change the load balancing bandwidth.
+
+#### Usage
+
+```bash
+ads aqua deployment update [OPTIONS]
+```
+
+#### Required Parameters
+
+`--model_deployment_id [str]`
+
+The model deployment OCID to be updated.
+
+#### Optional Parameters
+
+`--models [str]`
+
+The string representation of a JSON array, where each object defines a model's OCID and the number of GPUs assigned to it. The GPU count should always be a **power of two (e.g., 1, 2, 4, 8)**.
+Example: `'[{"model_id":"", "gpu_count":1},{"model_id":"", "gpu_count":1}]'` for `VM.GPU.A10.2` shape.
+
+`--display_name [str]`
+
+The name of the model deployment.
+
+`--description [str]`
+
+The description of the model deployment. Defaults to None.
+
+`--instance_count [int]`
+
+The number of instances used for the model deployment. Defaults to 1.
+
+`--log_group_id [str]`
+
+The OCI logging group OCID. The access log and predict log share the same log group.
+
+`--access_log_id [str]`
+
+The access log OCID for the access logs. Check [model deployment logging](https://docs.oracle.com/en-us/iaas/data-science/using/model_dep_using_logging.htm) for more details.
+
+`--predict_log_id [str]`
+
+The predict log OCID for the predict logs. Check [model deployment logging](https://docs.oracle.com/en-us/iaas/data-science/using/model_dep_using_logging.htm) for more details.
+
+`--web_concurrency [int]`
+
+The number of worker processes/threads to handle incoming requests.
+
+`--bandwidth_mbps [int]`
+
+The bandwidth limit on the load balancer in Mbps.
+
+`--memory_in_gbs [float]`
+
+Memory (in GB) for the selected shape.
+
+`--ocpus [float]`
+
+OCPU count for the selected shape.
+
+`--freeform_tags [dict]`
+
+Freeform tags for the model deployment.
+
+`--defined_tags [dict]`
+
+Defined tags for the model deployment.
+
+#### Example
+
+##### Edit Multi-Model deployment with `/v1/completions`
+
+```bash
+ads aqua deployment update \
+  --model_deployment_id "ocid1.datasciencemodeldeployment.oc1.iad." \
+  --models '[{"model_id":"ocid1.datasciencemodel.oc1.iad.", "model_name":"test_updated_model_name", "gpu_count":2}]' \
+  --display_name "updated_modelDeployment_multmodel_model1_model2"
+```
+
+##### CLI Output
+
+```json
+{
+  "id": "ocid1.datasciencemodeldeployment.oc1.iad.",
+  "display_name": "updated_modelDeployment_multmodel_model1_model2",
+  "aqua_service_model": false,
+  "model_id": "ocid1.datasciencemodelgroup.oc1.iad.",
+  "models": [
+    {
+      "model_id": "ocid1.datasciencemodel.oc1.iad.",
+      "model_name": "mistralai/Mistral-7B-v0.1",
+      "gpu_count": 1,
+      "env_var": {}
+    },
+    {
+      "model_id": "ocid1.datasciencemodel.oc1.iad.",
+      "model_name": "tiiuae/falcon-7b",
+      "gpu_count": 1,
+      "env_var": {}
+    }
+  ],
+  "aqua_model_name": "",
+  "state": "UPDATING",
+  "description": null,
+  "created_on": "2025-03-10 19:09:40.793000+00:00",
+  "created_by": "ocid1.user.oc1..",
+  "endpoint": "https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "private_endpoint_id": null,
+  "console_link": "https://cloud.oracle.com/data-science/model-deployments/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "lifecycle_details": null,
+  "shape_info": {
+    "instance_shape": "VM.GPU.A10.2",
+    "instance_count": 1,
+    "ocpus": null,
+    "memory_in_gbs": null
+  },
+  "tags": {
+    "aqua_model_id": "ocid1.datasciencemodelgroup.oc1.",
+    "aqua_multimodel": "true",
+    "OCI_AQUA": "active"
+  },
+  "environment_variables": {
+    "MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/chat/completions",
+    "MODEL_DEPLOY_ENABLE_STREAMING": "true"
+  }
+}
+```

 # Multi-Model Inferencing

 The only change required to infer a specific model from a Multi-Model deployment is to update the value of the `"model"` parameter in the request payload. The values for this parameter can be found in the Model Deployment details, under the field name `"model_name"`. This parameter segregates the request flow, ensuring that the inference request is directed to the correct model within the MultiModel deployment.
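+To make this concrete, below is a minimal sketch using the `oci raw-request` pattern from the inferencing sections that follow. It assumes the two-model deployment from the CLI output above (model names `mistralai/Mistral-7B-v0.1` and `tiiuae/falcon-7b`); the deployment endpoint is a placeholder, `--auth security_token` assumes session-token authentication, and only the `"model"` value differs between the two requests.
+
+```bash
+# Query the first model in the Multi-Model deployment
+oci raw-request \
+  --http-method POST \
+  --target-uri <deployment-endpoint>/predict \
+  --request-body '{
+    "model": "mistralai/Mistral-7B-v0.1",
+    "prompt": "what are activation functions?",
+    "max_tokens": 250
+  }' \
+  --auth security_token
+
+# Same deployment, same payload; only the "model" value changes
+oci raw-request \
+  --http-method POST \
+  --target-uri <deployment-endpoint>/predict \
+  --request-body '{
+    "model": "tiiuae/falcon-7b",
+    "prompt": "what are activation functions?",
+    "max_tokens": 250
+  }' \
+  --auth security_token
+```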
diff --git a/ai-quick-actions/stacked-deployment-tips.md b/ai-quick-actions/stacked-deployment-tips.md
new file mode 100644
index 00000000..0e13ccee
--- /dev/null
+++ b/ai-quick-actions/stacked-deployment-tips.md
@@ -0,0 +1,837 @@
+# **AI Quick Actions Stacked Deployment**
+
+# Table of Contents
+- [Introduction to Stacked Deployment and Serving](#introduction-to-stacked-deployment-and-serving)
+- [Models](#models)
+  - [Fine Tuned Models](#fine-tuned-models)
+- [Stacked Deployment](#stacked-deployment)
+  - [Create Stacked Deployment via AQUA UI](#create-stacked-deployment-via-aqua-ui)
+  - [Create Stacked Deployment via ADS CLI](#create-stacked-deployment-via-ads-cli)
+  - [Manage Stacked Deployments](#manage-stacked-deployments)
+    - [List Stacked Deployments](#list-stacked-deployments)
+    - [Edit Stacked Deployments](#edit-stacked-deployments)
+- [Stacked Model Inferencing](#stacked-model-inferencing)
+- [Stacked Model Evaluation](#stacked-model-evaluation)
+  - [Create Model Evaluations](#create-model-evaluations)
+
+# Introduction to Stacked Deployment and Serving
+
+Stacked Model Deployment enables deploying a base model alongside multiple fine-tuned weights within the same deployment. During inference, responses can be generated using either the base model or the associated fine-tuned weights, depending on the request. The Data Science service provides a prebuilt **vLLM service container** that makes deploying and serving stacked large language models easy, simplifying the deployment process and reducing operational complexity. This container comes with **vLLM's native routing**, which routes requests to the appropriate model, ensuring seamless prediction.
+
+This document describes how to create stacked deployments using AI Quick Actions (AQUA) model deployments, and how to evaluate the models.
+
+# Models
+
+The first step in the process is to get the OCIDs of the desired base service LLM AQUA models, which are required to initiate the stacked deployment process. Refer to [AQUA CLI tips](cli-tips.md) for detailed instructions on how to obtain the OCIDs of base service LLM AQUA models.
+
+You can also obtain the OCID from the AQUA user interface by clicking on the model card and selecting the `Copy OCID` button from the `More Options` dropdown in the top-right corner of the screen.
+
+## Fine Tuned Models
+
+Only fine-tuned models with version `V2` can be deployed as weights in a Stacked Deployment. To deploy an older fine-tuned model weight, run the following command to convert it to version `V2`, then use the new fine-tuned model OCID when creating the deployment. By default this command deletes the old fine-tuned model after conversion; add ``--delete_model False`` to keep it instead.
+
+```bash
+ads aqua model convert_fine_tune --model_id [FT_OCID]
+```
+
+If a `V2` fine-tuned model is deployed as a single-model deployment, AQUA will fetch its base model, attach the fine-tuned model as a weight, and deploy them as a stacked deployment instead.
+
+# Stacked Deployment
+
+## Create Stacked Deployment via AQUA UI
+
+### Create Stack Deployment
+
+Open the AQUA UI and navigate to the `Deployments` tab. Click `Create Deployment` in the upper right and you should see the following page. Select `Deploy Model Stack`, then select the service model and its corresponding fine-tuned weights. You can customize the inference keys for each service and fine-tuned model.
+
+![Deploy Model](web_assets/deploy-stack.png)
+
+### Compute Shape
+
+The compute shape selection is critical; the available list is filtered to shapes suitable for the chosen model.
+
+- VM.GPU.A10.1 has 24GB of GPU memory and 240GB of CPU memory. The limiting factor is usually the GPU memory, which needs to be big enough to hold the model.
+- VM.GPU.A10.2 has 48GB of GPU memory.
+- BM.GPU.A10.4 has 96GB of GPU memory and runs on a bare metal machine, rather than a VM.
+
+For a full list of shapes and their definitions, see the [compute shape docs](https://docs.oracle.com/en-us/iaas/Content/Compute/References/computeshapes.htm).
+
+The relationship between model parameter size and GPU memory is roughly 2x the parameter count in GB (two bytes per parameter at 16-bit precision), so for example a model with 7B parameters will need a minimum of 14 GB for inference. At runtime the memory is used both to hold the weights and to hold the concurrent contexts for users' requests.
+
+### Advanced Options
+
+You may click "Show Advanced Options" to configure options for the inference container.
+
+![Advanced Options](web_assets/deploy-stack-model-advanced-options.png)
+
+### Inference Container Configuration
+
+The service allows the model deployment configuration to be overridden when creating a model deployment. Depending on the type of inference container used for deployment, i.e. vLLM or TGI, the parameters vary and need to be passed in the format `(--param-name, param-value)`.
+
+For more details, please visit the [vLLM](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#command-line-arguments-for-the-server) documentation to learn more about the parameters accepted by the respective containers.
+
+## Create Stacked Deployment via ADS CLI
+
+### Description
+
+You'll need the latest version of ADS to create a new AQUA stacked deployment. Installation instructions are available [here](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/cli/quickstart.html).
+
+### Usage
+
+```bash
+ads aqua deployment create [OPTIONS]
+```
+
+### Required Parameters
+
+`--models [str]`
+
+The string representation of a JSON array, where each object defines a model OCID, a model name, and its associated fine-tuned weights. The model names are used to reference specific models during inference requests and support a [maximum length of 32 characters](https://docs.oracle.com/en-us/iaas/Content/data-science/using/models-mms-top.htm#models-mms-key-concepts). The model OCID is used for inferencing if no model name is provided. Only **one** base model is allowed when creating a stacked deployment.
+Example: `'[{"model_id":"", "model_name":"", "fine_tune_weights": [{"model_id": "", "model_name":""},{"model_id":"", "model_name": ""}]}]'` for `VM.GPU.A10.2` shape.
+
+`--instance_shape [str]`
+
+The shape (GPU) of the instance used for the model deployment.
+Example: `VM.GPU.A10.2, BM.GPU.A10.4, BM.GPU4.8, BM.GPU.A100-v2.8`.
+
+`--display_name [str]`
+
+The name of the model deployment.
+
+`--container_image_uri [str]`
+
+The URI of the inference container associated with the model being deployed. For stacked deployments, the value is a vLLM container URI.
+Example: `dsmc://odsc-vllm-serving:0.6.4.post1.2` or `dsmc://odsc-vllm-serving:0.8.1.2`
+
+`--deployment_type [str]`
+
+The deployment type for creating the model deployment. For stacked deployments, the value must be `STACKED`. Failing to provide `--deployment_type` will result in creating a multi-model deployment instead.
+
+### Optional Parameters
+
+`--compartment_id [str]`
+
+The compartment OCID where the model deployment is to be created. If not provided, it defaults to the user's compartment.
+
+`--project_id [str]`
+
+The project OCID where the model deployment is to be created. If not provided, it defaults to the user's project.
+
+`--description [str]`
+
+The description of the model deployment. Defaults to None.
+
+`--instance_count [int]`
+
+The number of instances used for the model deployment. Defaults to 1.
+
+`--log_group_id [str]`
+
+The OCI logging group OCID. The access log and predict log share the same log group.
+
+`--access_log_id [str]`
+
+The access log OCID for the access logs. Check [model deployment logging](https://docs.oracle.com/en-us/iaas/data-science/using/model_dep_using_logging.htm) for more details.
+
+`--predict_log_id [str]`
+
+The predict log OCID for the predict logs. Check [model deployment logging](https://docs.oracle.com/en-us/iaas/data-science/using/model_dep_using_logging.htm) for more details.
+
+`--web_concurrency [int]`
+
+The number of worker processes/threads to handle incoming requests.
+
+`--server_port [int]`
+
+The server port for the Docker container image. Defaults to 8080.
+
+`--health_check_port [int]`
+
+The health check port for the Docker container image. Defaults to 8080.
+
+`--env_var [dict]`
+
+Environment variables for the model deployment. Defaults to None.
+
+`--private_endpoint_id [str]`
+
+The private endpoint OCID of the model deployment.
+
+### Example
+
+#### Create Stacked deployment with `/v1/completions`
+
+```bash
+ads aqua deployment create \
+  --container_image_uri "dsmc://odsc-vllm-serving:0.6.4.post1.2" \
+  --models '[{"model_id":"ocid1.datasciencemodel.oc1.iad.", "model_name":"test_model_name", "fine_tune_weights": [{"model_id": "ocid1.datasciencemodel.oc1.iad.", "model_name":"test_ft_name_one"},{"model_id":"ocid1.datasciencemodel.oc1.iad.", "model_name": "test_ft_name_two"}]}]' \
+  --instance_shape "VM.GPU.A10.1" \
+  --display_name "modelDeployment_stacked_model" \
+  --deployment_type "STACKED"
+```
+
+##### CLI Output
+
+```json
+{
+  "id": "ocid1.datasciencemodeldeployment.oc1.iad.",
+  "display_name": "modelDeployment_stacked_model",
+  "aqua_service_model": false,
+  "model_id": "ocid1.datasciencemodelgroup.oc1.iad.",
+  "models": [],
+  "aqua_model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct",
+  "state": "CREATING",
+  "description": null,
+  "created_on": "2025-10-13 17:48:53.416000+00:00",
+  "created_by": "ocid1.user.oc1..",
+  "endpoint": "https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "private_endpoint_id": null,
+  "console_link": "https://cloud.oracle.com/data-science/model-deployments/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "lifecycle_details": null,
+  "shape_info": {
+    "instance_shape": "VM.GPU.A10.1",
+    "instance_count": 1,
+    "ocpus": null,
+    "memory_in_gbs": null
+  },
+  "tags": {
+    "task": "text_generation",
+    "aqua_model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct",
+    "OCI_AQUA": "active"
+  },
+  "environment_variables": {
+    "BASE_MODEL": "service_models/Meta-Llama-3.1-8B-Instruct/5206a32/artifact",
+    "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true",
+    "MODEL": "/opt/ds/model/deployed_model/ocid1.datasciencemodel.oc1.iad./",
+    "PARAMS": "--served-model-name test_model_name --disable-custom-all-reduce --seed 42 --max-model-len 4096 --max-lora-rank 32 --enable_lora",
+    "MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/completions",
+    "MODEL_DEPLOY_ENABLE_STREAMING": "true",
+    "PORT": "8080",
+    "HEALTH_CHECK_PORT": "8080",
+    "AQUA_TELEMETRY_BUCKET_NS": "ociodscdev",
+    "AQUA_TELEMETRY_BUCKET": "service-managed-models"
+  },
+  "cmd": []
+}
+```
+
+#### Create Stacked deployment with `/v1/chat/completions`
+
+```bash
+ads aqua deployment create \
+  --container_image_uri "dsmc://odsc-vllm-serving:0.6.4.post1.2" \
+  --models '[{"model_id":"ocid1.datasciencemodel.oc1.iad.", "model_name":"test_model_name", "fine_tune_weights": [{"model_id": "ocid1.datasciencemodel.oc1.iad.", "model_name":"test_ft_name_one"},{"model_id":"ocid1.datasciencemodel.oc1.iad.", "model_name": "test_ft_name_two"}]}]' \
+  --env-var '{"MODEL_DEPLOY_PREDICT_ENDPOINT":"/v1/chat/completions"}' \
+  --instance_shape "VM.GPU.A10.1" \
+  --display_name "modelDeployment_stacked_model" \
+  --deployment_type "STACKED"
+```
+
+##### CLI Output
+
+```json
+{
+  "id": "ocid1.datasciencemodeldeployment.oc1.iad.",
+  "display_name": "modelDeployment_stacked_model",
+  "aqua_service_model": false,
+  "model_id": "ocid1.datasciencemodelgroup.oc1.iad.",
+  "models": [],
+  "aqua_model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct",
+  "state": "CREATING",
+  "description": null,
+  "created_on": "2025-10-13 17:48:53.416000+00:00",
+  "created_by": "ocid1.user.oc1..",
+  "endpoint": "https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "private_endpoint_id": null,
"https://cloud.oracle.com/data-science/model-deployments/ocid1.datasciencemodeldeployment.oc1.iad.", + "lifecycle_details": null, + "shape_info": { + "instance_shape": "VM.GPU.A10.1", + "instance_count": 1, + "ocpus": null, + "memory_in_gbs": null + }, + "tags": { + "task": "text_generation", + "aqua_model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct", + "OCI_AQUA": "active" + }, + "environment_variables": { + "BASE_MODEL": "service_models/Meta-Llama-3.1-8B-Instruct/5206a32/artifact", + "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true", + "MODEL": "/opt/ds/model/deployed_model/ocid1.datasciencemodel.oc1.iad./", + "PARAMS": "--served-model-name test_model_name --disable-custom-all-reduce --seed 42 --max-model-len 4096 --max-lora-rank 32 --enable_lora", + "MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/chat/completions", + "MODEL_DEPLOY_ENABLE_STREAMING": "true", + "PORT": "8080", + "HEALTH_CHECK_PORT": "8080", + "AQUA_TELEMETRY_BUCKET_NS": "ociodscdev", + "AQUA_TELEMETRY_BUCKET": "service-managed-models" + }, + "cmd": [] +} +``` + +## Manage Stacked Deployments + +### List Stacked Deployments + +To list all AQUA deployments (all Stacked, MultiModel and single-model) within a specified compartment or project, or to get detailed information on a specific Stacked deployment, kindly refer to the [AQUA CLI tips](cli-tips.md) documentation. + +Note: Stacked deployments are identified by the tag `"aqua_stacked_model": "true",` associated with them. + +### Edit Stacked Deployments + +AQUA deployment must be in `ACTIVE` state to be updated and can only be updated one at a time for the following option groups. There are two ways to update model deployment: `ZDT` and `LIVE`. The default update type for AQUA deployment is `ZDT` but `LIVE` will be adopted if `models` are changed in stacked deployment. + + - `Name or description`: Change the name or description. + - `Default configuration`: Change or add freeform and defined tags. + - `Models`: Change the model. + - `Compute`: Change the number of CPUs or amount of memory for each CPU in gigabytes. + - `Logging`: Change the logging configuration for access and predict logs. + - `Load Balancer`: Change the load balancing bandwidth. + +#### Usage + +```bash +ads aqua deployment update [OPTIONS] +``` + +#### Required Parameters + +`--model_deployment_id [str]` + +The model deployment OCID to be updated. + +#### Optional Parameters + +`--models [str]` + +The String representation of a JSON array, where each object defines a model OCID, model name and its associating fine tuned weights. The model names are used to reference specific models during inference requests and support a [maximum length of 32 characters](https://docs.oracle.com/en-us/iaas/Content/data-science/using/models-mms-top.htm#models-mms-key-concepts). Only **one** base model is allowed for updating stacked deployment
+Example: `'[{"model_id":"", "model_name":"", "fine_tune_weights": [{"model_id": "", "model_name":""},{"model_id":"", "model_name": ""}]}]'` for `VM.GPU.A10.2` shape.
+
+`--display_name [str]`
+
+The name of the model deployment.
+
+`--description [str]`
+
+The description of the model deployment. Defaults to None.
+
+`--instance_count [int]`
+
+The number of instances used for the model deployment. Defaults to 1.
+
+`--log_group_id [str]`
+
+The OCI logging group OCID. The access log and predict log share the same log group.
+
+`--access_log_id [str]`
+
+The access log OCID for the access logs. Check [model deployment logging](https://docs.oracle.com/en-us/iaas/data-science/using/model_dep_using_logging.htm) for more details.
+
+`--predict_log_id [str]`
+
+The predict log OCID for the predict logs. Check [model deployment logging](https://docs.oracle.com/en-us/iaas/data-science/using/model_dep_using_logging.htm) for more details.
+
+`--web_concurrency [int]`
+
+The number of worker processes/threads to handle incoming requests.
+
+`--bandwidth_mbps [int]`
+
+The bandwidth limit on the load balancer in Mbps.
+
+`--memory_in_gbs [float]`
+
+Memory (in GB) for the selected shape.
+
+`--ocpus [float]`
+
+OCPU count for the selected shape.
+
+`--freeform_tags [dict]`
+
+Freeform tags for the model deployment.
+
+`--defined_tags [dict]`
+
+Defined tags for the model deployment.
+
+#### Example
+
+##### Edit Stacked deployment with `/v1/completions`
+
+```bash
+ads aqua deployment update \
+  --model_deployment_id "ocid1.datasciencemodeldeployment.oc1.iad." \
+  --models '[{"model_id":"ocid1.datasciencemodel.oc1.iad.", "model_name":"test_updated_model_name"}]' \
+  --display_name "updated_modelDeployment_stacked_model"
+```
+
+##### CLI Output
+
+```json
+{
+  "id": "ocid1.datasciencemodeldeployment.oc1.iad.",
+  "display_name": "updated_modelDeployment_stacked_model",
+  "aqua_service_model": false,
+  "model_id": "ocid1.datasciencemodelgroup.oc1.iad.",
+  "models": [],
+  "aqua_model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct",
+  "state": "UPDATING",
+  "description": null,
+  "created_on": "2025-10-13 17:48:53.416000+00:00",
+  "created_by": "ocid1.user.oc1..",
+  "endpoint": "https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "private_endpoint_id": null,
+  "console_link": "https://cloud.oracle.com/data-science/model-deployments/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "lifecycle_details": null,
+  "shape_info": {
+    "instance_shape": "VM.GPU.A10.1",
+    "instance_count": 1,
+    "ocpus": null,
+    "memory_in_gbs": null
+  },
+  "tags": {
+    "task": "text_generation",
+    "aqua_model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct",
+    "OCI_AQUA": "active"
+  },
+  "environment_variables": {
+    "BASE_MODEL": "service_models/Meta-Llama-3.1-8B-Instruct/5206a32/artifact",
+    "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true",
+    "MODEL": "/opt/ds/model/deployed_model/ocid1.datasciencemodel.oc1.iad./",
+    "PARAMS": "--served-model-name test_updated_model_name --disable-custom-all-reduce --seed 42 --max-model-len 4096 --max-lora-rank 32 --enable_lora",
+    "MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/completions",
+    "MODEL_DEPLOY_ENABLE_STREAMING": "true",
+    "PORT": "8080",
+    "HEALTH_CHECK_PORT": "8080",
+    "AQUA_TELEMETRY_BUCKET_NS": "ociodscdev",
+    "AQUA_TELEMETRY_BUCKET": "service-managed-models"
+  },
+  "cmd": []
+}
+```
+
+# Stacked Model Inferencing
+
+The only change required to infer a specific model from a Stacked deployment is to update the value of the `"model"` parameter in the request payload. The values for this parameter can be found in the Model Deployment details, under the field name `"model_name"`. This parameter segregates the request flow, ensuring that the inference request is directed to the correct model within the Stacked deployment.
+
+## Using AQUA UI
+
+![Inferencing](web_assets/try-stack-model.png)
+
+## Using oci-cli
+
+```bash
+oci raw-request \
+  --http-method POST \
+  --target-uri /predict \
+  --request-body '{
+    "model": "",
+    "prompt": "what are activation functions?",
+    "max_tokens": 250,
+    "temperature": 0.7,
+    "top_p": 0.8
+  }' \
+  --auth 
+```
+
+Note: Currently `oci-cli` does not support streaming responses; use the Python or Java SDK instead.
+
+## Using Python SDK (without streaming)
+
+```python
+# The OCI SDK must be installed for this example to function properly.
+# Installation instructions can be found here: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/pythonsdk.htm
+
+import requests
+import oci
+from oci.signer import Signer
+from oci.config import from_file
+
+config = from_file('~/.oci/config')
+auth = Signer(
+    tenancy=config['tenancy'],
+    user=config['user'],
+    fingerprint=config['fingerprint'],
+    private_key_file_location=config['key_file'],
+    pass_phrase=config['pass_phrase']
+)
+
+# For security token based authentication
+# token_file = config['security_token_file']
+# token = None
+# with open(token_file, 'r') as f:
+#     token = f.read()
+# private_key = oci.signer.load_private_key_from_file(config['key_file'])
+# auth = oci.auth.signers.SecurityTokenSigner(token, private_key)
+
+model = ""
+
+endpoint = "https://modeldeployment.us-ashburn-1.oci.oc-test.com/ocid1.datasciencemodeldeployment.oc1.iad.xxxxxxxxx/predict"
+body = {
+    "model": model,  # set to the model_name of the target model
+    "prompt": "what are activation functions?",
+    "max_tokens": 250,
+    "temperature": 0.7,
+    "top_p": 0.8,
+}
+
+res = requests.post(endpoint, json=body, auth=auth, headers={}).json()
+
+print(res)
+```
+
+## Using Python SDK (with streaming)
+
+To consume streaming Server-sent Events (SSE), install [sseclient-py](https://pypi.org/project/sseclient-py/) using `pip install sseclient-py`.
+
+```python
+# The OCI SDK must be installed for this example to function properly.
+# Installation instructions can be found here: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/pythonsdk.htm
+
+import requests
+import oci
+from oci.signer import Signer
+from oci.config import from_file
+import sseclient  # pip install sseclient-py
+
+config = from_file('~/.oci/config')
+auth = Signer(
+    tenancy=config['tenancy'],
+    user=config['user'],
+    fingerprint=config['fingerprint'],
+    private_key_file_location=config['key_file'],
+    pass_phrase=config['pass_phrase']
+)
+
+# For security token based authentication
+# token_file = config['security_token_file']
+# token = None
+# with open(token_file, 'r') as f:
+#     token = f.read()
+# private_key = oci.signer.load_private_key_from_file(config['key_file'])
+# auth = oci.auth.signers.SecurityTokenSigner(token, private_key)
+
+model = ""
+
+endpoint = "https://modeldeployment.us-ashburn-1.oci.oc-test.com/ocid1.datasciencemodeldeployment.oc1.iad.xxxxxxxxx/predict"
+body = {
+    "model": model,  # set to the model_name of the target model
+    "prompt": "what are activation functions?",
+    "max_tokens": 250,
+    "temperature": 0.7,
+    "top_p": 0.8,
+    "stream": True,
+}
+
+headers = {'Content-Type': 'application/json', 'enable-streaming': 'true', 'Accept': 'text/event-stream'}
+response = requests.post(endpoint, json=body, auth=auth, stream=True, headers=headers)
+
+print(response.headers)
+
+client = sseclient.SSEClient(response)
+for event in client.events():
+    print(event.data)
+
+# Alternatively, we can use the below code to print the response.
+# for line in response.iter_lines():
+#     if line:
+#         print(line)
+```
+
+## Using Python SDK for /v1/chat/completions endpoint
+
+To access a model deployed with the `/v1/chat/completions` endpoint for inference, update the body and replace the `prompt` field with `messages`.
+
+```python
+...
+body = {
+    "model": "",  # set to the model_name of the target model
+    "messages": [{"role": "user", "content": [{"type": "text", "text": "Who wrote the book Harry Potter?"}]}],
+    "max_tokens": 250,
+    "temperature": 0.7,
+    "top_p": 0.8,
+}
+...
+```
+
+## Using Java (with streaming)
+
+```java
+/**
+ * The OCI SDK must be installed for this example to function properly.
+ * Installation instructions can be found here: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/javasdk.htm
+ */
+package org.example;
+
+import com.oracle.bmc.auth.AuthenticationDetailsProvider;
+import com.oracle.bmc.auth.SessionTokenAuthenticationDetailsProvider;
+import com.oracle.bmc.http.ClientConfigurator;
+import com.oracle.bmc.http.Priorities;
+import com.oracle.bmc.http.client.HttpClient;
+import com.oracle.bmc.http.client.HttpClientBuilder;
+import com.oracle.bmc.http.client.HttpRequest;
+import com.oracle.bmc.http.client.HttpResponse;
+import com.oracle.bmc.http.client.Method;
+import com.oracle.bmc.http.client.jersey.JerseyHttpProvider;
+import com.oracle.bmc.http.client.jersey.sse.SseSupport;
+import com.oracle.bmc.http.internal.ParamEncoder;
+import com.oracle.bmc.http.signing.RequestSigningFilter;
+
+import javax.ws.rs.core.MediaType;
+import java.io.BufferedReader;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.net.URI;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+
+public class RestExample {
+
+    public static void main(String[] args) throws Exception {
+        String configurationFilePath = "~/.oci/config";
+        String profile = "DEFAULT";
+
+        // Pre-Requirement: Allow setting of restricted headers. This is required to allow the SigningFilter
+        // to set the host header that gets computed during signing of the request.
+        System.setProperty("sun.net.http.allowRestrictedHeaders", "true");
+
+        final AuthenticationDetailsProvider provider =
+                new SessionTokenAuthenticationDetailsProvider(configurationFilePath, profile);
+
+        // 1) Create a request signing filter instance using SessionTokenAuth Provider.
+        RequestSigningFilter requestSigningFilter = RequestSigningFilter.fromAuthProvider(provider);
+
+        // 1) Alternatively, RequestSigningFilter can be created from a config file.
+        // RequestSigningFilter requestSigningFilter = RequestSigningFilter.fromConfigFile(configurationFilePath, profile);
+
+        // 2) Create a Jersey client and register the request signing filter.
+        // Refer to this page https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/javasdkexamples.htm for information regarding the compatibility of the HTTP client(s) with OCI SDK version.
+
+        HttpClientBuilder builder = JerseyHttpProvider.getInstance()
+                .newBuilder()
+                .registerRequestInterceptor(Priorities.AUTHENTICATION, requestSigningFilter)
+                .baseUri(
+                        URI.create(
+                                "${modelDeployment.modelDeploymentUrl}/"
+                                        + ParamEncoder.encodePathParam("predict")));
+        // 3) Create a request and set the expected type header.
+
+        String jsonPayload = "{}"; // Add your payload here, following the request body examples shown above.
+
+        // 4) Setup Streaming request
+        Function<InputStream, List<String>> generateTextResultReader = getInputStreamListFunction();
+        SseSupport sseSupport = new SseSupport(generateTextResultReader);
+        ClientConfigurator clientConfigurator = sseSupport.getClientConfigurator();
+        clientConfigurator.customizeClient(builder);
+
+        try (HttpClient client = builder.build()) {
+            HttpRequest request = client
+                    .createRequest(Method.POST)
+                    .header("accepts", MediaType.APPLICATION_JSON)
+                    .header("content-type", MediaType.APPLICATION_JSON)
+                    .header("enable-streaming", "true")
+                    .body(jsonPayload);
+
+            // 5) Invoke the call and get the response.
+            HttpResponse response = request.execute().toCompletableFuture().get();
+
+            // 6) Print the response headers and body
+            Map<String, List<String>> responseHeaders = response.headers();
+            System.out.println("HTTP Headers " + responseHeaders);
+
+            InputStream responseBody = response.streamBody().toCompletableFuture().get();
+            try (
+                    final BufferedReader reader = new BufferedReader(
+                            new InputStreamReader(responseBody, StandardCharsets.UTF_8)
+                    )
+            ) {
+                String line;
+                while ((line = reader.readLine()) != null) {
+                    System.out.println(line);
+                }
+            }
+        } catch (Exception ex) {
+            throw ex;
+        }
+    }
+
+    private static Function<InputStream, List<String>> getInputStreamListFunction() {
+        Function<InputStream, List<String>> generateTextResultReader = entityStream -> {
+            try (BufferedReader reader =
+                         new BufferedReader(new InputStreamReader(entityStream))) {
+                String line;
+                List<String> generatedTextList = new ArrayList<>();
+                while ((line = reader.readLine()) != null) {
+                    if (line.isEmpty() || line.startsWith(":")) {
+                        continue;
+                    }
+                    generatedTextList.add(line);
+                }
+                return generatedTextList;
+            } catch (Exception ex) {
+                throw new RuntimeException(ex);
+            }
+        };
+        return generateTextResultReader;
+    }
+}
+```
+
+# Stacked Model Evaluation
+
+## Create Model Evaluations
+
+### Description
+
+Creates a new evaluation model using an existing AQUA stacked deployment. For a stacked deployment, evaluations must be created separately for each model, using the same model deployment OCID.
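+For example, to evaluate both the base model and one of its fine-tuned weights from the same stacked deployment, submit two create calls that differ only in the `"model"` value inside `--model_parameters`. This is a minimal sketch assuming the model names from the create examples above (`test_model_name`, `test_ft_name_one`); OCIDs and object storage paths are elided as in the full example below, where the complete parameter reference also follows.
+
+```bash
+# Evaluate the base model in the stacked deployment
+ads aqua evaluation create \
+  --evaluation_source_id "ocid1.datasciencemodeldeployment.oc1.iad." \
+  --evaluation_name "test_evaluation_base" \
+  --dataset_path "oci://@/path/to/the/dataset.jsonl" \
+  --report_path "oci://@/report/path/" \
+  --model_parameters '{"model":"test_model_name","max_tokens": 500}' \
+  --shape_name "VM.Standard.E4.Flex" \
+  --block_storage_size 50
+
+# Evaluate a fine-tuned weight: same deployment OCID, different "model" value
+ads aqua evaluation create \
+  --evaluation_source_id "ocid1.datasciencemodeldeployment.oc1.iad." \
+  --evaluation_name "test_evaluation_ft_one" \
+  --dataset_path "oci://@/path/to/the/dataset.jsonl" \
+  --report_path "oci://@/report/path/" \
+  --model_parameters '{"model":"test_ft_name_one","max_tokens": 500}' \
+  --shape_name "VM.Standard.E4.Flex" \
+  --block_storage_size 50
+```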
+
+### Usage
+
+```bash
+ads aqua evaluation create [OPTIONS]
+```
+
+### Required Parameters
+
+`--evaluation_source_id [str]`
+
+The evaluation source OCID. Must be a stacked deployment OCID.
+
+`--evaluation_name [str]`
+
+The name for the evaluation.
+
+`--dataset_path [str]`
+
+The dataset path for the evaluation. Must be an object storage path.
+Example: `oci://@/path/to/the/dataset.jsonl`
+
+`--report_path [str]`
+
+The report path for the evaluation. Must be an object storage path.
+Example: `oci://@/report/path/`
+
+`--model_parameters [str]`
+
+The parameters for the evaluation. For a stacked deployment, `"model"` is a required evaluation parameter.
+
+`--shape_name [str]`
+
+The shape name for the evaluation job infrastructure.
+Example: `VM.Standard.E3.Flex, VM.Standard.E4.Flex, VM.Standard3.Flex, VM.Optimized3.Flex`.
+
+`--block_storage_size [int]`
+
+The block storage size for the evaluation job infrastructure.
+
+### Optional Parameters
+
+`--compartment_id [str]`
+
+The compartment OCID where the evaluation is to be created. If not provided, it defaults to the user's compartment.
+
+`--project_id [str]`
+
+The project OCID where the evaluation is to be created. If not provided, it defaults to the user's project.
+
+`--evaluation_description [str]`
+
+The description of the evaluation. Defaults to None.
+
+`--memory_in_gbs [float]`
+
+The memory in GBs for the selected flexible shape.
+
+`--ocpus [float]`
+
+The OCPU count for the selected shape.
+
+`--experiment_id [str]`
+
+The evaluation model version set OCID. If provided, the evaluation model will be associated with it. Defaults to None.
+
+`--experiment_name [str]`
+
+The evaluation model version set name. If provided, the model version set with the same name will be used if it exists; otherwise, a new model version set will be created with that name.
+
+`--experiment_description [str]`
+
+The description for the evaluation model version set.
+
+`--log_group_id [str]`
+
+The log group OCID for the evaluation job infrastructure. Defaults to None.
+
+`--log_id [str]`
+
+The log OCID for the evaluation job infrastructure. Defaults to None.
+
+`--metrics [list]`
+
+The metrics for the evaluation; currently BERTScore and ROUGE are supported.
+Example: `'[{"name": "bertscore", "args": {}}, {"name": "rouge", "args": {}}]'`
+
+`--force_overwrite [bool]`
+
+A flag to indicate whether to force overwrite the existing evaluation file in object storage if already present. Defaults to `False`.
+
+### Example
+
+```bash
+ads aqua evaluation create \
+  --evaluation_source_id "ocid1.datasciencemodeldeployment.oc1.iad." \
+  --evaluation_name "test_evaluation" \
+  --dataset_path "oci://@/path/to/the/dataset.jsonl" \
+  --report_path "oci://@/report/path/" \
+  --model_parameters '{"model":"","max_tokens": 500, "temperature": 0.7, "top_p": 1.0, "top_k": 50}' \
+  --shape_name "VM.Standard.E4.Flex" \
+  --block_storage_size 50 \
+  --metrics '[{"name": "bertscore", "args": {}}, {"name": "rouge", "args": {}}]'
+```
+
+#### CLI Output
+
+```json
+{
+  "id": "ocid1.datasciencemodeldeployment.oc1.iad.",
+  "name": "test_evaluation",
+  "aqua_service_model": true,
+  "state": "CREATING",
+  "description": null,
+  "created_on": "2024-02-03 21:21:31.952000+00:00",
+  "created_by": "ocid1.user.oc1..",
+  "endpoint": "https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.",
+  "console_link": "https://cloud.oracle.com/data-science/model-deployments/ocid1.datasciencemodeldeployment.oc1.iad.?region=us-ashburn-1",
+  "shape_info": {
+    "instance_shape": "VM.Standard.E4.Flex",
+    "instance_count": 1,
+    "ocpus": 1.0,
+    "memory_in_gbs": 16.0
+  },
+  "tags": {
+    "aqua_service_model": "ocid1.datasciencemodel.oc1.iad.#Mistral-7B-v0.1",
+    "OCI_AQUA": ""
+  }
+}
+```
+
+For other operations related to **Evaluation**, such as listing evaluations and retrieving evaluation details, please refer to [AQUA CLI tips](cli-tips.md).
diff --git a/ai-quick-actions/web_assets/deploy-multi-model-advanced-options.png b/ai-quick-actions/web_assets/deploy-multi-model-advanced-options.png
new file mode 100644
index 00000000..e7217ea7
Binary files /dev/null and b/ai-quick-actions/web_assets/deploy-multi-model-advanced-options.png differ
diff --git a/ai-quick-actions/web_assets/deploy-multi.png b/ai-quick-actions/web_assets/deploy-multi.png
new file mode 100644
index 00000000..3268dfd9
Binary files /dev/null and b/ai-quick-actions/web_assets/deploy-multi.png differ
diff --git a/ai-quick-actions/web_assets/deploy-stack-model-advanced-options.png b/ai-quick-actions/web_assets/deploy-stack-model-advanced-options.png
new file mode 100644
index 00000000..083bfab0
Binary files /dev/null and b/ai-quick-actions/web_assets/deploy-stack-model-advanced-options.png differ
diff --git a/ai-quick-actions/web_assets/deploy-stack.png b/ai-quick-actions/web_assets/deploy-stack.png
new file mode 100644
index 00000000..25093133
Binary files /dev/null and b/ai-quick-actions/web_assets/deploy-stack.png differ
diff --git a/ai-quick-actions/web_assets/try-multi-model.png b/ai-quick-actions/web_assets/try-multi-model.png
new file mode 100644
index 00000000..dd36f231
Binary files /dev/null and b/ai-quick-actions/web_assets/try-multi-model.png differ
diff --git a/ai-quick-actions/web_assets/try-stack-model.png b/ai-quick-actions/web_assets/try-stack-model.png
new file mode 100644
index 00000000..d5bff24a
Binary files /dev/null and b/ai-quick-actions/web_assets/try-stack-model.png differ