| 1 | +# Guide to updating models after the TGI deprecation |
| 2 | + |
| 3 | +## List of models whose container is upgraded from TGI to VLLM |
| 4 | +1. **codegemma-1.1-2b** |
| 5 | +2. **codegemma-1.1-7b-it** |
| 6 | +3. **codegemma-2b** |
| 7 | +4. **codegemma-7b** |
| 8 | +5. **falcon-40b-instruct** |
| 9 | +6. **gemma-1.1-7b-it** |
| 10 | +7. **gemma-2b** |
| 11 | +8. **gemma-2b-it** |
| 12 | +9. **gemma-7b** |
| 13 | + |
| 14 | +## Impact |
| 15 | +Any model deployment associated with the above models will fail on restart. |
| 16 | + |
| 17 | +## Mitigation 1 - Update the existing registered model |
| 18 | +If you have registered one of the above models, update the model configuration to use the **VLLM** container in place of **TGI** during deployment. To update the registered model and create a new deployment on **VLLM**, follow the instructions below. |
| 19 | + |
| 20 | +1. Use the **Get Model** Python SDK to get the details for the target model. |
| 21 | + |
| 22 | +``` |
| 23 | +# get model |
| 24 | +import oci |
| 25 | +from ads import set_auth |
| 26 | +from oci.auth.signers import get_resource_principals_signer |
| 27 | +from oci.data_science import DataScienceClient |
| 28 | + |
| 29 | +set_auth(auth='resource_principal') |
| 30 | + |
| 31 | +resource_principal = get_resource_principals_signer() |
| 32 | +data_science_client = DataScienceClient(config={}, signer=resource_principal) |
| 33 | + |
| 34 | +get_model_response = data_science_client.get_model( |
| 35 | +    model_id="ocid1.datasciencemodel.oc1.iad.amaaaaaay75uckqare76rtyeaghvtsakscn2d6xscrvwyq7f6txc32742m3a") |
| 36 | + |
| 37 | +# Get the data from response |
| 38 | +print(get_model_response.data) |
| 39 | + |
| 40 | +``` |
| 41 | +Refer to the [documentation](https://docs.oracle.com/en-us/iaas/tools/python-sdk-examples/2.162.0/datascience/get_model.py.html) for more details on the Get Model SDK code. |
| 42 | + |
| 43 | + |
| 44 | +2. Use the response from the **Get Model** call to update the custom_metadata_list within the **Update Model** Python SDK code. Specifically for changing the **deployment-container**, you must: |
| 45 | + |
| 46 | +* Change the value for the **custom_metadata_list** attribute with the key **deployment-container** to **odsc-vllm-serving**. |
| 47 | +* Copy all other **custom_metadata_list** attributes (key/value pairs) exactly as they appear in the **Get Model** response, so that no existing metadata is accidentally dropped during the update. |
| 48 | + |
| 49 | + |
| 50 | +``` |
| 51 | + |
| 52 | +import oci |
| 53 | +from ads import set_auth |
| 54 | +from oci.auth.signers import get_resource_principals_signer |
| 55 | +from oci.data_science import DataScienceClient |
| 56 | + |
| 57 | +set_auth(auth='resource_principal') |
| 58 | + |
| 59 | +resource_principal = get_resource_principals_signer() |
| 60 | +data_science_client = DataScienceClient(config={}, signer=resource_principal) |
| 61 | + |
| 62 | +# Send the request to service, some parameters are not required, see API |
| 63 | +# doc for more info |
| 64 | +update_model_response = data_science_client.update_model( |
| 65 | +    model_id="ocid1.datasciencemodel.oc1.iad.amaaaaaay75uckqare76rtyeaghvtsakscn2d6xscrvwyq7f6txc32742m3a", |
| 66 | +    update_model_details=oci.data_science.models.UpdateModelDetails( |
| 67 | +        custom_metadata_list=[ |
| 68 | +            oci.data_science.models.Metadata( |
| 69 | +                category="Other", |
| 70 | +                description="Deployment container mapping for SMC", |
| 71 | +                key="deployment-container", |
| 72 | +                value="odsc-vllm-serving", |
| 73 | +                has_artifact=False), |
| 74 | +            oci.data_science.models.Metadata( |
| 75 | +                category="Other", |
| 76 | +                description="Fine-tuning container mapping for SMC", |
| 77 | +                key="finetune-container", |
| 78 | +                value="odsc-llm-fine-tuning", |
| 79 | +                has_artifact=False), |
| 80 | +            oci.data_science.models.Metadata( |
| 81 | +                category="Other", |
| 82 | +                description="model by reference flag", |
| 83 | +                key="modelDescription", |
| 84 | +                value="true", |
| 85 | +                has_artifact=False), |
| 86 | +            oci.data_science.models.Metadata( |
| 87 | +                category="Other", |
| 88 | +                description="artifact location", |
| 89 | +                key="artifact_location", |
| 90 | +                value="oci://aqua-test-prod@idtlxnfdweil/custom-models/gemma-2b", |
| 91 | +                has_artifact=False), |
| 92 | +            oci.data_science.models.Metadata( |
| 93 | +                category="Other", |
| 94 | +                description="Evaluation container mapping for SMC", |
| 95 | +                key="evaluation-container", |
| 96 | +                value="odsc-llm-evaluate", |
| 97 | +                has_artifact=False),])) |
| 98 | + |
| 99 | +# Get the data from response |
| 100 | +print(update_model_response.data) |
| 101 | + |
| 102 | +``` |
| 103 | +Refer to the [documentation](https://docs.oracle.com/en-us/iaas/tools/python-sdk-examples/2.162.0/datascience/update_model.py.html) for more details on the Update Model SDK code. |
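
Retyping every metadata entry by hand makes it easy to drop one. The copy-and-change rule from step 2 can be sketched as a small helper. This is an illustration only, not part of the SDK: the function name `with_vllm_container` and the plain-dict entries are assumptions made so the sketch runs without OCI credentials.

```python
from copy import deepcopy

DEPLOYMENT_KEY = "deployment-container"
VLLM_CONTAINER = "odsc-vllm-serving"


def with_vllm_container(entries):
    """Return a copy of `entries` with only the deployment container changed.

    `entries` is a list of dicts with at least "key" and "value" fields,
    a stand-in for the custom_metadata_list items in the Get Model response.
    """
    updated = deepcopy(entries)  # leave the caller's list untouched
    found = False
    for entry in updated:
        if entry["key"] == DEPLOYMENT_KEY:
            entry["value"] = VLLM_CONTAINER
            found = True
    if not found:
        raise KeyError(f"no {DEPLOYMENT_KEY!r} entry in the metadata list")
    return updated


# Example entries; the first value is a placeholder, not taken from the guide.
current = [
    {"key": "deployment-container", "value": "<tgi-container>", "category": "Other"},
    {"key": "finetune-container", "value": "odsc-llm-fine-tuning", "category": "Other"},
]
updated = with_vllm_container(current)
print(updated[0]["value"])
```

With the SDK, each dict would be built from the `key`, `value`, `description`, and `category` attributes of the items in `get_model_response.data.custom_metadata_list`, then converted back to `oci.data_science.models.Metadata` objects before calling `update_model`.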
| 104 | + |
| 105 | +3. After updating the existing model, wait a short time for the changes to synchronize. Then create a new model deployment; it will launch using the **VLLM** container. |
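
The guide does not say how long synchronization takes. One cautious option is to re-read the model and confirm that the new value is visible before creating the deployment. The sketch below assumes a caller-supplied `fetch_entries` function (hypothetical; in practice it would wrap the **Get Model** call), and the retry count and delay are arbitrary choices.

```python
import time


def wait_for_container(fetch_entries, expected="odsc-vllm-serving",
                       attempts=5, delay_seconds=10.0):
    """Poll until the deployment-container entry equals `expected`.

    `fetch_entries` is a hypothetical callable returning a list of
    (key, value) pairs read from the model's custom metadata.
    Returns True once the value matches, False if the attempts run out.
    """
    for attempt in range(attempts):
        values = dict(fetch_entries())
        if values.get("deployment-container") == expected:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # pause before the next read
    return False


# Stand-in fetcher that reports the new value on the second read.
_reads = iter([
    [("deployment-container", "<old-container>")],
    [("deployment-container", "odsc-vllm-serving")],
])
ready = wait_for_container(lambda: next(_reads), delay_seconds=0)
print(ready)
```

If the function returns `False`, it is safer to inspect the model manually than to create the deployment.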
| 106 | + |
| 107 | + |
| 108 | + |
| 109 | +## Mitigation 2 - Re-register the above models using the latest service-managed model version |
| 110 | + |
| 111 | +1. Register the new model, which uses the **VLLM** container (refer to the [register model](register-tips.md) document). |
| 112 | + |
| 113 | +If you register the model after the service deprecates **TGI**, the model will show the **VLLM** container instead of **TGI**. |
| 114 | + |
| 115 | + |