Description
Checklist
- [x] I've prepended issue tag with type of change: [feature]
- (If applicable) I've documented below the DLC image/dockerfile this relates to
- (If applicable) I've documented the tests I've run on the DLC image
- I'm using an existing DLC image listed here: https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/deep-learning-containers-images.html
- I've built my own container based off DLC (and I've attached the code used to build my own image)
Concise Description:
I am interested in running some models through vLLM in an LMI container on a Neuron instance. However, the latest Neuron LMI image listed in available_images.md, 763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.30.0-neuronx-sdk2.20.1, is out of date: it ships DJL Serving 0.30, while the GPU containers are on 0.34. For my purposes, I need at least transformers 4.51.0.
A container with a newer transformers version would be enough for me, though updating the DJL Serving framework would also be nice.
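For anyone verifying a candidate image, here is a minimal sketch of a version check that could be run inside the container. It assumes only that Python is available there. The helper names (`parse_version`, `meets_minimum`, `check_installed`) are hypothetical and not part of DJL Serving or the DLC tooling, and the 4.51.0 threshold is the one stated in this request.

```python
from importlib import metadata

# Minimum transformers version requested in this issue (assumption: 4.51.0).
MIN_TRANSFORMERS = (4, 51, 0)


def parse_version(text):
    """Turn '4.51.0' or '4.52.0.dev0' into a 3-tuple of leading integers.

    This is a simplified parser: pre-release suffixes such as 'rc1' are
    ignored, so '4.51.0rc1' is treated the same as '4.51.0'.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions like '4.51' to three components.
    parts = parts[:3] + [0] * (3 - len(parts[:3]))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= minimum


def check_installed(package="transformers"):
    """Return (version string or None, whether it meets the minimum)."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None, False
    return installed, meets_minimum(installed)


if __name__ == "__main__":
    version, ok = check_installed()
    print(f"transformers version: {version}, meets minimum: {ok}")
```

Running the script inside the image reports the installed transformers version, or `None` if the package is missing, along with whether it satisfies the minimum.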
DLC image/dockerfile:
763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.30.0-neuronx-sdk2.20.1
Is your feature request related to a problem? Please describe.
I can't run modern LLMs on Neuron because the Neuron DJL container is out of date.
Describe the solution you'd like
I'd like an updated Neuron LMI image.
Describe alternatives you've considered
Use GPU instances instead, at a higher cost.
Additional context