Commit de75b0b

[BugFix] Fix initialization of draft model. (#29319)
Signed-off-by: Andrey Khalyavin <halyavin@yandex-team.ru>
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
1 parent 7df0289 commit de75b0b

1 file changed, +4 -0 lines changed

vllm/v1/worker/gpu_model_runner.py

Lines changed: 4 additions & 0 deletions
@@ -3460,6 +3460,10 @@ def load_model(self, eep_scale_up: bool = False) -> None:
                 scope="local",
             )
         prepare_communication_buffer_for_model(self.model)
+        if (drafter := getattr(self, "drafter", None)) and (
+            drafter_model := getattr(drafter, "model", None)
+        ):
+            prepare_communication_buffer_for_model(drafter_model)
         mm_config = self.model_config.multimodal_config
         self.is_multimodal_pruning_enabled = (
             supports_multimodal_pruning(self.get_model())
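
The guard added in this diff can be tried in isolation. The sketch below is a minimal illustration, not vLLM code: `Runner`, the string "models", and the stubbed `prepare_communication_buffer_for_model` are hypothetical stand-ins, and the stub only records which models it was called with. It shows that the drafter's model is prepared only when both the `drafter` attribute and its `model` attribute exist and are truthy.

```python
from types import SimpleNamespace


def make_stub(log):
    # Hypothetical stand-in for prepare_communication_buffer_for_model:
    # it only records which model it was called with.
    def prepare_communication_buffer_for_model(model):
        log.append(model)

    return prepare_communication_buffer_for_model


class Runner:
    """Hypothetical, stripped-down runner; not the vLLM class."""

    def __init__(self, model, prepare, drafter=None):
        self.model = model
        self._prepare = prepare
        if drafter is not None:
            # The attribute may be missing entirely, hence getattr below.
            self.drafter = drafter

    def load_model(self):
        self._prepare(self.model)
        # Same guard shape as in the diff: both lookups default to None,
        # and the second is evaluated only if the first is truthy.
        if (drafter := getattr(self, "drafter", None)) and (
            drafter_model := getattr(drafter, "model", None)
        ):
            self._prepare(drafter_model)


def run_cases():
    log = []
    prepare = make_stub(log)
    # 1. No drafter attribute at all.
    Runner("target-a", prepare).load_model()
    # 2. Drafter present, but without a model attribute.
    Runner("target-b", prepare, SimpleNamespace()).load_model()
    # 3. Drafter present with a model.
    Runner("target-c", prepare, SimpleNamespace(model="draft-c")).load_model()
    return log


if __name__ == "__main__":
    print(run_cases())
```

Only the third case reaches the second call, so the drafter's model is prepared exactly once, right after its target model.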
