CVE-2026-44222
vLLM is an inference and serving engine for large language models (LLMs). From 0.6.1 to before 0.20.0, there is a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell out special tokens are interpreted as control tokens.
Image and video placeholder sequences supplied without matching data cause vLLM to index into empty grids during input-position computation, raising an unhandled IndexError and terminating the worker or degrading availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. This vulnerability is fixed in 0.20.0.
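The failure mode described above can be illustrated with a minimal sketch. This is not vLLM's code: the token ID `IMAGE_PLACEHOLDER_ID` and both function names are hypothetical, and the real input-position computation is more involved. The sketch only assumes that each placeholder token consumes one `(t, h, w)` entry from a grid list, so a placeholder with no matching entry indexes past the end of an empty list.

```python
from typing import List, Sequence, Tuple

# Hypothetical placeholder token ID, chosen for illustration only.
IMAGE_PLACEHOLDER_ID = 9999

Grid = Tuple[int, int, int]  # (t, h, w), mirroring the shape of an image_grid_thw entry


def patch_counts_unchecked(token_ids: Sequence[int],
                           image_grid_thw: Sequence[Grid]) -> List[int]:
    """Look up one grid entry per placeholder token, with no validation."""
    counts = []
    image_index = 0
    for tok in token_ids:
        if tok == IMAGE_PLACEHOLDER_ID:
            # Raises IndexError when the prompt contains a placeholder
            # but no image data (and hence no grid entry) was supplied.
            t, h, w = image_grid_thw[image_index]
            counts.append(t * h * w)
            image_index += 1
    return counts


def validate_placeholders(token_ids: Sequence[int],
                          image_grid_thw: Sequence[Grid]) -> None:
    """Reject requests whose placeholder count does not match the supplied grids."""
    n_placeholders = sum(1 for tok in token_ids if tok == IMAGE_PLACEHOLDER_ID)
    if n_placeholders != len(image_grid_thw):
        raise ValueError(
            f"{n_placeholders} image placeholder(s) in prompt, "
            f"but {len(image_grid_thw)} grid entries supplied"
        )
```

Calling `validate_placeholders` before the lookup turns the unhandled `IndexError` into a request-level error that a server could map to a client error response. The supported remediation remains upgrading to 0.20.0.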
MEDIUM · CVSS 6.5
EPSS 0.00014
Schedule remediation
- Public exploit or PoC is available
Sigma rules: 0
YARA rules: 0