vllm CVE Vulnerabilities & Metrics

Focus on vllm vulnerabilities and metrics.

Last updated: 07 Jun 2025, 22:25 UTC

About vllm Security Exposure

This page consolidates all known Common Vulnerabilities and Exposures (CVEs) associated with vllm. We track both calendar-based metrics (using fixed periods) and rolling metrics (using gliding windows) to give you a comprehensive view of security trends and risk evolution. Use these insights to assess risk and plan your patching strategy.

For a broader perspective on cybersecurity threats, explore the comprehensive list of CVEs by vendor and product. Stay updated on critical vulnerabilities affecting major software and hardware providers.

Global CVE Overview

Total vllm CVEs: 3
Earliest CVE date: 30 Apr 2025, 01:15 UTC
Latest CVE date: 30 Apr 2025, 01:15 UTC

Latest CVE reference: CVE-2025-46560

Rolling Stats

30-day Count (Rolling): 0
365-day Count (Rolling): 3

Calendar-based Variation

Calendar-based Variation compares a fixed calendar period (e.g., this month versus the same month last year), while Rolling Growth Rate uses a continuous window (e.g., last 30 days versus the previous 30 days) to capture trends independent of calendar boundaries.

Variations & Growth

Month Variation (Calendar): -100.0%
Year Variation (Calendar): 0%

Month Growth Rate (30-day Rolling): -100.0%
Year Growth Rate (365-day Rolling): 0.0%

Monthly CVE Trends (current vs previous Year)

Annual CVE Trends (Last 20 Years)

Critical vllm CVEs (CVSS ≥ 9) Over 20 Years

CVSS Stats

Average CVSS: 0.0

Max CVSS: 0

Critical CVEs (≥9): 0

CVSS Range vs. Count

Range Count
0.0-3.9 3
4.0-6.9 0
7.0-8.9 0
9.0-10.0 0

CVSS Distribution Chart

Top 5 Highest CVSS vllm CVEs

These are the five CVEs with the highest CVSS scores for vllm, sorted by severity first and recency.

All CVEs for vllm

CVE-2025-46560 vllm vulnerability CVSS: 0 30 Apr 2025, 01:15 UTC

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to ​​inefficient list concatenation operations​​, the algorithm exhibits ​​quadratic time complexity (O(n²))​​, allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.

CVE-2025-32444 vllm vulnerability CVSS: 0 30 Apr 2025, 01:15 UTC

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, having vLLM integration with mooncake, are vulnerable to remote code execution due to using pickle based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker is able to reach the vulnerable ZeroMQ sockets to carry out an attack. vLLM instances that do not make use of the mooncake integration are not vulnerable. This issue has been patched in version 0.8.5.

CVE-2025-30202 vllm vulnerability CVSS: 0 30 Apr 2025, 01:15 UTC

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5.