vLLM

@vllm-project (Organization)

Member since

Jun 18, 2023

On the leaderboard

Rank	Repository	Stars
165	vllm-project/vllm	83,432

Top repositories by stars

vllm-project/vllm (on leaderboard)
A high-throughput and memory-efficient inference and serving engine for LLMs
Python	70,538
vllm-project/aibrix
Cost-efficient and pluggable Infrastructure components for GenAI inference
Go	4,627
vllm-project/semantic-router
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
Go	3,197
vllm-project/vllm-omni
A framework for efficient model inference with omni-modality models
Python	2,763
vllm-project/llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Python	2,756
vllm-project/production-stack
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
Python	2,174
vllm-project/vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
C++	1,676
vllm-project/guidellm
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
Python	856
vllm-project/vllm-metal
Community maintained hardware plugin for vLLM on Apple Silicon
Python	466
vllm-project/recipes
Common recipes to run vLLM
Jupyter Notebook	440
vllm-project/compressed-tensors
A safetensors extension to efficiently store sparse quantized tensors on disk
Python	249
vllm-project/speculators
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
Python	241
vllm-project/tpu-inference
TPU inference for vLLM, with unified JAX and PyTorch support.
Python	235
vllm-project/router
A high-performance and light-weight router for vLLM large scale deployment
Rust	121
vllm-project/flash-attention
Fast and memory-efficient exact attention
Python	114
vllm-project/vllm-spyre
Community maintained hardware plugin for vLLM on Spyre
Python	42
vllm-project/dashboard
vLLM performance dashboard
Python	41
vllm-project/vllm-daily
vLLM Daily Summarization of Merged PRs
40
vllm-project/vllm-openvino
Python	40
vllm-project/vllm-skills
Agent skills for vLLM
Shell	32
vllm-project/ci-infra
This repo hosts code for vLLM CI & Performance Benchmark infrastructure.
HCL	29
vllm-project/vllm-project.github.io
JavaScript	29
vllm-project/vllm-gaudi
Community maintained hardware plugin for vLLM on Intel Gaudi
Python	26
vllm-project/vllm-neuron
Community maintained hardware plugin for vLLM on AWS Neuron
Python	22
vllm-project/vllm-xpu-kernels
The vLLM XPU kernels for Intel GPU
C++	21
vllm-project/vllm-nccl
Manages vllm-nccl dependency
Python	17
vllm-project/vLLM-in-PyTorch-Conference-2025
11
vllm-project/FlashMLA
C++	11
vllm-project/vllm-project.github.io-static
HTML	9
vllm-project/bart-plugin
vLLM Model plugin for the encoder-decoder BART model
Python	7
vllm-project/media-kit
vLLM Logo Assets
6
vllm-project/rfcs
1
vllm-project/perf-dashboard
Performance dashboard for vLLM
Python	0
vllm-project/DeepGEMM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Cuda	0