On the leaderboard
| Rank | Repository | Stars |
|---|---|---|
| 181 | vllm-project/vllm | 75,260 |

Top repositories by stars

| Repository | Description | Language | Stars |
|---|---|---|---|
| vllm-project/vllm (on leaderboard) | A high-throughput and memory-efficient inference and serving engine for LLMs | Python | 70,538 |
| vllm-project/aibrix | Cost-efficient and pluggable Infrastructure components for GenAI inference | Go | 4,627 |
| vllm-project/semantic-router | System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge | Go | 3,197 |
| vllm-project/vllm-omni | A framework for efficient model inference with omni-modality models | Python | 2,763 |
| vllm-project/llm-compressor | Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM | Python | 2,756 |
| vllm-project/production-stack | vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization | Python | 2,174 |
| vllm-project/vllm-ascend | Community maintained hardware plugin for vLLM on Ascend | C++ | 1,676 |
| vllm-project/guidellm | Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs | Python | 856 |
| vllm-project/vllm-metal | Community maintained hardware plugin for vLLM on Apple Silicon | Python | 466 |
| vllm-project/recipes | Common recipes to run vLLM | Jupyter Notebook | 440 |
| vllm-project/compressed-tensors | A safetensors extension to efficiently store sparse quantized tensors on disk | Python | 249 |
| vllm-project/speculators | A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM | Python | 241 |
| vllm-project/tpu-inference | TPU inference for vLLM, with unified JAX and PyTorch support. | Python | 235 |
| vllm-project/router | A high-performance and light-weight router for vLLM large scale deployment | Rust | 121 |
| vllm-project/flash-attention | Fast and memory-efficient exact attention | Python | 114 |
| vllm-project/vllm-spyre | Community maintained hardware plugin for vLLM on Spyre | Python | 42 |
| vllm-project/dashboard | vLLM performance dashboard | Python | 41 |
| vllm-project/vllm-daily | vLLM Daily Summarization of Merged PRs | | 40 |
| | | Python | 40 |
| vllm-project/vllm-skills | Agent skills for vLLM | Shell | 32 |
| vllm-project/ci-infra | This repo hosts code for vLLM CI & Performance Benchmark infrastructure. | HCL | 29 |
| | | JavaScript | 29 |
| vllm-project/vllm-gaudi | Community maintained hardware plugin for vLLM on Intel Gaudi | Python | 26 |
| vllm-project/vllm-neuron | Community maintained hardware plugin for vLLM on AWS Neuron | Python | 22 |
| vllm-project/vllm-xpu-kernels | The vLLM XPU kernels for Intel GPU | C++ | 21 |
| vllm-project/vllm-nccl | Manages vllm-nccl dependency | Python | 17 |
| vllm-project/bart-plugin | vLLM Model plugin for the encoder-decoder BART model | Python | 7 |
| vllm-project/media-kit | vLLM Logo Assets | | 6 |
| vllm-project/perf-dashboard | Performance dashboard for vLLM | Python | 0 |
| vllm-project/DeepGEMM | DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling | Cuda | 0 |