janhq/cortex.tensorrt-llm
Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA's TensorRT-LLM as a git submodule for accelerated inference on NVIDIA GPUs.
Stars: 42
Forks: 3
Open issues: 3
Watchers: 42
Size: 279.2 MB
Language: C++
License: Apache License 2.0
Topics: jan, llm, nvidia, tensorrt, tensorrt-llm
Created: Mar 4, 2024
Updated: Aug 29, 2025
Last push: Sep 26, 2024
Status: Archived