LLMariner
API and GPU Usage Optimization
Note: This page is a work in progress.
API Usage Visibility
Inference Request Rate-limiting
Optimize GPU Utilization
Auto-scaling of Inference Runtimes
Scheduled Scale Up and Down of Inference Runtimes
Last modified November 19, 2024: docs: add placeholders for upcoming features (#76) (b9e1216)