Inference with Open Models
Users can run chat completion with open models such as Google Gemma, Llama, and Mistral. To run chat completion, users can use the OpenAI Python library, the llma
CLI, or the API endpoint directly.
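Because the endpoint is OpenAI-compatible, a chat completion request can be sketched with plain Python as shown below. The base URL, API key, and model name are placeholders for illustration only; substitute the values from your own deployment.

```python
import json
import urllib.request

# Placeholder values -- replace with your deployment's endpoint, API key, and model.
BASE_URL = "http://localhost:8080/v1"
API_KEY = "dummy-api-key"
MODEL = "google-gemma-2b-it"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-compatible chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

def send_chat_request(base_url: str, api_key: str, body: dict) -> dict:
    """POST the body to the /chat/completions endpoint and return the parsed response."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    body = build_chat_request(MODEL, "What is Kubernetes?")
    print(json.dumps(body, indent=2))
    # Uncomment once an endpoint is reachable:
    # reply = send_chat_request(BASE_URL, API_KEY, body)
    # print(reply["choices"][0]["message"]["content"])
```

The same request shape works from the OpenAI Python library by pointing its `base_url` at the deployment's endpoint.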
The following shows the supported models.
This page describes how to use RAG with LLMariner.
This page describes how to fine-tune models with LLMariner.
LLMariner allows users to run general-purpose training jobs in their Kubernetes clusters.
LLMariner allows users to run a Jupyter Notebook in a Kubernetes cluster. This functionality is useful when users want to run ad-hoc Python scripts that require GPU.
Describes how to manage users.
Describes how to configure access control using organizations and projects.