Features

LLMariner features

Inference with Open Models

Users can run chat completion with open models such as Google Gemma, Meta Llama, and Mistral. To run chat completion, users can use the OpenAI Python library, the llma CLI, or the API endpoint directly.
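As a concrete illustration, the snippet below builds an OpenAI-compatible chat-completion request. This is a minimal sketch: the endpoint URL, API key placeholder, and model name are assumptions for illustration only; substitute the values from your own LLMariner deployment.

```python
import json

# Hedged sketch: construct an OpenAI-compatible chat-completion request
# for an assumed LLMariner endpoint. URL and model name are placeholders.
API_URL = "http://localhost:8080/v1/chat/completions"  # assumed endpoint

payload = {
    "model": "google-gemma-2b-it",  # any supported open model (placeholder)
    "messages": [{"role": "user", "content": "What is Kubernetes?"}],
}
body = json.dumps(payload)

# To send the request (requires a running LLMariner deployment):
# import urllib.request
# req = urllib.request.Request(
#     API_URL, data=body.encode(),
#     headers={"Authorization": "Bearer <API_KEY>",
#              "Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

The same request body works through the OpenAI Python library by pointing the client's `base_url` at the LLMariner endpoint.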

Supported Open Models

The following is the list of supported models.

Retrieval-Augmented Generation (RAG)

This page describes how to use RAG with LLMariner.

Model Fine-tuning

This page describes how to fine-tune models with LLMariner.
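Assuming the fine-tuning API follows the OpenAI-style fine-tuning job shape, a job submission can be sketched as below. The endpoint URL, training file ID, and model name are illustrative assumptions, not values from this document.

```python
import json

# Hedged sketch: build an OpenAI-style fine-tuning job request for an
# assumed LLMariner endpoint. All identifiers below are placeholders.
API_URL = "http://localhost:8080/v1/fine_tuning/jobs"  # assumed endpoint

job_request = {
    "model": "google-gemma-2b-it",   # base model to fine-tune (placeholder)
    "training_file": "file-abc123",  # ID of an uploaded training file (placeholder)
}
body = json.dumps(job_request)

# To submit the job (requires a running LLMariner deployment):
# import urllib.request
# req = urllib.request.Request(
#     API_URL, data=body.encode(),
#     headers={"Authorization": "Bearer <API_KEY>",
#              "Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["id"])
```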

General-purpose Training

LLMariner allows users to run general-purpose training jobs in their Kubernetes clusters.

Jupyter Notebook

LLMariner allows users to run a Jupyter Notebook in a Kubernetes cluster. This functionality is useful when users want to run ad-hoc Python scripts that require GPUs.

API and GPU Usage Visibility

This page describes how to gain visibility into API and GPU usage.

User Management

This page describes how to manage users.

Access Control with Organizations and Projects

This page describes how to configure access control using organizations and projects.