Integration

Integrate with other projects

1 - Open WebUI

Integrate with Open WebUI to get a web UI for the AI assistant.

Open WebUI provides a web UI that works with OpenAI-compatible APIs. You can run Open WebUI locally or in a Kubernetes cluster.

Here are the instructions for running Open WebUI in a Kubernetes cluster. First, set your LLMariner API key and endpoint, create a namespace, and store the API key in a Kubernetes secret:

OPENAI_API_KEY=<LLMariner API key>
OPENAI_API_BASE_URL=<LLMariner API endpoint>

kubectl create namespace open-webui
kubectl create secret generic -n open-webui llmariner-api-key --from-literal=key=${OPENAI_API_KEY}

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: open-webui
  namespace: open-webui
spec:
  selector:
    matchLabels:
      name: open-webui
  template:
    metadata:
      labels:
        name: open-webui
    spec:
      containers:
      - name: open-webui
        image: ghcr.io/open-webui/open-webui:main
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        env:
        - name: OPENAI_API_BASE_URLS
          value: ${OPENAI_API_BASE_URL}
        - name: WEBUI_AUTH
          value: "false"
        - name: OPENAI_API_KEYS
          valueFrom:
            secretKeyRef:
              name: llmariner-api-key
              key: key
---
apiVersion: v1
kind: Service
metadata:
  name: open-webui
  namespace: open-webui
spec:
  type: ClusterIP
  selector:
    name: open-webui
  ports:
  - port: 8080
    name: http
    targetPort: http
    protocol: TCP
EOF

You can then access Open WebUI with port forwarding:

kubectl port-forward -n open-webui service/open-webui 8080
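
Open WebUI is then available at http://localhost:8080. If you want to sanity-check the LLMariner endpoint that Open WebUI talks to, here is a minimal sketch using the openai Python package; the placeholders are the same API key and endpoint values used above.

from openai import OpenAI

# Point the client at the LLMariner OpenAI-compatible endpoint.
client = OpenAI(
    base_url="<LLMariner API endpoint>",
    api_key="<LLMariner API key>",
)

# List the models that Open WebUI will be able to serve.
for model in client.models.list():
    print(model.id)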

2 - Continue

Integrate with Continue to get an open source AI code assistant.

Continue is an open source AI code assistant. You can use LLMariner as a backend endpoint for Continue.

As LLMariner provides an OpenAI-compatible API, you can set the provider to "openai". Set apiKey to an API key generated by LLMariner and apiBase to the endpoint URL of LLMariner (e.g., http://localhost:8080/v1).

Here is an example configuration that you can put at ~/.continue/config.json.

{
  "models": [
    {
      "title": "Meta-Llama-3.1-8B-Instruct-q4",
      "provider": "openai",
      "model": "meta-llama-Meta-Llama-3.1-8B-Instruct-q4",
      "apiKey": "<LLMariner API key>",
      "apiBase": "<LLMariner endpoint>"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Auto complete",
    "provider": "openai",
    "model": "deepseek-ai-deepseek-coder-6.7b-base-q4",
    "apiKey": "<LLMariner API key>",
    "apiBase": "<LLMariner endpoint>",
    "completionOptions": {
      "presencePenalty": 1.1,
      "frequencyPenalty": 1.1
    }
  },
  "allowAnonymousTelemetry": false
}
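
To check that a model name in this configuration is valid, you can send a chat completion request directly to the LLMariner endpoint. Here is a minimal sketch using the openai Python package, assuming the same model as the example configuration above:

from openai import OpenAI

# Use the same values as apiBase and apiKey in config.json.
client = OpenAI(
    base_url="<LLMariner endpoint>",
    api_key="<LLMariner API key>",
)

# Send a quick request to the chat model configured above.
completion = client.chat.completions.create(
    model="meta-llama-Meta-Llama-3.1-8B-Instruct-q4",
    messages=[{"role": "user", "content": "Write a hello-world function."}],
)
print(completion.choices[0].message.content)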

The following is a demo video of the Continue integration, showing a coding assistant backed by Llama-3.1-Nemotron-70B-Instruct.

3 - MLflow

Integrate with MLflow.

MLflow is an open-source tool for managing the machine learning lifecycle. It has various features for LLMs and an integration with OpenAI. We can apply these MLflow features to the LLM endpoints provided by LLMariner.

For example, you can deploy an MLflow Deployments Server for LLMs and use the Prompt Engineering UI.

Deploying MLflow Tracking Server

Bitnami provides a Helm chart for MLflow:

helm upgrade \
  --install \
  --create-namespace \
  -n mlflow \
  mlflow oci://registry-1.docker.io/bitnamicharts/mlflow \
  -f values.yaml

An example values.yaml follows:

tracking:
  extraEnvVars:
  - name: MLFLOW_DEPLOYMENTS_TARGET
    value: http://deployment-server:7000

We set MLFLOW_DEPLOYMENTS_TARGET to the address of the MLflow Deployments Server that we will deploy in the next section.

Once deployed, you can set up port forwarding and access http://localhost:9000:

kubectl port-forward -n mlflow service/mlflow-tracking 9000:80

The login credentials can be obtained with the following commands:

# User
kubectl get secret --namespace mlflow mlflow-tracking -o jsonpath="{.data.admin-user}" | base64 -d
# Password
kubectl get secret --namespace mlflow mlflow-tracking -o jsonpath="{.data.admin-password}" | base64 -d
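
With these credentials, you can also talk to the tracking server programmatically. Here is a minimal sketch using the mlflow Python package; the MLflow client reads basic-auth credentials from the MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD environment variables.

import os

import mlflow

# Credentials retrieved with the kubectl commands above.
os.environ["MLFLOW_TRACKING_USERNAME"] = "<admin user>"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "<admin password>"

# The port-forwarded tracking server address.
mlflow.set_tracking_uri("http://localhost:9000")

# List the experiments to verify connectivity.
print(mlflow.search_experiments())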

Deploying MLflow Deployments Server for LLMs

We have an example Kubernetes YAML for deploying an MLflow Deployments Server here.

You can save it locally, update openai_api_base in the ConfigMap definition based on your ingress controller address, and then run:

kubectl create secret generic -n mlflow llmariner-api-key \
  --from-literal=secret=<Your API key>

kubectl apply -n mlflow -f deployment-server.yaml
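
Optionally, you can query the deployments server directly before using it from the UI. Here is a minimal sketch using the MLflow deployments client; it assumes the deployment-server service is port-forwarded to localhost:7000 and that the example YAML defines a chat endpoint named "chat" (adjust the endpoint name to match the ConfigMap).

from mlflow.deployments import get_deploy_client

# Target the port-forwarded MLflow Deployments Server.
client = get_deploy_client("http://localhost:7000")

# "chat" is an assumed endpoint name; check the ConfigMap in the example YAML.
response = client.predict(
    endpoint="chat",
    inputs={"messages": [{"role": "user", "content": "Hello!"}]},
)
print(response)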

You can then access the MLflow Tracking Server, click "New run", and choose "using Prompt Engineering".

Other Features

Please visit the MLflow page for more information on other LLM-related features provided by MLflow.

4 - Weights & Biases (W&B)

Integrate with W&B to see the progress of your fine-tuning jobs.

Weights & Biases (W&B) is an AI developer platform. LLMariner provides an integration with W&B so that metrics for fine-tuning jobs are reported to W&B. With the integration, you can easily see the progress of your fine-tuning jobs, including metrics such as the training epoch and loss.

Please take the following steps to enable the integration.

First, obtain a W&B API key and create a Kubernetes secret:

kubectl create secret generic wandb \
  -n <fine-tuning job namespace> \
  --from-literal=apiKey=${WANDB_API_KEY}

The secret needs to be created in a namespace where fine-tuning jobs run. Individual projects specify namespaces for fine-tuning jobs, and the default project runs fine-tuning jobs in the "default" namespace.

Then you can enable the integration by adding the following to your Helm values.yaml and re-deploying LLMariner.

job-manager-dispatcher:
  job:
    wandbApiKeySecret:
      name: wandb
      key: apiKey

A fine-tuning job reports its metrics to W&B when the integrations parameter is specified. In the following example, the training and validation file paths are placeholders; replace them with your own data.

from openai import OpenAI

client = OpenAI(base_url="<LLMariner API endpoint>", api_key="<LLMariner API key>")

# Upload the training and validation data (replace the paths with your own files).
tfile = client.files.create(file=open("training.jsonl", "rb"), purpose="fine-tune")
vfile = client.files.create(file=open("validation.jsonl", "rb"), purpose="fine-tune")

job = client.fine_tuning.jobs.create(
    model="google-gemma-2b-it",
    suffix="fine-tuning",
    training_file=tfile.id,
    validation_file=vfile.id,
    # Report metrics for this job to the "my-test-project" project in W&B.
    integrations=[
        {
            "type": "wandb",
            "wandb": {
                "project": "my-test-project",
            },
        },
    ],
)
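
Continuing from the snippet above, you can also poll the job through the same API while the metrics stream to W&B; this sketch assumes the job statuses follow the OpenAI fine-tuning API convention.

import time

# Wait for the job to reach a terminal state. Progress metrics such as
# train/loss appear in the configured W&B project while the job runs.
while job.status not in ("succeeded", "failed", "cancelled"):
    time.sleep(30)
    job = client.fine_tuning.jobs.retrieve(job.id)
print(job.status)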

Here is an example screenshot. You can see metrics like train/loss in the W&B dashboard.