Implementation:Predibase Lorax Helm Chart Configuration

Knowledge Sources	LoRAX Helm
Domains	Infrastructure, Observability
Last Updated	2026-02-08 02:00 GMT

Overview

The LoRAX Helm chart and the OpenTelemetry tracing setup together provide the concrete tooling for deploying LoRAX to Kubernetes and monitoring it.

Description

The LoRAX Helm chart (charts/lorax/) provides a declarative Kubernetes deployment including GPU resource limits, container image configuration, health probes, and launcher CLI arguments. The setup_tracing() function in server/lorax_server/tracing.py configures OpenTelemetry distributed tracing with OTLP export for each shard process.

Usage

Use the Helm chart to deploy LoRAX to a Kubernetes cluster, customizing the deployment through values.yaml overrides (a values file passed with -f, or individual --set flags). OpenTelemetry tracing is configured automatically when an OTLP endpoint is provided.
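
The "only when an OTLP endpoint is provided" behavior can be sketched as follows. This is a minimal illustration, not the code in server/lorax_server/tracing.py: the wrapper name maybe_setup_tracing, the service-name format, and the insecure gRPC exporter are assumptions, and the tracing branch presumes the opentelemetry-sdk and opentelemetry-exporter-otlp-proto-grpc packages are installed.

```python
from typing import Optional


def maybe_setup_tracing(shard: int, otlp_endpoint: Optional[str]) -> bool:
    """Configure OTLP tracing for one shard; return False when no endpoint is given."""
    if not otlp_endpoint:
        # No collector configured: leave tracing disabled.
        return False

    # Imported lazily so the OpenTelemetry packages are only needed when tracing is on.
    from opentelemetry import trace
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Assumed naming scheme: one service name per shard so spans can be told apart.
    resource = Resource.create(attributes={"service.name": f"lorax.server-{shard}"})
    provider = TracerProvider(resource=resource)
    # Batch finished spans and send them to the collector over insecure gRPC.
    provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint=otlp_endpoint, insecure=True))
    )
    trace.set_tracer_provider(provider)
    return True
```

Calling maybe_setup_tracing(0, None) leaves tracing off and returns False, while passing a collector address registers a tracer provider for that shard.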

Code Reference

Source Location

Repository: LoRAX
File: charts/lorax/values.yaml (Lines: 1-75)
File: server/lorax_server/tracing.py (Lines: 56-62)

Signature

# charts/lorax/values.yaml - Key configuration sections
deployment:
  replicas: 1
  image:
    repository: "ghcr.io/predibase/lorax"
    tag: "latest"
  args:
    - name: "--model-id"
      value: "mistralai/Mistral-7B-Instruct-v0.1"
  resources:
    limits:
      nvidia.com/gpu: "1"

# server/lorax_server/tracing.py
def setup_tracing(shard: int, otlp_endpoint: str):
    """Configure OpenTelemetry tracing for a shard process."""

Import

# Helm chart installation
helm install lorax ./charts/lorax -f custom-values.yaml
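
Helm layers the file passed with -f over the chart's default values.yaml: nested maps are merged key by key, while scalars and lists are replaced wholesale. The sketch below mimics that rule in plain Python to help predict the effective values. merge_values is a hypothetical helper, not part of Helm or LoRAX, the override contents are invented for illustration, and the sketch ignores Helm's special handling of null values.

```python
import copy


def merge_values(defaults: dict, overrides: dict) -> dict:
    """Overlay overrides on defaults: maps merge recursively, everything else is replaced."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and lists from the override replace the default entirely.
            merged[key] = copy.deepcopy(value)
    return merged


# Defaults taken from the values.yaml excerpt in the Signature section.
defaults = {
    "deployment": {
        "replicas": 1,
        "image": {"repository": "ghcr.io/predibase/lorax", "tag": "latest"},
        "args": [{"name": "--model-id", "value": "mistralai/Mistral-7B-Instruct-v0.1"}],
        "resources": {"limits": {"nvidia.com/gpu": "1"}},
    }
}

# Hypothetical contents of custom-values.yaml.
custom = {"deployment": {"replicas": 2, "image": {"tag": "v0.9.0"}}}

effective = merge_values(defaults, custom)
print(effective["deployment"]["image"])
```

Because lists are replaced rather than merged, an override that sets deployment.args must repeat every argument it wants to keep.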

I/O Contract

Inputs

Name	Type	Required	Description
values.yaml	YAML	Yes	Helm values with deployment config
shard	int	Yes	Shard index for tracing identification
otlp_endpoint	str	No	OpenTelemetry collector endpoint; tracing is configured only when it is provided

Outputs

Name	Type	Description
Kubernetes resources	Deployment+Service	Running LoRAX pod with GPU resources
Trace spans	OTLP export	Distributed traces to collector

Usage Examples

Deploy with Helm

# Install LoRAX to Kubernetes
helm install my-lorax ./charts/lorax \
  --set deployment.image.tag=v0.9.0 \
  --set 'deployment.args[0].value=meta-llama/Llama-2-7b-hf' \
  --set 'deployment.resources.limits.nvidia\.com/gpu=2'
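
In a --set path a dot separates nesting levels, so a key that itself contains dots, such as nvidia.com/gpu, needs each dot escaped with a backslash. The following sketch builds such arguments programmatically. set_arg is a hypothetical helper that handles only dots in key segments, not commas or other special characters.

```python
from typing import Sequence


def set_arg(path: Sequence[str], value: str) -> str:
    """Build a Helm --set expression from key segments, escaping literal dots."""
    escaped = [segment.replace(".", "\\.") for segment in path]
    return ".".join(escaped) + "=" + value


args = [
    set_arg(["deployment", "image", "tag"], "v0.9.0"),
    set_arg(["deployment", "resources", "limits", "nvidia.com/gpu"], "2"),
]

# Argument list suitable for subprocess.run; no shell quoting is needed in list form.
command = ["helm", "install", "my-lorax", "./charts/lorax"]
for arg in args:
    command += ["--set", arg]
print(" ".join(command))
```

Only the key side is escaped; dots in the value (for example an image tag) are passed through unchanged.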

Related Pages

Implements Principle

Principle:Predibase_Lorax_Production_Observability

Requires Environment

Environment:Predibase_Lorax_Docker_Container_Runtime
