Implementation:Predibase Lorax Helm Chart Configuration
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Observability |
| Last Updated | 2026-02-08 02:00 GMT |
Overview
A concrete tool for configuring the Kubernetes deployment and monitoring of LoRAX, provided by the LoRAX Helm chart and its OpenTelemetry tracing setup.
Description
The LoRAX Helm chart (charts/lorax/) provides a declarative Kubernetes deployment including GPU resource limits, container image configuration, health probes, and launcher CLI arguments. The setup_tracing() function in server/lorax_server/tracing.py configures OpenTelemetry distributed tracing with OTLP export for each shard process.
Usage
Use the Helm chart to deploy LoRAX to Kubernetes clusters. Configure via values.yaml overrides. OpenTelemetry tracing is automatically configured when an OTLP endpoint is provided.
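The conditional behavior described above (tracing is set up only when an OTLP endpoint is supplied) can be sketched as follows. This is a minimal illustration, not the body of setup_tracing() in the repository: the wrapper name maybe_setup_tracing, the service name pattern, and the insecure gRPC exporter setting are all assumptions made for the example.

```python
from typing import Optional


def maybe_setup_tracing(shard: int, otlp_endpoint: Optional[str]) -> bool:
    """Hypothetical wrapper: enable tracing only when an endpoint is given."""
    if not otlp_endpoint:
        return False  # no collector configured, so tracing stays off

    # Imported lazily so this module loads without the OpenTelemetry packages.
    from opentelemetry import trace
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Assumed naming scheme: one service name per shard so traces can be told apart.
    resource = Resource.create({"service.name": f"lorax-shard-{shard}"})
    provider = TracerProvider(resource=resource)
    # insecure=True assumes a plaintext gRPC collector reachable inside the cluster.
    exporter = OTLPSpanExporter(endpoint=otlp_endpoint, insecure=True)
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
    return True
```

A call such as maybe_setup_tracing(0, "http://otel-collector:4317") would register a tracer provider for shard 0; the collector address here is a placeholder, not a value taken from the chart.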
Code Reference
Source Location
- Repository: LoRAX
- File: charts/lorax/values.yaml (Lines: 1-75)
- File: server/lorax_server/tracing.py (Lines: 56-62)
Signature
# charts/lorax/values.yaml - Key configuration sections
deployment:
  replicas: 1
  image:
    repository: "ghcr.io/predibase/lorax"
    tag: "latest"
  args:
    - name: "--model-id"
      value: "mistralai/Mistral-7B-Instruct-v0.1"
  resources:
    limits:
      nvidia.com/gpu: "1"

# server/lorax_server/tracing.py
def setup_tracing(shard: int, otlp_endpoint: str):
    """Configure OpenTelemetry tracing for a shard process."""
Import
# Helm chart installation
helm install lorax ./charts/lorax -f custom-values.yaml
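When a file is passed with -f, Helm layers it over the chart's default values: nested maps are merged key by key, while lists are replaced as a whole. The sketch below approximates that rule in Python to show what a custom-values.yaml would change. The function merge_values and the sample override are illustrative only, and the sketch leaves out details such as Helm's handling of null values.

```python
from copy import deepcopy


def merge_values(defaults: dict, overrides: dict) -> dict:
    """Approximate Helm's value layering: deep-merge maps, replace everything else."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and lists from the override replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Defaults mirroring the key sections of charts/lorax/values.yaml shown above.
chart_defaults = {
    "deployment": {
        "replicas": 1,
        "image": {"repository": "ghcr.io/predibase/lorax", "tag": "latest"},
        "args": [
            {"name": "--model-id", "value": "mistralai/Mistral-7B-Instruct-v0.1"},
        ],
        "resources": {"limits": {"nvidia.com/gpu": "1"}},
    }
}

# Hypothetical contents of custom-values.yaml, expressed as a dict.
custom_values = {
    "deployment": {
        "image": {"tag": "v0.9.0"},
        "args": [{"name": "--model-id", "value": "meta-llama/Llama-2-7b-hf"}],
    }
}

effective = merge_values(chart_defaults, custom_values)
```

Because the args list is replaced rather than merged, an override file that touches args should restate every launcher argument it wants to keep.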
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| values.yaml | YAML | Yes | Helm values with deployment config |
| shard | int | Yes | Shard index for tracing identification |
| otlp_endpoint | str | No | OpenTelemetry collector endpoint |
Outputs
| Name | Type | Description |
|---|---|---|
| Kubernetes resources | Deployment+Service | Running LoRAX pod with GPU resources |
| Trace spans | OTLP export | Distributed traces to collector |
Usage Examples
Deploy with Helm
# Install LoRAX to Kubernetes
helm install my-lorax ./charts/lorax \
  --set deployment.image.tag=v0.9.0 \
  --set 'deployment.args[0].value=meta-llama/Llama-2-7b-hf' \
  --set 'deployment.resources.limits.nvidia\.com/gpu=2'
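The --set paths in this command follow Helm's key syntax: dots separate nesting levels, list elements are addressed as [index], and a literal dot inside a key (as in nvidia.com/gpu) is escaped with a backslash. The sketch below generates such paths from a nested dict. The helper names are hypothetical, and it only escapes dots in keys; values containing commas or other special characters would need extra handling.

```python
import shlex


def escape_key(key: str) -> str:
    """Escape literal dots so Helm does not read them as path separators."""
    return key.replace(".", "\\.")


def to_set_flags(values, prefix: str = "") -> list:
    """Flatten nested dicts and lists into Helm-style path=value strings."""
    flags = []
    if isinstance(values, dict):
        for key, value in values.items():
            path = f"{prefix}.{escape_key(key)}" if prefix else escape_key(key)
            flags.extend(to_set_flags(value, path))
    elif isinstance(values, list):
        for index, item in enumerate(values):
            flags.extend(to_set_flags(item, f"{prefix}[{index}]"))
    else:
        flags.append(f"{prefix}={values}")
    return flags


overrides = {
    "deployment": {
        "image": {"tag": "v0.9.0"},
        "args": [{"name": "--model-id", "value": "meta-llama/Llama-2-7b-hf"}],
        "resources": {"limits": {"nvidia.com/gpu": 2}},
    }
}

flags = to_set_flags(overrides)

# Build an argv list; shlex.join quotes it for display in a shell.
command = ["helm", "install", "my-lorax", "./charts/lorax"]
for flag in flags:
    command += ["--set", flag]
print(shlex.join(command))
```

The example sets both the name and the value of args[0]. Helm generally replaces lists instead of merging them element by element, so an index-based override may drop fields that are not restated; rendering the chart with helm template is a reasonable way to check the result before installing.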