Implementation:Ray project Ray Azure Ray Template

From Leeroopedia
Knowledge Sources: Ray
Domains: Cloud_Deployment, Infrastructure, Azure
Last Updated: 2026-02-13

Overview

Azure Ray Template is an Azure Resource Manager (ARM) deployment template that provisions a complete Ray cluster with configurable head and worker virtual machines, networking, and optional GPU support on Microsoft Azure.

Description

This ARM template (azure-ray-template.json) declares the Azure resources needed to deploy a Ray cluster: a virtual network with separate subnets for head and worker nodes; a network security group with rules for SSH, JupyterLab, Ray Web UI, and TensorBoard access; public IP addresses and network interfaces; a head node virtual machine; and a virtual machine scale set for the workers. The template parameterizes VM sizes, worker node counts (initial, min, max), spot or regular priority, the conda environment, Python packages, and Ray wheel URLs. It then runs an initialization script (azure-init.sh) through the Azure Custom Script Extension to bootstrap Ray on each node. The base OS is the Microsoft Data Science Virtual Machine (DSVM) Ubuntu 18.04 image, which ships with pre-installed ML frameworks and conda environments such as py38_tensorflow and py38_pytorch.

Usage

Use this template to deploy a Ray cluster on Azure through the Azure Portal or the Azure CLI. Modify the parameters section, or override values at deployment time, to customize VM sizes, worker node scaling, the conda environment, and Python package installations. The defaults set workerMin and workerMax both to 1, so raise workerMax if the worker pool should be able to grow. The template suits users who need a turnkey distributed Ray environment on Azure with auto-scaling worker nodes and optional GPU support.
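
As an alternative to editing the template itself, the values can be kept in a separate ARM parameters file. The following is a minimal sketch that writes such a file from a shell script. The file name ray-params.json and the chosen values are illustrative assumptions, and the two secrets (publicKey, adminPassword) are deliberately left out so they can be supplied at deployment time instead of being stored on disk.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output path; override by exporting PARAMS_FILE.
PARAMS_FILE="${PARAMS_FILE:-ray-params.json}"

# Write an ARM parameters file. The parameter names come from the template's
# "parameters" section; the values below are example choices only.
cat > "$PARAMS_FILE" <<'EOF'
{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "headNodeSize": { "value": "Standard_D2s_v3" },
        "workerNodeSize": { "value": "Standard_D2s_v3" },
        "workerInitial": { "value": 2 },
        "workerMin": { "value": 1 },
        "workerMax": { "value": 4 },
        "condaEnv": { "value": "py38_pytorch" },
        "PythonPackages": { "value": "ray[rllib] gym[atari]" }
    }
}
EOF

echo "Wrote $PARAMS_FILE"
```

The file can then be passed to the Azure CLI as --parameters @ray-params.json, followed by inline publicKey=... and adminPassword=... pairs. The CLI evaluates parameters in order, so later values override earlier ones.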

Code Reference

Source Location

doc/azure/azure-ray-template.json

Signature

{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "adminUsername": { "type": "string", "defaultValue": "ubuntu" },
        "publicKey": { "type": "securestring" },
        "adminPassword": { "type": "securestring" },
        "headNodeSize": { "type": "string", "defaultValue": "Standard_D2s_v3" },
        "headNodePriority": { "type": "string", "defaultValue": "Regular" },
        "workerNodeSize": { "type": "string", "defaultValue": "Standard_D2s_v3" },
        "workerNodePriority": { "type": "string", "defaultValue": "Spot" },
        "workerInitial": { "type": "int", "defaultValue": 1 },
        "workerMin": { "type": "int", "defaultValue": 1 },
        "workerMax": { "type": "int", "defaultValue": 1 },
        "condaEnv": { "type": "string", "defaultValue": "py38_tensorflow" },
        "PythonPackages": { "type": "string", "defaultValue": "ray[rllib] gym[atari]" },
        "PublicWebUI": { "type": "bool", "defaultValue": true }
    }
}

Import

This is a standalone JSON ARM template deployed via the Azure CLI or Azure Portal. No Python import is needed.

I/O Contract

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| adminUsername | string | ubuntu | Username for the Virtual Machine |
| publicKey | securestring | (none) | SSH Key for the Virtual Machine |
| adminPassword | securestring | (none) | Password for the Virtual Machine and JupyterLab |
| headNodeSize | string | Standard_D2s_v3 | The size of the head-node Virtual Machine |
| headNodePriority | string | Regular | Priority for head node (Regular, Low, Spot) |
| workerNodeSize | string | Standard_D2s_v3 | The size of the worker node Virtual Machine |
| workerNodePriority | string | Spot | Priority for worker nodes (Regular, Low, Spot) |
| workerInitial | int | 1 | Initial number of worker nodes (0-1000) |
| workerMin | int | 1 | Minimum number of worker nodes (0-1000) |
| workerMax | int | 1 | Maximum number of worker nodes (0-1000) |
| condaEnv | string | py38_tensorflow | Conda environment to select on the DSVM |
| PythonPackages | string | ray[rllib] gym[atari] | Python packages to install (space separated) |
| PublicWebUI | bool | true | Whether to open ports for the Ray Web UI and TensorBoard |
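
The three worker-count parameters are easy to set inconsistently, so a quick local check before deploying can save a failed run. The sketch below defines a hypothetical helper, check_worker_counts. The 0-1000 bounds come from the parameter descriptions above; the ordering rule (workerMin <= workerInitial <= workerMax) is an assumption about sensible scaling settings, not something the template is documented to enforce.

```shell
#!/usr/bin/env bash

# Return 0 when the three worker counts are usable together, 1 otherwise.
# Arguments: workerInitial workerMin workerMax
check_worker_counts() {
    local initial="$1" min="$2" max="$3" n
    for n in "$initial" "$min" "$max"; do
        # Reject anything that is not a plain non-negative integer.
        case "$n" in
            ''|*[!0-9]*)
                echo "not a non-negative integer: '$n'" >&2
                return 1
                ;;
        esac
        # Documented range for each count is 0-1000.
        if [ "$n" -gt 1000 ]; then
            echo "out of range (0-1000): $n" >&2
            return 1
        fi
    done
    # Assumed ordering rule; see the note above.
    if [ "$min" -gt "$initial" ] || [ "$initial" -gt "$max" ]; then
        echo "expected workerMin <= workerInitial <= workerMax" >&2
        return 1
    fi
    return 0
}
```

For example, check_worker_counts 2 1 10 succeeds, while check_worker_counts 5 1 3 fails because the initial count exceeds the maximum.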

Outputs

| Resource | Type | Description |
| --- | --- | --- |
| Network Security Group | Microsoft.Network/networkSecurityGroups | NSG with rules for SSH (22), JupyterLab (8000), Ray Web UI (8265), TensorBoard (6006) |
| Virtual Network | Microsoft.Network/virtualNetworks | VNet with separate subnets for head (10.33.0.0/16) and workers (10.32.0.0/16) |
| Public IP Address | Microsoft.Network/publicIpAddresses | Static public IP for the head node |
| Head Node VM | Microsoft.Compute/virtualMachines | DSVM-based head node running Ray |
| Worker VMSS | Microsoft.Compute/virtualMachineScaleSets | Auto-scaling worker nodes running Ray |
| GPU Extensions | Microsoft.Compute/virtualMachines/extensions | NVIDIA GPU driver extensions (when GPU VMs are used) |
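
Once the head node's public IP is known, the ports listed for the network security group map directly to the services a user would connect to. The sketch below uses a hypothetical helper, print_endpoints, that lists them for a given address. The port numbers are taken from the table above; whether each service is actually listening depends on the initialization script, and the Ray Web UI and TensorBoard ports are only opened when PublicWebUI is true.

```shell
#!/usr/bin/env bash

# Print one "service host:port" line per endpoint covered by the NSG rules.
print_endpoints() {
    local ip="$1"
    printf 'ssh %s:22\n' "$ip"
    printf 'jupyterlab %s:8000\n' "$ip"
    printf 'ray-web-ui %s:8265\n' "$ip"
    printf 'tensorboard %s:6006\n' "$ip"
}

# Example with a documentation-only address (203.0.113.0/24 is reserved
# for examples); substitute the real head node IP.
print_endpoints 203.0.113.10
```

One way to look up the address after deployment is az network public-ip list --resource-group my-ray-cluster-rg --query "[].ipAddress" --output tsv, assuming the resource group name used in the deployment command.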

Usage Examples

Deploy the template using the Azure CLI:

az deployment group create \
  --resource-group my-ray-cluster-rg \
  --template-file doc/azure/azure-ray-template.json \
  --parameters adminUsername=ubuntu \
               publicKey="$(cat ~/.ssh/id_rsa.pub)" \
               adminPassword='MySecurePassword123!' \
               headNodeSize=Standard_D2s_v3 \
               workerNodeSize=Standard_NC6 \
               workerInitial=2 \
               workerMin=1 \
               workerMax=10 \
               condaEnv=py38_pytorch \
               PythonPackages="ray[rllib] gym[atari]" \
               PublicWebUI=true
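
The command above assumes the resource group already exists and goes straight to creating resources. A more cautious flow, sketched below, creates the group first and asks Azure Resource Manager to validate the template before anything is provisioned. The location eastus, the DRY_RUN switch, and the run and preflight helpers are assumptions made for illustration. With DRY_RUN left at its default of 1, the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed names; adjust for your subscription. The location is an example only.
RESOURCE_GROUP="${RESOURCE_GROUP:-my-ray-cluster-rg}"
LOCATION="${LOCATION:-eastus}"
TEMPLATE="${TEMPLATE:-doc/azure/azure-ray-template.json}"

# Print the command when DRY_RUN=1 (default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create the resource group, then validate the template with the given
# key=value parameter overrides. Nothing is deployed by this function.
preflight() {
    run az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
    run az deployment group validate \
        --resource-group "$RESOURCE_GROUP" \
        --template-file "$TEMPLATE" \
        --parameters "$@"
}

preflight workerInitial=2 workerMin=1 workerMax=10
```

For a real run, set DRY_RUN=0 and also pass publicKey=... and adminPassword=..., since those two parameters have no defaults.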

Deploy via the Azure Portal by uploading the template in the Custom Deployment blade:

1. Navigate to Azure Portal > Create a resource > Template deployment
2. Click "Build your own template in the editor"
3. Load the azure-ray-template.json file
4. Fill in the parameters and deploy
