Implementation:Ray project Ray Azure Ray Template
| Knowledge Sources | Ray |
|---|---|
| Domains | Cloud_Deployment, Infrastructure, Azure |
| Last Updated | 2026-02-13 |
Overview
Azure Ray Template is an Azure Resource Manager (ARM) deployment template that provisions a complete Ray cluster with configurable head and worker virtual machines, networking, and optional GPU support on Microsoft Azure.
Description
This ARM template (azure-ray-template.json) declares the Azure resources needed to deploy a Ray cluster: a virtual network with separate subnets for head and worker nodes; a network security group with rules for SSH, JupyterLab, Ray Web UI, and TensorBoard access; public IP addresses and network interfaces; a head-node virtual machine; and a virtual machine scale set for the worker nodes. The template parameterizes VM sizes, worker counts (initial, minimum, maximum), spot or regular priority, the conda environment, Python packages, and Ray wheel URLs, then invokes an initialization script (azure-init.sh) through the Azure Custom Script Extension to bootstrap Ray on each node. The base OS is the Microsoft Data Science Virtual Machine (DSVM) Ubuntu 18.04 image, which provides pre-installed ML frameworks and conda environments such as py38_tensorflow and py38_pytorch.
Usage
Use this template to deploy a Ray cluster on Azure through the Azure Portal or the Azure CLI. Override the parameters at deployment time (or edit their defaults in the template's parameters section) to customize VM sizes, worker node scaling, the conda environment, and Python package installation. The template suits users who need a turnkey distributed Ray environment on Azure with auto-scaling worker nodes and optional GPU support.
Code Reference
Source Location
doc/azure/azure-ray-template.json
Signature
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "adminUsername": { "type": "string", "defaultValue": "ubuntu" },
    "publicKey": { "type": "securestring" },
    "adminPassword": { "type": "securestring" },
    "headNodeSize": { "type": "string", "defaultValue": "Standard_D2s_v3" },
    "headNodePriority": { "type": "string", "defaultValue": "Regular" },
    "workerNodeSize": { "type": "string", "defaultValue": "Standard_D2s_v3" },
    "workerNodePriority": { "type": "string", "defaultValue": "Spot" },
    "workerInitial": { "type": "int", "defaultValue": 1 },
    "workerMin": { "type": "int", "defaultValue": 1 },
    "workerMax": { "type": "int", "defaultValue": 1 },
    "condaEnv": { "type": "string", "defaultValue": "py38_tensorflow" },
    "PythonPackages": { "type": "string", "defaultValue": "ray[rllib] gym[atari]" },
    "PublicWebUI": { "type": "bool", "defaultValue": true }
  }
}
Import
This is a standalone JSON ARM template deployed via the Azure CLI or Azure Portal. No Python import is needed.
I/O Contract
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| adminUsername | string | ubuntu | Username for the Virtual Machine |
| publicKey | securestring | (none) | SSH Key for the Virtual Machine |
| adminPassword | securestring | (none) | Password for the Virtual Machine and JupyterLab |
| headNodeSize | string | Standard_D2s_v3 | The size of the head-node Virtual Machine |
| headNodePriority | string | Regular | Priority for head node (Regular, Low, Spot) |
| workerNodeSize | string | Standard_D2s_v3 | The size of the worker node Virtual Machine |
| workerNodePriority | string | Spot | Priority for worker nodes (Regular, Low, Spot) |
| workerInitial | int | 1 | Initial number of worker nodes (0-1000) |
| workerMin | int | 1 | Minimum number of worker nodes (0-1000) |
| workerMax | int | 1 | Maximum number of worker nodes (0-1000) |
| condaEnv | string | py38_tensorflow | Conda environment to select on the DSVM |
| PythonPackages | string | ray[rllib] gym[atari] | Python packages to install (space separated) |
| PublicWebUI | bool | true | Whether to open ports for the Ray Web UI and TensorBoard |
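The three worker-count parameters share the documented 0-1000 range, and it can help to check them locally before submitting a deployment. The following is a minimal sketch in POSIX shell, not part of the template. The helper names `check_worker_counts` and `write_ray_parameters` are hypothetical, and the ordering rule workerMin <= workerInitial <= workerMax is an assumption about sensible values rather than something the template is confirmed to enforce. The generated file follows the standard ARM deployment parameters format.

```shell
#!/bin/sh
# Hypothetical helper: verifies each count is an integer in 0-1000 (the range
# given in the table above) and, as an ASSUMED rule, that min <= initial <= max.
check_worker_counts() {
  min="$1"; initial="$2"; max="$3"
  for n in "$min" "$initial" "$max"; do
    case "$n" in
      ''|*[!0-9]*) echo "not a non-negative integer: $n" >&2; return 1 ;;
    esac
    [ "$n" -le 1000 ] || { echo "out of range (0-1000): $n" >&2; return 1; }
  done
  [ "$min" -le "$initial" ] && [ "$initial" -le "$max" ] || {
    echo "expected workerMin <= workerInitial <= workerMax" >&2
    return 1
  }
}

# Hypothetical helper: writes an ARM parameters file containing only the
# worker counts. Secure values (publicKey, adminPassword) are left out on purpose.
write_ray_parameters() {
  out="$1"
  check_worker_counts "$2" "$3" "$4" || return 1
  cat > "$out" <<EOF
{
  "\$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "workerMin": { "value": $min },
    "workerInitial": { "value": $initial },
    "workerMax": { "value": $max }
  }
}
EOF
}

# Example: write_ray_parameters ray-params.json 1 2 10
```

The resulting file can be passed to the Azure CLI as `--parameters @ray-params.json`, with `publicKey` and `adminPassword` supplied as separate `--parameters` values so that secrets stay out of the file.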
Outputs
| Resource | Type | Description |
|---|---|---|
| Network Security Group | Microsoft.Network/networkSecurityGroups | NSG with rules for SSH (22), JupyterLab (8000), Ray Web UI (8265), TensorBoard (6006) |
| Virtual Network | Microsoft.Network/virtualNetworks | VNet with separate subnets for head (10.33.0.0/16) and workers (10.32.0.0/16) |
| Public IP Address | Microsoft.Network/publicIpAddresses | Static public IP for the head node |
| Head Node VM | Microsoft.Compute/virtualMachines | DSVM-based head node running Ray |
| Worker VMSS | Microsoft.Compute/virtualMachineScaleSets | Auto-scaling worker nodes running Ray |
| GPU Extensions | Microsoft.Compute/virtualMachines/extensions | NVIDIA GPU driver extensions (when GPU VMs are used) |
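Because the network security group allows SSH on port 22 and the Ray Web UI uses port 8265, one way to reach the dashboard without relying on a publicly opened port is an SSH tunnel through the head node. The sketch below only prints the command to run. The function name `ray_dashboard_tunnel_cmd` is hypothetical, and it assumes the dashboard is reachable on the head node's localhost at port 8265, which this page does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: prints an ssh command that forwards local port 8265 to
# the Ray Web UI port on the head node (ASSUMED to listen on localhost:8265).
# The head node's public IP can be looked up with, for example:
#   az network public-ip list --resource-group my-ray-cluster-rg \
#     --query "[].ipAddress" --output tsv
ray_dashboard_tunnel_cmd() {
  head_ip="$1"
  user="${2:-ubuntu}"   # "ubuntu" is the template's default adminUsername
  echo "ssh -N -L 8265:localhost:8265 ${user}@${head_ip}"
}

# Example with a documentation-only IP address:
ray_dashboard_tunnel_cmd 203.0.113.10
```

While the printed command is running, the dashboard would be available at http://localhost:8265 on the local machine, under the assumption stated above.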
Usage Examples
Deploy the template using the Azure CLI:
az deployment group create \
  --resource-group my-ray-cluster-rg \
  --template-file doc/azure/azure-ray-template.json \
  --parameters adminUsername=ubuntu \
    publicKey="$(cat ~/.ssh/id_rsa.pub)" \
    adminPassword='MySecurePassword123!' \
    headNodeSize=Standard_D2s_v3 \
    workerNodeSize=Standard_NC6 \
    workerInitial=2 \
    workerMin=1 \
    workerMax=10 \
    condaEnv=py38_pytorch \
    PythonPackages="ray[rllib] gym[atari]" \
    PublicWebUI=true
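Before creating resources, it may be worth previewing what the deployment would change. The Azure CLI provides `az deployment group what-if` for this purpose. The wrapper below is a sketch only: the function name `preview_ray_deployment` and the `DRY_RUN` variable are hypothetical conveniences, and any arguments after the template path are passed through unchanged.

```shell
#!/bin/sh
# Hypothetical wrapper around `az deployment group what-if`.
# Usage: preview_ray_deployment <resource-group> <template-file> [extra az args...]
# With DRY_RUN=1 the command is printed instead of executed.
preview_ray_deployment() {
  [ "$#" -ge 2 ] || {
    echo "usage: preview_ray_deployment <resource-group> <template-file> [args...]" >&2
    return 2
  }
  rg="$1"; template="$2"
  shift 2
  # Rebuild the positional parameters as the full az command line.
  set -- az deployment group what-if \
    --resource-group "$rg" \
    --template-file "$template" \
    "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

The extra arguments would normally be the same `--parameters` values used for `az deployment group create`, since `publicKey` and `adminPassword` have no defaults.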
Deploy via the Azure Portal by uploading the template in the Custom Deployment blade:
1. Navigate to Azure Portal > Create a resource > Template deployment
2. Click "Build your own template in the editor"
3. Load the azure-ray-template.json file
4. Fill in the parameters and deploy
Related Pages
- Ray_project_Ray_Sphinx_Configuration - Main Sphinx documentation configuration for Ray
- Ray - Ray project repository