Overview
EC2_CloudFormation is an AWS CloudFormation template for deploying a single-instance TorchServe server on EC2. It provisions a VPC, an EC2 instance with HTTPS endpoints using self-signed certificates, and CloudWatch monitoring. This is a simpler alternative to the ASG version, intended for development, testing, or low-traffic production scenarios where horizontal scaling is not required.
Description
The template deploys one TorchServe instance on EC2 and omits the Application Load Balancer, Auto Scaling Group, and EFS shared storage found in the ASG version. TLS is terminated on the instance itself with a self-signed certificate, so clients must either trust that certificate explicitly or skip verification (as the `curl -k` examples on this page do).
Key Resources
- VPC: Dedicated Virtual Private Cloud with a public subnet, internet gateway, and security groups
- EC2 Instance: Single instance running TorchServe with all three endpoints (inference, management, metrics) exposed over HTTPS
- Self-Signed Certificates: HTTPS configuration using self-signed TLS certificates generated during instance initialization
- CloudWatch: Basic monitoring for instance health and resource utilization
Comparison with ASG Version
| Feature | EC2 (Single Instance) | EC2 ASG (Multi-Instance) |
|---|---|---|
| Load Balancer | No | ALB with 3 target groups |
| Auto Scaling | No | CPU-based (3-5 instances) |
| Shared Storage | No | EFS model store |
| HTTPS | Self-signed certs | ALB TLS termination |
| Template Size | 447 lines | 648 lines |
| Use Case | Dev/test, low traffic | Production, high availability |
Parameters
| Parameter | Description | Required |
|---|---|---|
| KeyName | EC2 key pair name for SSH access | Yes |
| ServerCertPassword | Password for the self-signed certificate keystore | Yes |
| InstanceType | EC2 instance type (e.g., g4dn.xlarge) | Yes |
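The same three parameters can be kept in a JSON file and passed to the AWS CLI with `--parameters file://...`, which keeps the deploy command short. The following is a minimal sketch; the file name `torchserve-params.json` and all values are placeholders, not taken from the template.

```shell
#!/usr/bin/env bash
# Write the stack parameters to a JSON file in the list format the AWS CLI accepts.
# The file name and every value below are placeholders (assumptions).
PARAMS_FILE="${PARAMS_FILE:-torchserve-params.json}"

cat > "$PARAMS_FILE" <<'EOF'
[
  {"ParameterKey": "KeyName",            "ParameterValue": "my-key-pair"},
  {"ParameterKey": "ServerCertPassword", "ParameterValue": "change-me"},
  {"ParameterKey": "InstanceType",       "ParameterValue": "g4dn.xlarge"}
]
EOF

# Restrict access, since the file holds the certificate password in plain text.
chmod 600 "$PARAMS_FILE"
echo "Wrote $PARAMS_FILE"
```

The file would then be referenced as `--parameters file://torchserve-params.json` in place of the inline `ParameterKey=...` arguments.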
Code Reference
Source Location
| File | Lines | Repository |
|---|---|---|
| examples/cloudformation/ec2.yaml | L1-447 | pytorch/serve |
Usage
Deploy the template using the AWS CLI:
# Deploy the single-instance CloudFormation stack
aws cloudformation create-stack \
  --stack-name torchserve-single \
  --template-body file://examples/cloudformation/ec2.yaml \
  --parameters \
    ParameterKey=KeyName,ParameterValue=my-key-pair \
    ParameterKey=ServerCertPassword,ParameterValue=MySecurePassword123 \
    ParameterKey=InstanceType,ParameterValue=g4dn.xlarge \
  --capabilities CAPABILITY_IAM
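`create-stack` returns as soon as the request is accepted, so the instance is not usable right away. One way to block until creation finishes is the CLI's `wait stack-create-complete` subcommand. This is a sketch that assumes the AWS CLI is installed and configured; the function name `wait_for_stack` is illustrative.

```shell
#!/usr/bin/env bash
# Block until the stack finishes creating, then print its final status.
# Assumes a configured AWS CLI; wait_for_stack is a hypothetical helper name.
wait_for_stack() {
  local stack_name="$1"
  # Returns non-zero if creation fails or the waiter times out.
  aws cloudformation wait stack-create-complete --stack-name "$stack_name" || return 1
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" \
    --query 'Stacks[0].StackStatus' \
    --output text
}

# Example: wait_for_stack torchserve-single
```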
Template Structure (Excerpt)
AWSTemplateFormatVersion: '2010-09-09'
Description: Single-instance TorchServe with HTTPS
Parameters:
  KeyName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: EC2 key pair for SSH access
  ServerCertPassword:
    Type: String
    NoEcho: true
    Description: Password for self-signed certificate
  InstanceType:
    Type: String
    Default: g4dn.xlarge
    Description: EC2 instance type
Resources:
  # VPC and Networking
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      MapPublicIpOnLaunch: true
  # Security Group - open inference, management, metrics ports
  TorchServeSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: TorchServe HTTPS endpoints
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 8443 # HTTPS inference
          ToPort: 8443
        - IpProtocol: tcp
          FromPort: 8444 # HTTPS management
          ToPort: 8444
        - IpProtocol: tcp
          FromPort: 8445 # HTTPS metrics
          ToPort: 8445
  # EC2 Instance with UserData to install TorchServe and generate certs
  TorchServeInstance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      KeyName: !Ref KeyName
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          # Install TorchServe
          pip install torchserve torch-model-archiver
          # Generate self-signed certificate
          keytool -genkey -keyalg RSA -alias ts \
            -keystore keystore.p12 \
            -storepass ${ServerCertPassword} \
            -storetype PKCS12
          # Start TorchServe with HTTPS
          torchserve --start --model-store /home/model-store \
            --ts-config config.properties
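The excerpt starts TorchServe with `config.properties` but does not show that file, and `keytool -genkey` without a subject normally prompts for one. The sketch below shows one way both steps could look, using TorchServe's keystore-based SSL settings. The directory, the `CN=torchserve` subject, the validity period, and the helper names are assumptions, not values from `ec2.yaml`.

```shell
#!/usr/bin/env bash
# Sketch only: paths, alias, certificate subject, and function names are assumptions.

# Write a TorchServe config that serves all three APIs over HTTPS from a PKCS12 keystore.
write_ts_config() {
  local dir="$1" password="$2"
  cat > "$dir/config.properties" <<EOF
inference_address=https://0.0.0.0:8443
management_address=https://0.0.0.0:8444
metrics_address=https://0.0.0.0:8445
keystore=$dir/keystore.p12
keystore_pass=$password
keystore_type=PKCS12
EOF
}

# Generate a self-signed key pair without interactive prompts for the subject.
generate_keystore() {
  local dir="$1" password="$2"
  # -dname supplies the certificate subject up front so keytool does not ask for it.
  keytool -genkeypair -keyalg RSA -keysize 2048 -alias ts \
    -dname "CN=torchserve" -validity 365 \
    -keystore "$dir/keystore.p12" -storetype PKCS12 \
    -storepass "$password"
}

# Example:
#   generate_keystore /home/ubuntu "$CERT_PASSWORD"
#   write_ts_config   /home/ubuntu "$CERT_PASSWORD"
```

Note that `keytool` rejects keystore passwords shorter than six characters, which constrains the value passed as `ServerCertPassword`.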
I/O Contract
| Input | Type | Description |
|---|---|---|
| CloudFormation Parameters | YAML key-value pairs | KeyName, ServerCertPassword, InstanceType |
Usage Examples
Example 1: Deploy and access HTTPS inference
# Get the instance public IP after stack creation
INSTANCE_IP=$(aws cloudformation describe-stacks \
  --stack-name torchserve-single \
  --query 'Stacks[0].Outputs[?OutputKey==`InstancePublicIP`].OutputValue' \
  --output text)
# Send inference request over HTTPS (skip cert verification for self-signed)
curl -k -X POST https://${INSTANCE_IP}:8443/predictions/resnet-18 \
  -T image.jpg
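The instance may still be installing TorchServe when the stack reports completion, and `INSTANCE_IP` is empty if the output lookup finds nothing. A small readiness check against the inference API's `/ping` endpoint can guard against both cases. The function name, retry count, and delay below are illustrative choices.

```shell
#!/usr/bin/env bash
# Poll the TorchServe ping endpoint until it answers or the attempts run out.
# wait_for_torchserve, the retry count, and the delay are hypothetical defaults.
wait_for_torchserve() {
  local host="$1" attempts="${2:-30}" delay="${3:-10}"
  local i
  if [ -z "$host" ]; then
    echo "No host given (is INSTANCE_IP empty?)" >&2
    return 2
  fi
  for ((i = 1; i <= attempts; i++)); do
    # -k skips certificate verification (self-signed); -f fails on HTTP errors.
    if curl -ksf "https://${host}:8443/ping" > /dev/null; then
      echo "TorchServe is reachable after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "TorchServe did not respond on ${host}:8443" >&2
  return 1
}

# Example: wait_for_torchserve "$INSTANCE_IP"
```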
Example 2: Register model via HTTPS management endpoint
# Register a model on the single instance
curl -k -X POST "https://${INSTANCE_IP}:8444/models?url=resnet-18.mar&initial_workers=1&synchronous=true"
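After registration, the management API can confirm that the model is loaded: `GET /models` lists registered models and `GET /models/{name}` describes one. A sketch with small helpers follows; the helper names are illustrative and not part of the template.

```shell
#!/usr/bin/env bash
# Build management API URLs for the instance; helper names are hypothetical.
mgmt_url() {
  local host="$1" path="${2:-}"
  echo "https://${host}:8444/models${path}"
}

# List all registered models (-k: the certificate is self-signed).
list_models() { curl -k "$(mgmt_url "$1")"; }

# Describe a single model, including its worker status.
describe_model() { curl -k "$(mgmt_url "$1" "/$2")"; }

# Example:
#   list_models "$INSTANCE_IP"
#   describe_model "$INSTANCE_IP" resnet-18
```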
Example 3: Check metrics
# Retrieve Prometheus metrics
curl -k https://${INSTANCE_IP}:8445/metrics
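The metrics endpoint returns every metric in Prometheus text format, so it is often easier to filter for one metric. The sketch below assumes a metric named `ts_inference_requests_total` is exposed; which metrics appear depends on the TorchServe version and metrics configuration. The helper name `filter_metric` is illustrative.

```shell
#!/usr/bin/env bash
# Keep only the samples for one metric from Prometheus text-format input on stdin.
# Comment lines (# HELP / # TYPE) are dropped. filter_metric is a hypothetical helper.
filter_metric() {
  local name="$1"
  # Match the metric name followed by a label block or a space, so that
  # metrics sharing the same prefix are not included.
  grep -v '^#' | grep -E "^${name}(\{| )"
}

# Example (metric name is an assumption):
#   curl -ks "https://${INSTANCE_IP}:8445/metrics" | filter_metric ts_inference_requests_total
```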