Implementation:Pytorch Serve EC2 CloudFormation

Overview

EC2_CloudFormation is an AWS CloudFormation template that deploys a single-instance TorchServe server on EC2. It provisions a VPC, an EC2 instance that exposes HTTPS endpoints secured with self-signed certificates, and CloudWatch monitoring. It is a simpler alternative to the Auto Scaling Group (ASG) version, intended for development, testing, or low-traffic production scenarios where horizontal scaling is not required.

Field	Value
Implementation Name	EC2_CloudFormation
Type	Infrastructure as Code
Workflow	Cloud_Deployment
Domains	Cloud_Infrastructure, Model_Serving
Knowledge Sources	Pytorch_Serve
Last Updated	2026-02-13 18:52 GMT

Description

This CloudFormation template automates the deployment of a single TorchServe instance on AWS EC2. Unlike the ASG version, it does not include an Application Load Balancer, Auto Scaling Group, or EFS shared storage. Instead, it serves HTTPS directly from the EC2 instance using self-signed certificates. Traffic is encrypted in transit, but clients cannot validate the certificate against a trusted CA (hence `curl -k` in the usage examples), which makes the template best suited to development and testing environments.

Key Resources

VPC: Dedicated Virtual Private Cloud with a public subnet, internet gateway, and security groups
EC2 Instance: Single instance running TorchServe with all three endpoints (inference, management, metrics) exposed over HTTPS
Self-Signed Certificates: HTTPS configuration using self-signed TLS certificates generated during instance initialization
CloudWatch: Basic monitoring for instance health and resource utilization

Comparison with ASG Version

Feature	EC2 (Single Instance)	EC2 ASG (Multi-Instance)
Load Balancer	No	ALB with 3 target groups
Auto Scaling	No	CPU-based (3-5 instances)
Shared Storage	No	EFS model store
HTTPS	Self-signed certs	ALB TLS termination
Template Size	447 lines	648 lines
Use Case	Dev/test, low traffic	Production, high availability

Parameters

Parameter	Description	Required
`KeyName`	EC2 key pair name for SSH access	Yes
`ServerCertPassword`	Password for the self-signed certificate keystore	Yes
`InstanceType`	EC2 instance type (defaults to `g4dn.xlarge`)	No
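The three parameters above can be assembled into the `ParameterKey=...,ParameterValue=...` shorthand that `aws cloudformation create-stack --parameters` accepts. The following is a minimal sketch, not part of the template: `build_stack_params` is a hypothetical helper name, and the fallback to `g4dn.xlarge` assumes the default shown in the template excerpt.

```shell
# Hypothetical helper (not part of the template): prints one
# ParameterKey=...,ParameterValue=... argument per line.
build_stack_params() {
  local key_name="$1" cert_password="$2" instance_type="${3:-g4dn.xlarge}"
  # KeyName and ServerCertPassword have no default, so refuse empty values.
  if [ -z "$key_name" ] || [ -z "$cert_password" ]; then
    echo "usage: build_stack_params KEY_NAME CERT_PASSWORD [INSTANCE_TYPE]" >&2
    return 1
  fi
  printf 'ParameterKey=KeyName,ParameterValue=%s\n' "$key_name"
  printf 'ParameterKey=ServerCertPassword,ParameterValue=%s\n' "$cert_password"
  printf 'ParameterKey=InstanceType,ParameterValue=%s\n' "$instance_type"
}
```

The shorthand syntax is sensitive to commas and whitespace in values, so a password containing either would need the JSON parameter form instead.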

Code Reference

Source Location

File	Lines	Repository
`examples/cloudformation/ec2.yaml`	L1-447	pytorch/serve

Usage

Deploy the template using the AWS CLI:

# Deploy the single-instance CloudFormation stack
aws cloudformation create-stack \
  --stack-name torchserve-single \
  --template-body file://examples/cloudformation/ec2.yaml \
  --parameters \
    ParameterKey=KeyName,ParameterValue=my-key-pair \
    ParameterKey=ServerCertPassword,ParameterValue=MySecurePassword123 \
    ParameterKey=InstanceType,ParameterValue=g4dn.xlarge \
  --capabilities CAPABILITY_IAM
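`create-stack` returns as soon as the request is accepted, so a script usually has to wait before reading stack outputs. The sketch below wraps the AWS CLI waiter `aws cloudformation wait stack-create-complete` and then lists the outputs; `wait_for_stack` is a hypothetical helper name, and the default stack name is the one used in the command above.

```shell
# Hypothetical helper: block until stack creation finishes, then list outputs.
wait_for_stack() {
  local stack_name="${1:-torchserve-single}"
  # The waiter exits non-zero if creation fails, rolls back, or times out.
  aws cloudformation wait stack-create-complete --stack-name "$stack_name" || {
    echo "stack $stack_name did not reach CREATE_COMPLETE" >&2
    return 1
  }
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" \
    --query 'Stacks[0].Outputs' \
    --output table
}
```

Calling `wait_for_stack torchserve-single` after the deploy command prints whatever outputs the template defines once the stack is ready.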

Template Structure (Excerpt)

AWSTemplateFormatVersion: '2010-09-09'
Description: Single-instance TorchServe with HTTPS

Parameters:
  KeyName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: EC2 key pair for SSH access
  ServerCertPassword:
    Type: String
    NoEcho: true
    Description: Password for self-signed certificate
  InstanceType:
    Type: String
    Default: g4dn.xlarge
    Description: EC2 instance type

Resources:
  # VPC and Networking
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16

  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      MapPublicIpOnLaunch: true

  # Security Group - open inference, management, metrics ports
  TorchServeSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: TorchServe HTTPS endpoints
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 8443   # HTTPS inference
          ToPort: 8443
        - IpProtocol: tcp
          FromPort: 8444   # HTTPS management
          ToPort: 8444
        - IpProtocol: tcp
          FromPort: 8445   # HTTPS metrics
          ToPort: 8445

  # EC2 Instance with UserData to install TorchServe and generate certs
  TorchServeInstance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      KeyName: !Ref KeyName
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          # Install TorchServe
          pip install torchserve torch-model-archiver
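          # Generate self-signed certificate; -dname keeps keytool from
          # prompting, since UserData runs without a terminal
          keytool -genkeypair -keyalg RSA -keysize 2048 -validity 365 -alias ts \
            -keystore keystore.p12 \
            -storepass '${ServerCertPassword}' \
            -storetype PKCS12 \
            -dname "CN=torchserve"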
          # Start TorchServe with HTTPS
          torchserve --start --model-store /home/model-store \
            --ts-config config.properties
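The excerpt starts TorchServe with `--ts-config config.properties` but does not show that file. The sketch below writes one plausible version using TorchServe's documented SSL settings (`inference_address`, `management_address`, `metrics_address`, `keystore`, `keystore_pass`, `keystore_type`). The helper name `write_ts_config` and the exact values are assumptions; the real template may configure HTTPS differently.

```shell
# Hypothetical helper: write a TorchServe config that serves all three
# APIs over HTTPS from a PKCS12 keystore. Values are illustrative.
write_ts_config() {
  local out_file="$1" keystore_path="$2" keystore_pass="$3"
  cat > "$out_file" <<EOF
inference_address=https://0.0.0.0:8443
management_address=https://0.0.0.0:8444
metrics_address=https://0.0.0.0:8445
keystore=${keystore_path}
keystore_pass=${keystore_pass}
keystore_type=PKCS12
EOF
  # The file holds the keystore password, so restrict it to the owner.
  chmod 600 "$out_file"
}
```

Binding to `0.0.0.0` is what makes the ports reachable from outside the instance; the security group then decides which clients can actually connect.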

I/O Contract

Input	Type	Description
CloudFormation Parameters	YAML key-value pairs	`KeyName`, `ServerCertPassword`, `InstanceType`

Output	Type	Description
Instance Public IP	String	Public IP address of the EC2 instance
HTTPS Inference Endpoint	String	`https://{public_ip}:8443` for predictions
HTTPS Management Endpoint	String	`https://{public_ip}:8444` for model management
HTTPS Metrics Endpoint	String	`https://{public_ip}:8445` for metrics
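The three endpoint outputs differ only in port, so client scripts can derive them from the public IP. A minimal sketch of that mapping follows; `ts_endpoint` is a hypothetical helper name, and the ports are the ones listed in the table above.

```shell
# Hypothetical helper: map an API name to its HTTPS endpoint URL.
ts_endpoint() {
  local public_ip="$1" api="$2" port
  case "$api" in
    inference)  port=8443 ;;
    management) port=8444 ;;
    metrics)    port=8445 ;;
    *) echo "unknown API: $api" >&2; return 1 ;;
  esac
  printf 'https://%s:%s\n' "$public_ip" "$port"
}
```

For example, `ts_endpoint "$INSTANCE_IP" management` yields the base URL for model registration calls.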

Usage Examples

Example 1: Deploy and access HTTPS inference

# Get the instance public IP after stack creation
INSTANCE_IP=$(aws cloudformation describe-stacks \
  --stack-name torchserve-single \
  --query 'Stacks[0].Outputs[?OutputKey==`InstancePublicIP`].OutputValue' \
  --output text)

# Send inference request over HTTPS (skip cert verification for self-signed)
curl -k -X POST https://${INSTANCE_IP}:8443/predictions/resnet-18 \
  -T image.jpg
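The instance can report as running before TorchServe has finished installing and starting, so an inference request sent right after stack creation may fail. The sketch below polls the inference API's `/ping` health check until it reports healthy. `wait_until_healthy` is a hypothetical helper name, and the match assumes the response body contains `"status": "Healthy"`.

```shell
# Hypothetical helper: poll /ping until TorchServe reports healthy.
# Arguments: base URL, max attempts (default 30), delay in seconds (default 10).
wait_until_healthy() {
  local base_url="$1" attempts="${2:-30}" delay="${3:-10}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -k skips certificate verification (self-signed); -f fails on HTTP errors.
    if curl -ksf --max-time 5 "${base_url}/ping" | grep -q '"status": "Healthy"'; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "TorchServe not healthy after ${attempts} attempts" >&2
  return 1
}
```

A typical call is `wait_until_healthy "https://${INSTANCE_IP}:8443"` before sending the first prediction request.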

Example 2: Register model via HTTPS management endpoint

# Register a model on the single instance
curl -k -X POST "https://${INSTANCE_IP}:8444/models?url=resnet-18.mar&initial_workers=1&synchronous=true"

Example 3: Check metrics

# Retrieve Prometheus metrics
curl -k https://${INSTANCE_IP}:8445/metrics

Related Pages

Principle:Pytorch_Serve_Cloud_Deployment - Cloud deployment principle this template implements
Implementation:Pytorch_Serve_EC2_ASG_CloudFormation - Multi-instance ASG version with ALB and EFS
Implementation:Pytorch_Serve_Management_API - Management API available through HTTPS on port 8444
Implementation:Pytorch_Serve_Metrics_API - Metrics API available through HTTPS on port 8445
