Implementation:Pytorch Serve EC2 CloudFormation

From Leeroopedia

Overview

EC2_CloudFormation is an AWS CloudFormation template for deploying a single-instance TorchServe server on EC2. It provisions a VPC, an EC2 instance with HTTPS endpoints using self-signed certificates, and CloudWatch monitoring. This is a simpler alternative to the ASG version, intended for development, testing, or low-traffic production scenarios where horizontal scaling is not required.

| Field | Value |
|---|---|
| Implementation Name | EC2_CloudFormation |
| Type | Infrastructure as Code |
| Workflow | Cloud_Deployment |
| Domains | Cloud_Infrastructure, Model_Serving |
| Knowledge Sources | Pytorch_Serve |
| Last Updated | 2026-02-13 18:52 GMT |

Description

This CloudFormation template automates the deployment of a single TorchServe instance on AWS EC2. Unlike the ASG version, it does not include an Application Load Balancer, Auto Scaling Group, or EFS shared storage. Instead, it configures HTTPS endpoints directly on the EC2 instance using self-signed certificates, making it suitable for secure development and testing environments.

Key Resources

  • VPC: Dedicated Virtual Private Cloud with a public subnet, internet gateway, and security groups
  • EC2 Instance: Single instance running TorchServe with all three endpoints (inference, management, metrics) exposed over HTTPS
  • Self-Signed Certificates: HTTPS configuration using self-signed TLS certificates generated during instance initialization
  • CloudWatch: Basic monitoring for instance health and resource utilization

Comparison with ASG Version

| Feature | EC2 (Single Instance) | EC2 ASG (Multi-Instance) |
|---|---|---|
| Load Balancer | No | ALB with 3 target groups |
| Auto Scaling | No | CPU-based (3-5 instances) |
| Shared Storage | No | EFS model store |
| HTTPS | Self-signed certs | ALB TLS termination |
| Template Size | 447 lines | 648 lines |
| Use Case | Dev/test, low traffic | Production, high availability |

Parameters

| Parameter | Description | Required |
|---|---|---|
| KeyName | EC2 key pair name for SSH access | Yes |
| ServerCertPassword | Password for the self-signed certificate keystore (NoEcho) | Yes |
| InstanceType | EC2 instance type (e.g., g4dn.xlarge) | No (defaults to g4dn.xlarge in the excerpt below) |
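
The parameters above can also be supplied from a JSON file instead of inline `ParameterKey=...` pairs. The following is a minimal sketch under stated assumptions: the file name `torchserve-params.json` and all values are placeholders, not taken from the template. The AWS CLI accepts such a file through `--parameters file://torchserve-params.json`.

```shell
#!/bin/sh
# Sketch: write a CloudFormation parameter file.
# The file name and every value below are placeholders (assumptions).
PARAMS_FILE="torchserve-params.json"

cat > "$PARAMS_FILE" <<'EOF'
[
  {"ParameterKey": "KeyName", "ParameterValue": "my-key-pair"},
  {"ParameterKey": "ServerCertPassword", "ParameterValue": "change-me"},
  {"ParameterKey": "InstanceType", "ParameterValue": "g4dn.xlarge"}
]
EOF

# Restrict permissions because the file holds the keystore password.
chmod 600 "$PARAMS_FILE"

echo "Wrote $PARAMS_FILE"
```

Keeping the password in a permission-restricted file avoids leaving it in shell history, although the file itself should still be treated as a secret.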

Code Reference

Source Location

| File | Lines | Repository |
|---|---|---|
| examples/cloudformation/ec2.yaml | L1-447 | pytorch/serve |

Usage

Deploy the template using the AWS CLI:

# Deploy the single-instance CloudFormation stack
aws cloudformation create-stack \
  --stack-name torchserve-single \
  --template-body file://examples/cloudformation/ec2.yaml \
  --parameters \
    ParameterKey=KeyName,ParameterValue=my-key-pair \
    ParameterKey=ServerCertPassword,ParameterValue=MySecurePassword123 \
    ParameterKey=InstanceType,ParameterValue=g4dn.xlarge \
  --capabilities CAPABILITY_IAM
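
`create-stack` returns as soon as the request is accepted, so a script typically needs to wait before querying outputs. A minimal sketch of one way to do that, using the AWS CLI's `wait stack-create-complete` subcommand; the helper name `wait_for_stack` is hypothetical, and calling it requires configured AWS credentials.

```shell
#!/bin/sh
# Sketch: block until stack creation finishes, then print the final status.
# wait_for_stack is a hypothetical helper; nothing runs until it is called.
wait_for_stack() {
  stack_name="$1"
  if aws cloudformation wait stack-create-complete --stack-name "$stack_name"; then
    aws cloudformation describe-stacks \
      --stack-name "$stack_name" \
      --query 'Stacks[0].StackStatus' \
      --output text
  else
    echo "stack $stack_name did not reach CREATE_COMPLETE" >&2
    return 1
  fi
}

# Usage (after create-stack has been submitted):
#   wait_for_stack torchserve-single
```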

Template Structure (Excerpt)

AWSTemplateFormatVersion: '2010-09-09'
Description: Single-instance TorchServe with HTTPS

Parameters:
  KeyName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: EC2 key pair for SSH access
  ServerCertPassword:
    Type: String
    NoEcho: true
    Description: Password for self-signed certificate
  InstanceType:
    Type: String
    Default: g4dn.xlarge
    Description: EC2 instance type

Resources:
  # VPC and Networking
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16

  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      MapPublicIpOnLaunch: true

  # Security Group - open inference, management, metrics ports
  TorchServeSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: TorchServe HTTPS endpoints
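      SecurityGroupIngress:
        # Each rule needs a source; 0.0.0.0/0 is an assumed placeholder,
        # narrow it to your client range where possible
        - IpProtocol: tcp
          FromPort: 8443   # HTTPS inference
          ToPort: 8443
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 8444   # HTTPS management
          ToPort: 8444
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 8445   # HTTPS metrics
          ToPort: 8445
          CidrIp: 0.0.0.0/0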

  # EC2 Instance with UserData to install TorchServe and generate certs
  TorchServeInstance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      KeyName: !Ref KeyName
      UserData:
        Fn::Base64: !Sub |
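          #!/bin/bash
          # Install TorchServe and the model archiver
          pip install torchserve torch-model-archiver
          # Generate a self-signed certificate non-interactively.
          # -dname avoids the prompts that would stall UserData (the CN value is a placeholder).
          keytool -genkeypair -keyalg RSA -keysize 2048 -alias ts \
            -keystore keystore.p12 \
            -storepass '${ServerCertPassword}' \
            -storetype PKCS12 \
            -dname "CN=torchserve" \
            -validity 365
          # Start TorchServe; config.properties is expected to enable HTTPS and point at keystore.p12
          torchserve --start --model-store /home/model-store \
            --ts-config config.properties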
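
The excerpt starts TorchServe with `--ts-config config.properties` but does not show that file. Below is a minimal sketch of what such a file might contain, assuming the PKCS12 keystore generated above; the actual contents in `ec2.yaml` may differ. The property names are TorchServe configuration keys, while `SERVER_CERT_PASSWORD` is a hypothetical environment variable used here only for illustration.

```shell
#!/bin/sh
# Sketch: write a TorchServe config.properties that serves all three APIs over HTTPS.
# KEYSTORE_PATH and SERVER_CERT_PASSWORD are assumptions, not values from ec2.yaml.
KEYSTORE_PATH="keystore.p12"
KEYSTORE_PASS="${SERVER_CERT_PASSWORD:-change-me}"

cat > config.properties <<EOF
inference_address=https://0.0.0.0:8443
management_address=https://0.0.0.0:8444
metrics_address=https://0.0.0.0:8445
keystore=${KEYSTORE_PATH}
keystore_pass=${KEYSTORE_PASS}
keystore_type=PKCS12
EOF

echo "Wrote config.properties"
```

Binding to `0.0.0.0` rather than the default loopback address is what makes the endpoints reachable from outside the instance, subject to the security group rules.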

I/O Contract

| Input | Type | Description |
|---|---|---|
| CloudFormation Parameters | YAML key-value pairs | KeyName, ServerCertPassword, InstanceType |

| Output | Type | Description |
|---|---|---|
| Instance Public IP | String | Public IP address of the EC2 instance |
| HTTPS Inference Endpoint | String | https://{public_ip}:8443 for predictions |
| HTTPS Management Endpoint | String | https://{public_ip}:8444 for model management |
| HTTPS Metrics Endpoint | String | https://{public_ip}:8445 for metrics |
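
The three endpoint URLs follow directly from the instance's public IP and the fixed ports listed above. A small sketch that derives them; the function name `torchserve_endpoints` is hypothetical and `203.0.113.10` is a documentation-range placeholder address.

```shell
#!/bin/sh
# Sketch: derive the three HTTPS endpoint URLs from a public IP.
# torchserve_endpoints is a hypothetical helper name.
torchserve_endpoints() {
  ip="$1"
  printf 'inference=https://%s:8443\n' "$ip"
  printf 'management=https://%s:8444\n' "$ip"
  printf 'metrics=https://%s:8445\n' "$ip"
}

# Placeholder address from the documentation range (RFC 5737).
torchserve_endpoints 203.0.113.10
```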

Usage Examples

Example 1: Deploy and access HTTPS inference

# Get the instance public IP after stack creation
INSTANCE_IP=$(aws cloudformation describe-stacks \
  --stack-name torchserve-single \
  --query 'Stacks[0].Outputs[?OutputKey==`InstancePublicIP`].OutputValue' \
  --output text)

# Send inference request over HTTPS (skip cert verification for self-signed)
curl -k -X POST https://${INSTANCE_IP}:8443/predictions/resnet-18 \
  -T image.jpg
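
The stack can report completion before TorchServe has finished installing and starting, so the first request may fail. A hedged sketch of a readiness check that polls the inference API's `/ping` health endpoint until it reports a healthy status; the helper name, retry count, and delay are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: poll the TorchServe health endpoint until it responds as healthy.
# wait_until_healthy is a hypothetical helper; attempts and delay are arbitrary defaults.
wait_until_healthy() {
  ip="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -k skips certificate verification (self-signed), -f fails on HTTP errors.
    if curl -k -s -f "https://${ip}:8443/ping" | grep -q 'Healthy'; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage:
#   wait_until_healthy "$INSTANCE_IP"
```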

Example 2: Register model via HTTPS management endpoint

# Register a model on the single instance
curl -k -X POST "https://${INSTANCE_IP}:8444/models?url=resnet-18.mar&initial_workers=1&synchronous=true"
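
After registration, the same management endpoint can be used to confirm the model is loaded and to adjust its worker count. A minimal sketch with hypothetical helper names, assuming the TorchServe management API paths `/models`, `/models/{name}`, and the `min_worker` query parameter on `PUT`.

```shell
#!/bin/sh
# Sketch: thin wrappers around the management API on port 8444.
# All function names are hypothetical; -k is needed because the certificate is self-signed.
mgmt_url() { printf 'https://%s:8444/models%s' "$1" "$2"; }

# List all registered models.
list_models() { curl -k -s "$(mgmt_url "$1" "")"; }

# Show the status and workers of one model.
describe_model() { curl -k -s "$(mgmt_url "$1" "/$2")"; }

# Set the minimum number of workers for a model and wait for the change to apply.
scale_workers() { curl -k -s -X PUT "$(mgmt_url "$1" "/$2?min_worker=$3&synchronous=true")"; }

# Usage:
#   describe_model "$INSTANCE_IP" resnet-18
#   scale_workers  "$INSTANCE_IP" resnet-18 2
```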

Example 3: Check metrics

# Retrieve Prometheus metrics
curl -k https://${INSTANCE_IP}:8445/metrics
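
The metrics endpoint returns Prometheus text format, which is easy to post-process in a pipeline. A sketch that sums one counter across all label sets, assuming each sample line ends with its value and carries no trailing timestamp; `sum_metric` is a hypothetical helper, and `ts_inference_requests_total` is used as an example metric name.

```shell
#!/bin/sh
# Sketch: sum every sample of one metric from Prometheus text read on stdin.
# sum_metric is a hypothetical helper; it assumes the value is the last field on the line.
sum_metric() {
  metric="$1"
  awk -v m="$metric" '
    # Skip comment lines; keep lines whose name is exactly m (followed by "{" or a space).
    $0 !~ /^#/ && index($0, m) == 1 && substr($0, length(m) + 1, 1) ~ /[{ ]/ {
      total += $NF
    }
    END { printf "%d\n", total }
  '
}

# Usage:
#   curl -k -s "https://${INSTANCE_IP}:8445/metrics" | sum_metric ts_inference_requests_total
```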
