Environment:TensorFlow Serving Kubernetes Deployment Environment

From Leeroopedia
Domains: Infrastructure, Kubernetes, Cloud
Last Updated: 2026-02-13 17:00 GMT

Overview

Kubernetes (GKE) deployment environment with Docker, Google Cloud SDK, and `kubectl` for orchestrating TensorFlow Serving at scale.

Description

This environment provides the tools and infrastructure for deploying TensorFlow Serving on Kubernetes, specifically Google Kubernetes Engine (GKE). It requires Docker for building serving images, the Google Cloud SDK (`gcloud`) for cluster management and container registry access, and `kubectl` for deploying Kubernetes resources. The deployment pattern uses a Deployment with 3 replicas behind a LoadBalancer Service, as demonstrated in the ResNet example.

Usage

Use this environment when deploying TensorFlow Serving to production Kubernetes clusters for scalable, load-balanced model serving. This is the prerequisite for the entire Kubernetes Deployment workflow, including image building, registry push, cluster creation, and resource deployment.

System Requirements

Category | Requirement | Notes
Container Runtime | Docker Engine | For building serving images locally
Cloud SDK | Google Cloud SDK (`gcloud`) | For GKE cluster management and Container Registry
Kubernetes CLI | `kubectl` | For deploying Kubernetes manifests
Container Registry | Google Container Registry (GCR) or equivalent | For storing serving images
Cluster | Kubernetes cluster (GKE recommended) | Example uses `--num-nodes=5`
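Once the tools above are installed, a quick sanity check confirms each CLI is on the PATH (a sketch; the flags shown are the standard version subcommands of each tool):

```shell
# Verify each CLI is installed and report its version
docker --version            # Docker Engine version string
gcloud --version            # Google Cloud SDK component versions
kubectl version --client    # client-side version only; no cluster connection needed
```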

Dependencies

CLI Tools

  • `docker` (for image building)
  • `gcloud` (Google Cloud SDK)
  • `kubectl` (Kubernetes CLI, installable via `gcloud components install kubectl`)

Kubernetes Resources

  • Deployment: TensorFlow Serving pods (default: 3 replicas)
  • Service: LoadBalancer type for external access
  • Container port: 8500 (gRPC)
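Deploying these resources typically follows the pattern below (a sketch, assuming the manifest file is named `resnet_k8s.yaml` as in the Code Evidence section):

```shell
# Apply the Deployment and Service manifests
kubectl apply -f resnet_k8s.yaml

# Watch the 3 replicas come up
kubectl get pods -l app=resnet-server

# Find the external IP assigned by the LoadBalancer (may take a minute or two)
kubectl get service resnet-service
```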

Credentials

The following credentials are required:

  • `GOOGLE_CLOUD_PROJECT`: GCP project ID for GKE cluster and Container Registry
  • GCP authentication via `gcloud auth login`
  • Docker authentication via `gcloud auth configure-docker`
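The authentication steps can be verified after running them; the commands below are standard `gcloud` subcommands:

```shell
# Confirm the active account and configured project
gcloud auth list
gcloud config get-value project

# Register gcloud as a Docker credential helper so `docker push` to gcr.io works
gcloud auth configure-docker
```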

Quick Install

# Install Google Cloud SDK
curl https://sdk.cloud.google.com | bash

# Authenticate
gcloud auth login
gcloud config set project YOUR_PROJECT_ID

# Install kubectl
gcloud components install kubectl

# Create GKE cluster
gcloud container clusters create serving-cluster --num-nodes 5

# Get cluster credentials
gcloud container clusters get-credentials serving-cluster
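After `get-credentials`, cluster access can be confirmed before deploying anything:

```shell
# Confirm kubectl is pointed at the new cluster and the 5 nodes are Ready
kubectl config current-context
kubectl get nodes
```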

Code Evidence

GKE cluster creation from `serving_kubernetes.md:168-218`:

gcloud container clusters create serving-cluster --num-nodes 5
gcloud container clusters get-credentials serving-cluster

Docker image push to GCR from `serving_kubernetes.md:220-243`:

docker tag $USER/resnet_serving gcr.io/YOUR_PROJECT/resnet_serving
docker push gcr.io/YOUR_PROJECT/resnet_serving

Kubernetes manifest from `resnet_k8s.yaml:1-49`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: resnet-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: resnet-server
  template:
    metadata:
      labels:
        app: resnet-server   # must match spec.selector.matchLabels
    spec:
      containers:
      - name: resnet-container
        image: gcr.io/YOUR_PROJECT/resnet_serving
        ports:
        - containerPort: 8500
---
apiVersion: v1
kind: Service
metadata:
  name: resnet-service
spec:
  type: LoadBalancer
  selector:
    app: resnet-server       # routes traffic to the Deployment's pods
  ports:
  - port: 8500
    targetPort: 8500
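Once the Service has been assigned an external IP, a TensorFlow Serving gRPC client can target port 8500. A sketch of locating the endpoint (the jsonpath expression assumes a standard LoadBalancer status; on some providers the address appears under `hostname` instead of `ip`):

```shell
# Extract the external IP of the LoadBalancer service
EXTERNAL_IP=$(kubectl get service resnet-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Point any TensorFlow Serving gRPC client at this endpoint
echo "gRPC endpoint: ${EXTERNAL_IP}:8500"
```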

Common Errors

Error Message | Cause | Solution
`ERROR: (gcloud.container.clusters.create) ... quota exceeded` | GCP quota insufficient | Request quota increase in GCP Console
`ImagePullBackOff` | Container image not found in registry | Verify `docker push` succeeded and image name matches manifest
`CrashLoopBackOff` | Model not found in container | Ensure model was copied into Docker image during build step
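The pod-level errors above (`ImagePullBackOff`, `CrashLoopBackOff`) are usually diagnosed with the standard kubectl inspection commands (`<pod-name>` is a placeholder for an actual pod name from `kubectl get pods`):

```shell
# Show events for a failing pod, including image pull errors and crash reasons
kubectl describe pod <pod-name>

# Read the serving container's logs; --previous shows the last crashed instance
kubectl logs <pod-name> --previous
```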

Compatibility Notes

  • GKE specific: The tutorial targets GKE but the Kubernetes manifests are portable to other providers (EKS, AKS) with appropriate registry and cluster setup changes.
  • Scaling: TensorFlow Serving performance is better on fewer, larger machines due to resource sharing efficiency and lower fixed costs (see performance.md).
  • GPU on K8s: For GPU serving on Kubernetes, install the NVIDIA device plugin and use GPU-enabled serving images.
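A minimal sketch of the GPU setup mentioned above (the device plugin version is illustrative; check the NVIDIA/k8s-device-plugin releases page for the current one):

```shell
# Install the NVIDIA device plugin as a DaemonSet (version is an assumption)
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml

# Then request a GPU in the serving container spec:
#   resources:
#     limits:
#       nvidia.com/gpu: 1
```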
