Principle:Tensorflow Tfjs Model Hosting

Knowledge Sources	TensorFlow.js, TensorFlow.js Deployment, MDN CORS
Domains	Deployment, Infrastructure
Implementation	Implementation:Tensorflow_Tfjs_Model_Hosting_Pattern
Type	Pattern Doc
Last Updated	2026-02-10 00:00 GMT

Overview

Model hosting makes converted model files accessible over HTTP so that browsers can load them. Deploying ML models for web inference requires serving model artifacts via standard web protocols, with proper configuration for cross-origin access, caching, and efficient delivery.

Theory

Browser-Based Model Loading Requirements

Browser-based ML inference with TensorFlow.js requires that model files are fetchable via HTTP(S). Unlike server-side frameworks where models can be loaded from the local filesystem, browser applications are constrained by the web security model and must retrieve model artifacts through network requests.

The fundamental requirements for serving TF.js models are:

HTTP(S) accessibility: The model.json manifest and all weight shard files must be accessible at known URLs
Co-location of artifacts: Weight shard files must be discoverable relative to the model.json URL (same base path)
CORS compliance: Cross-Origin Resource Sharing headers must be configured when the model is served from a different origin than the web application
Correct MIME types: The server should serve .json files as application/json and .bin files as application/octet-stream
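
As a minimal sketch of the MIME type requirement, a custom server could map file extensions to Content-Type values as shown here. The helper name contentTypeFor and the fallback for unknown extensions are assumptions made for illustration, not part of TensorFlow.js.

```javascript
// Hypothetical helper: maps model artifact extensions to Content-Type values.
const MODEL_MIME_TYPES = {
  '.json': 'application/json',
  '.bin': 'application/octet-stream',
};

function contentTypeFor(fileName) {
  const dot = fileName.lastIndexOf('.');
  const ext = dot === -1 ? '' : fileName.slice(dot).toLowerCase();
  // Assumption: unknown extensions fall back to a generic binary type.
  return MODEL_MIME_TYPES[ext] || 'application/octet-stream';
}

console.log(contentTypeFor('model.json'));
console.log(contentTypeFor('group1-shard1of3.bin'));
```

Most static file servers and cloud storage services already apply an equivalent mapping, so a helper like this is only relevant when responses are assembled by hand.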

The model.json Manifest and Weight Shards

When TensorFlow.js loads a model from a URL, the loading process is:

Fetch model.json: The loader makes an HTTP GET request to the provided URL
Parse weight manifest: The model.json file contains a weightsManifest array that lists weight shard filenames
Resolve shard URLs: Shard filenames are resolved relative to the model.json URL's base path
Fetch weight shards: All .bin shard files are fetched (potentially in parallel)
Deserialize weights: Binary data is deserialized into typed arrays and assigned to model variables

This means the directory structure on the server must preserve the file layout produced by the converter:

/models/my-model/
  model.json              # Manifest file
  group1-shard1of3.bin    # Weight shard 1
  group1-shard2of3.bin    # Weight shard 2
  group1-shard3of3.bin    # Weight shard 3
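
The shard resolution step can be approximated with the standard URL API. The sketch below is illustrative only: resolveShardUrls is a hypothetical helper, the example.com URL is a placeholder, and the actual TensorFlow.js loader may differ in details such as query string handling.

```javascript
// Hypothetical helper: lists the absolute URLs a loader would need to fetch,
// given the model.json URL and its parsed contents.
function resolveShardUrls(modelJsonUrl, modelJson) {
  const urls = [];
  for (const group of modelJson.weightsManifest || []) {
    for (const path of group.paths || []) {
      // Relative paths resolve against the directory that contains model.json.
      urls.push(new URL(path, modelJsonUrl).href);
    }
  }
  return urls;
}

// Example manifest listing the three shards from the layout above.
const exampleManifest = {
  weightsManifest: [
    {
      paths: ['group1-shard1of3.bin', 'group1-shard2of3.bin', 'group1-shard3of3.bin'],
      weights: [],
    },
  ],
};

const shardUrls = resolveShardUrls(
  'https://example.com/models/my-model/model.json',
  exampleManifest
);
console.log(shardUrls);
```

A list like this is useful for checking, before deployment, that every shard referenced by the manifest is actually reachable next to model.json.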

CORS Configuration

Cross-Origin Resource Sharing (CORS) is critical when the model files are served from a different origin than the web application, meaning a different scheme, host (including subdomain), or port. Without proper CORS headers, the browser will block the model loading requests.

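CORS headers (only Access-Control-Allow-Origin is needed for a plain cross-origin GET; the others apply to preflighted requests or older clients):

Header	Value	Purpose
Access-Control-Allow-Origin	* or specific origin	Required. Permits cross-origin requests from the web application
Access-Control-Allow-Methods	GET, HEAD, OPTIONS	Sent in preflight responses. Permits the HTTP methods used for model loading
Access-Control-Allow-Headers	Content-Type	Sent in preflight responses. Permits non-safelisted request headers; add any custom headers (e.g., Authorization) that the application passes to the TF.js loader
Access-Control-Expose-Headers	Content-Length	Optional. Content-Length is already CORS-safelisted in current browsers; listing it explicitly helps older clients read the response size for progress tracking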
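
When a specific origin is preferred over the wildcard, the header values in the table can be computed per request. The following is a minimal sketch, assuming a hypothetical allowlist of application origins; corsHeadersFor and the listed origins are illustrative names, not part of any library.

```javascript
// Hypothetical allowlist of web application origins; replace with real values.
const ALLOWED_ORIGINS = new Set(['https://app.example.com', 'http://localhost:8080']);

// Returns the CORS response headers for a request's Origin header value.
function corsHeadersFor(requestOrigin, allowedOrigins = ALLOWED_ORIGINS) {
  if (!requestOrigin || !allowedOrigins.has(requestOrigin)) {
    // No CORS headers: the browser will block cross-origin reads.
    return {};
  }
  return {
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Methods': 'GET, HEAD, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Expose-Headers': 'Content-Length',
    // Shared caches must key on Origin because the response varies per origin.
    'Vary': 'Origin',
  };
}

console.log(corsHeadersFor('https://app.example.com'));
console.log(corsHeadersFor('https://unknown.example.org'));
```

Echoing the request origin instead of sending * is the usual choice when requests carry credentials, since browsers reject the wildcard in that case.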

Hosting Options

Hosting Option	CORS Support	CDN	Cost	Best For
Google Cloud Storage (GCS)	Built-in configurable	Google CDN	Pay per use	Production, large models
Amazon S3 + CloudFront	Bucket CORS policy	CloudFront CDN	Pay per use	AWS-based deployments
Azure Blob Storage	Built-in configurable	Azure CDN	Pay per use	Azure-based deployments
GitHub Pages	Permissive by default	GitHub CDN	Free (public repos)	Open-source demos
Firebase Hosting	Configurable via firebase.json	Firebase CDN	Free tier available	Firebase-integrated apps
Bundled with application	Same-origin (no CORS needed)	Application CDN	Included	Small models, offline apps
Custom web server	Must configure manually	Optional	Self-hosted	Full control requirements

Caching Strategy

Model files are typically large and change infrequently, making them ideal candidates for aggressive caching:

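Weight shard files (.bin): These are immutable once generated. Use long cache TTLs (e.g., Cache-Control: public, max-age=31536000, immutable). Because the converter typically reuses shard filenames such as group1-shard1of3.bin across conversions, this is only safe when each model version is served from its own path
model.json: This changes when the model is updated. Use shorter cache TTLs or cache-busting techniques (e.g., versioned URLs like /models/v2/model.json), so that a cached shard is never paired with a newer manifest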
ETags: Enable ETag-based conditional requests for efficient cache revalidation
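
The caching policy above can be expressed as a small lookup by file type. This is a sketch under stated assumptions: cacheControlFor is a hypothetical helper, and the five-minute TTL for model.json and the no-cache default for other files are illustrative values that the source does not specify.

```javascript
// Hypothetical helper: picks a Cache-Control value for a model artifact path.
function cacheControlFor(urlPath) {
  if (urlPath.endsWith('.bin')) {
    // Weight shards are treated as immutable; assumes versioned URLs.
    return 'public, max-age=31536000, immutable';
  }
  if (urlPath.endsWith('model.json')) {
    // Assumption: a short TTL so that model updates propagate quickly.
    return 'public, max-age=300, must-revalidate';
  }
  // Assumption: anything else is revalidated on every use.
  return 'no-cache';
}

console.log(cacheControlFor('/models/v2/group1-shard1of3.bin'));
console.log(cacheControlFor('/models/v2/model.json'));
```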

Compression

Weight shard files (.bin) contain binary floating-point data that does not compress well with standard HTTP compression (gzip/brotli). However, model.json files can benefit significantly from compression due to their text-based JSON content.
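
The difference can be observed with a rough experiment in Node.js. The sketch below uses random float32 values as a stand-in for a weight shard and a synthetic, repetitive JSON document as a stand-in for model.json; both are assumptions, and real models (especially quantized ones) may compress differently.

```javascript
const zlib = require('node:zlib');

// Ratio of gzip-compressed size to original size (lower means better compression).
function gzipRatio(buffer) {
  return zlib.gzipSync(buffer).length / buffer.length;
}

// Stand-in for a weight shard: 100,000 float32 values in [-1, 1).
const weights = new Float32Array(100000);
for (let i = 0; i < weights.length; i++) {
  weights[i] = Math.random() * 2 - 1;
}
const shardBytes = Buffer.from(weights.buffer);

// Stand-in for model.json: many similar weight entries, as in a real manifest.
const entries = [];
for (let i = 0; i < 200; i++) {
  entries.push({ name: `dense_${i}/kernel`, shape: [128, 128], dtype: 'float32' });
}
const manifestBytes = Buffer.from(
  JSON.stringify({ weightsManifest: [{ paths: ['group1-shard1of1.bin'], weights: entries }] })
);

const shardRatio = gzipRatio(shardBytes);
const manifestRatio = gzipRatio(manifestBytes);
console.log(`shard gzip ratio:    ${shardRatio.toFixed(2)}`);
console.log(`manifest gzip ratio: ${manifestRatio.toFixed(2)}`);
```

The exact ratios vary from run to run because the shard data is random, but the manifest ratio should come out far lower than the shard ratio.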

Security Considerations

Model intellectual property: Serving models publicly exposes the model architecture and weights. Consider authentication if model IP protection is required
Model integrity: Use HTTPS to prevent man-in-the-middle attacks that could tamper with model weights
Access control: Use the requestInit option in TF.js model loading to pass authentication headers for private model endpoints
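
A minimal sketch of the requestInit approach follows. The helper buildLoadOptions, the bearer-token scheme, and the URL in the usage comment are assumptions for illustration; only the requestInit option itself comes from the text above. Note that a cross-origin request carrying an Authorization header triggers a CORS preflight, so the server must list Authorization in Access-Control-Allow-Headers.

```javascript
// Hypothetical helper: builds the options object passed to the TF.js loader.
function buildLoadOptions(token) {
  return {
    // requestInit is forwarded to fetch() for model.json and the weight shards.
    requestInit: {
      headers: { Authorization: `Bearer ${token}` },
    },
  };
}

// Usage, assuming `tf` is the TensorFlow.js library and the URL is a private endpoint:
//   const model = await tf.loadGraphModel(
//     'https://models.example.com/private/model.json',
//     buildLoadOptions(token)
//   );
const exampleOptions = buildLoadOptions('example-token');
console.log(exampleOptions.requestInit.headers.Authorization);
```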

Inputs and Outputs

Inputs

Converted model artifacts from the tensorflowjs_converter: model.json file and one or more .bin weight shard files
A web server or cloud storage service configured to serve static files

Outputs

A publicly accessible URL pointing to the model.json file (e.g., https://storage.googleapis.com/my-bucket/models/model.json)
All weight shard files accessible at the same base URL path
CORS headers configured to allow access from the web application's origin

Example Configurations

Google Cloud Storage

# Upload model files to GCS bucket
gsutil cp -r /path/to/converted_model/* gs://my-bucket/models/

# Set CORS configuration
gsutil cors set cors-config.json gs://my-bucket

# Make files publicly readable
gsutil iam ch allUsers:objectViewer gs://my-bucket
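
The gsutil cors set command above reads cors-config.json, whose contents are not shown. One way to produce it is sketched below, using the field names of the Cloud Storage CORS JSON format (origin, method, responseHeader, maxAgeSeconds). The origin, the one-hour max age, and the helper name buildGcsCorsConfig are placeholder assumptions.

```javascript
// Hypothetical helper: builds a Cloud Storage CORS configuration for the given origins.
function buildGcsCorsConfig(origins) {
  return [
    {
      origin: origins,
      method: ['GET', 'HEAD'],
      responseHeader: ['Content-Type', 'Content-Length'],
      // Assumption: browsers may cache preflight results for one hour.
      maxAgeSeconds: 3600,
    },
  ];
}

// Print the JSON so it can be redirected into cors-config.json.
const corsConfigJson = JSON.stringify(buildGcsCorsConfig(['https://app.example.com']), null, 2);
console.log(corsConfigJson);
```

Running this with Node.js and redirecting the output to cors-config.json yields a file suitable for the command above, after the placeholder origin is replaced with the application's real origin.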

Nginx Configuration

server {
    location /models/ {
        root /var/www/;
        add_header Access-Control-Allow-Origin *;
        add_header Access-Control-Allow-Methods "GET, HEAD, OPTIONS";
        add_header Cache-Control "public, max-age=86400";

        # Correct MIME types
        types {
            application/json json;
            application/octet-stream bin;
        }
    }
}

Express.js Static Serving

const express = require('express');
const cors = require('cors');
const app = express();

// Enable CORS for all routes
app.use(cors());

// Serve model files as static assets
app.use('/models', express.static('path/to/converted_model'));

app.listen(3000);
