Implementation:Tensorflow Tfjs Model Hosting Pattern

From Leeroopedia


Knowledge Sources
Domains: Deployment, Infrastructure
Principle: Principle:Tensorflow_Tfjs_Model_Hosting
Type: Pattern Doc
Last Updated: 2026-02-10 00:00 GMT

Overview

This implementation documents the concrete patterns for hosting converted TensorFlow.js model files (model.json + .bin weight shards) so they are accessible via HTTP(S) for browser-based and Node.js model loading. It covers multiple hosting platforms, CORS configuration, caching strategies, and the file-serving interface contract.

Interface

The hosting pattern exposes a simple HTTP file server interface:

  • GET <base_url>/model.json — Returns the model manifest (JSON)
  • GET <base_url>/group1-shard1ofN.bin — Returns weight shard data (binary)

All files must be co-located at the same base URL path, because TF.js resolves the shard paths listed in the weightsManifest section of model.json relative to the model.json URL.
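The relative resolution described above can be sketched in a few lines. This is an illustration only: resolveShardUrls is a hypothetical helper rather than a TF.js function, and it assumes the shard paths in weightsManifest are plain filenames, as the converter normally writes them.

```javascript
// Sketch: list the URLs a loader would request for a hosted model.
// resolveShardUrls is a hypothetical helper, not part of TF.js.
function resolveShardUrls(modelJsonUrl, modelJson) {
  const urls = [];
  for (const group of modelJson.weightsManifest || []) {
    for (const shardPath of group.paths) {
      // Resolve each shard filename against the model.json URL.
      urls.push(new URL(shardPath, modelJsonUrl).href);
    }
  }
  return urls;
}

// Minimal stand-in for a parsed model.json (only the field used here).
const exampleManifest = {
  weightsManifest: [
    { paths: ['group1-shard1of2.bin', 'group1-shard2of2.bin'], weights: [] }
  ]
};

const shardUrls = resolveShardUrls(
  'https://example.com/models/my-model/model.json',
  exampleManifest
);
console.log(shardUrls);
```

Every URL in the result shares the directory of model.json, which is why moving model.json without its shards breaks loading.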

Inputs and Outputs

Inputs

  • Converted model artifacts from tensorflowjs_converter:
    • model.json — The model manifest file
    • One or more .bin weight shard files (e.g., group1-shard1of3.bin, group1-shard2of3.bin, group1-shard3of3.bin)

Outputs

Required File Structure

The hosting directory must preserve the flat file layout produced by the converter:

# Typical converter output
/path/to/converted_model/
  model.json                  # Model manifest
  group1-shard1of3.bin        # Weight shard 1 (up to 4MB each by default)
  group1-shard2of3.bin        # Weight shard 2
  group1-shard3of3.bin        # Weight shard 3

When hosted, this becomes:

https://example.com/models/my-model/model.json
https://example.com/models/my-model/group1-shard1of3.bin
https://example.com/models/my-model/group1-shard2of3.bin
https://example.com/models/my-model/group1-shard3of3.bin
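Before uploading, it can help to confirm that every shard named in model.json is actually present in the directory. The sketch below assumes the file names have already been collected into an array; findMissingShards is a hypothetical helper name.

```javascript
// Sketch: report shards referenced by model.json that are absent from the
// set of files about to be uploaded. findMissingShards is hypothetical.
function findMissingShards(modelJson, uploadedFileNames) {
  const present = new Set(uploadedFileNames);
  const missing = [];
  for (const group of modelJson.weightsManifest || []) {
    for (const shardPath of group.paths) {
      if (!present.has(shardPath)) {
        missing.push(shardPath);
      }
    }
  }
  return missing;
}

// Stand-in for a parsed model.json that references three shards.
const converterManifest = {
  weightsManifest: [
    {
      paths: [
        'group1-shard1of3.bin',
        'group1-shard2of3.bin',
        'group1-shard3of3.bin'
      ],
      weights: []
    }
  ]
};

// The second shard is deliberately left out of the upload list.
const missingShards = findMissingShards(converterManifest, [
  'model.json',
  'group1-shard1of3.bin',
  'group1-shard3of3.bin'
]);
console.log(missingShards);
```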

CORS Configuration

Required Headers

Header | Value | Purpose
Access-Control-Allow-Origin | * or a specific origin (e.g., https://myapp.com) | Permits the browser to read the response from a different origin
Access-Control-Allow-Methods | GET, HEAD, OPTIONS | Permits the HTTP methods used by the TF.js fetch calls
Access-Control-Allow-Headers | Content-Type, Range | Permits request headers that may be sent
Access-Control-Expose-Headers | Content-Length, Content-Range | Allows the client to read response metadata for progress tracking
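As a rough illustration of how a server might derive these headers per request, the sketch below builds the header set from the table. corsHeadersFor and its allow-list argument are hypothetical; the Vary: Origin header is added here because a response that echoes a specific origin differs per requester and should not be reused by shared caches for other origins.

```javascript
// Sketch: build CORS response headers for a model-file request.
// corsHeadersFor is a hypothetical helper; allowedOrigins is either '*'
// or an array of exact origin strings.
function corsHeadersFor(requestOrigin, allowedOrigins) {
  const base = {
    'Access-Control-Allow-Methods': 'GET, HEAD, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type, Range',
    'Access-Control-Expose-Headers': 'Content-Length, Content-Range'
  };
  if (allowedOrigins === '*') {
    return { ...base, 'Access-Control-Allow-Origin': '*' };
  }
  if (requestOrigin && allowedOrigins.includes(requestOrigin)) {
    // Echo the matching origin and tell caches the response varies by it.
    return {
      ...base,
      'Access-Control-Allow-Origin': requestOrigin,
      'Vary': 'Origin'
    };
  }
  // Origin not on the allow-list: send no CORS grant.
  return { 'Vary': 'Origin' };
}

const allowedHeaders = corsHeadersFor('https://myapp.com', ['https://myapp.com']);
const deniedHeaders = corsHeadersFor('https://other.example', ['https://myapp.com']);
console.log(allowedHeaders, deniedHeaders);
```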

Hosting Platform Implementations

Google Cloud Storage (GCS)

# 1. Upload model files to a GCS bucket
gsutil cp -r /path/to/converted_model/* gs://my-bucket/models/v1/

# 2. Create a CORS configuration file
cat > cors-config.json << 'CORS_EOF'
[
  {
    "origin": ["*"],
    "method": ["GET", "HEAD"],
    "responseHeader": ["Content-Type", "Content-Length"],
    "maxAgeSeconds": 3600
  }
]
CORS_EOF

# 3. Apply the CORS configuration
gsutil cors set cors-config.json gs://my-bucket

# 4. Make files publicly readable
#    (this grants read access to every object in the bucket, not only the model files)
gsutil iam ch allUsers:objectViewer gs://my-bucket

# 5. The model URL is now:
# https://storage.googleapis.com/my-bucket/models/v1/model.json

Amazon S3 + CloudFront

# 1. Upload model files to S3
aws s3 sync /path/to/converted_model/ s3://my-bucket/models/v1/

# 2. Configure CORS on the S3 bucket (via AWS Console or CLI)
aws s3api put-bucket-cors --bucket my-bucket --cors-configuration '{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET", "HEAD"],
      "AllowedHeaders": ["*"],
      "ExposeHeaders": ["Content-Length"],
      "MaxAgeSeconds": 3600
    }
  ]
}'

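# 3. Make objects publicly accessible (or keep the bucket private and serve it
#    through CloudFront with origin access control, OAC, or the legacy OAI).
#    A public policy is rejected while S3 Block Public Access is enabled for the bucket.
aws s3api put-bucket-policy --bucket my-bucket --policy '{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/models/*"
  }]
}'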

# 4. Optionally create a CloudFront distribution for CDN caching
# The model URL is:
# https://my-bucket.s3.amazonaws.com/models/v1/model.json
# or via CloudFront: https://d123abc.cloudfront.net/models/v1/model.json

Firebase Hosting

# 1. Place model files in the Firebase public directory
mkdir -p public/models/v1
cp /path/to/converted_model/* public/models/v1/

# 2. Configure firebase.json for CORS and caching (contents of firebase.json follow)

{
  "hosting": {
    "public": "public",
    "headers": [
      {
        "source": "/models/**",
        "headers": [
          { "key": "Access-Control-Allow-Origin", "value": "*" },
          { "key": "Cache-Control", "value": "public, max-age=86400" }
        ]
      },
      {
        "source": "/models/**/*.bin",
        "headers": [
          { "key": "Access-Control-Allow-Origin", "value": "*" },
          { "key": "Cache-Control", "value": "public, max-age=31536000, immutable" }
        ]
      }
    ]
  }
}

# 3. Deploy
firebase deploy --only hosting

# The model URL is:
# https://my-project.web.app/models/v1/model.json

Nginx Web Server

# nginx.conf or site configuration (the map block belongs in the http context)

# Choose Cache-Control by file type: long-lived for immutable weight files,
# shorter for model.json (may be updated)
map $uri $model_cache_control {
    default    "public, max-age=3600";
    ~\.bin$    "public, max-age=31536000, immutable";
}

server {
    listen 443 ssl;
    server_name models.example.com;
    # ssl_certificate and ssl_certificate_key directives go here

    # A single location is used so that preflight handling and the CORS
    # headers apply to every model file; nested locations would not inherit them.
    location /models/ {
        root /var/www/;   # files are read from /var/www/models/

        # Handle preflight requests. add_header directives declared outside
        # this block are not inherited inside it, so the headers are repeated.
        if ($request_method = OPTIONS) {
            add_header Access-Control-Allow-Origin * always;
            add_header Access-Control-Allow-Methods "GET, HEAD, OPTIONS" always;
            add_header Access-Control-Allow-Headers "Content-Type, Range" always;
            add_header Access-Control-Max-Age 3600 always;
            return 204;
        }

        # CORS and caching headers for GET/HEAD responses
        add_header Access-Control-Allow-Origin * always;
        add_header Access-Control-Allow-Methods "GET, HEAD, OPTIONS" always;
        add_header Access-Control-Expose-Headers "Content-Length" always;
        add_header Cache-Control $model_cache_control;

        # MIME types
        types {
            application/json json;
            application/octet-stream bin;
        }
    }
}

Express.js Application Server

const express = require('express');
const cors = require('cors');
const path = require('path');

const app = express();

// Enable CORS for model routes
app.use('/models', cors({
  origin: '*',
  methods: ['GET', 'HEAD'],
  exposedHeaders: ['Content-Length']
}));

// Serve model files as static assets with caching
app.use('/models', express.static(path.join(__dirname, 'converted_models'), {
  maxAge: '1d',            // Default: 1 day cache
  immutable: false,
  setHeaders: (res, filePath) => {
    // Long cache for immutable weight files
    if (filePath.endsWith('.bin')) {
      res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
    }
  }
}));

app.listen(3000, () => {
  console.log('Model server running on http://localhost:3000');
  console.log('Model URL: http://localhost:3000/models/v1/model.json');
});

Bundled with Web Application

For small models, bundle the model files directly with the web application:

// If model files are in the application's public/static directory:
// public/models/model.json
// public/models/group1-shard1of1.bin

// Load from relative path (same origin, no CORS needed)
const model = await tf.loadGraphModel('/models/model.json');

// For Webpack/bundler environments, use file-loader or copy-webpack-plugin
// to include model files in the build output

Caching Strategy

File Type | Cache-Control Header | Rationale
model.json | public, max-age=3600 (1 hour) | May change when the model is updated; use versioned URLs for longer caching
.bin weight shards | public, max-age=31536000, immutable (1 year) | Weight data is immutable for a given model version; safe for aggressive caching
Versioned model.json (e.g., /models/v2/model.json) | public, max-age=31536000, immutable | When using versioned URL paths, all files can be cached indefinitely
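The table can be expressed as a small lookup that a custom server or upload script might use when setting headers. This is a sketch under one loud assumption: versioned paths are recognised by a segment such as /v1/ or /v2/, which is a convention chosen for this example, and cacheControlFor is a hypothetical helper.

```javascript
// Sketch: choose a Cache-Control value following the table above.
// Assumption: a versioned path contains a segment like /v1/ or /v2/.
const ONE_HOUR_SECONDS = 3600;
const ONE_YEAR_SECONDS = 31536000;

function cacheControlFor(urlPath) {
  const isWeightShard = urlPath.endsWith('.bin');
  const isVersionedPath = /\/v\d+\//.test(urlPath);
  if (isWeightShard || isVersionedPath) {
    // Content at this URL never changes, so it can be cached for a year.
    return `public, max-age=${ONE_YEAR_SECONDS}, immutable`;
  }
  // Unversioned model.json may be replaced in place; keep the cache short.
  return `public, max-age=${ONE_HOUR_SECONDS}`;
}

console.log(cacheControlFor('/models/latest/model.json'));
console.log(cacheControlFor('/models/latest/group1-shard1of1.bin'));
console.log(cacheControlFor('/models/v2/model.json'));
```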

Model Versioning Pattern

Use versioned URL paths to manage model updates without cache invalidation issues:

# Version 1
https://example.com/models/v1/model.json

# Version 2 (updated model, new URL)
https://example.com/models/v2/model.json


// Application code references the current version
const MODEL_VERSION = 'v2';
const model = await tf.loadGraphModel(
  `https://example.com/models/${MODEL_VERSION}/model.json`
);

Verification

After hosting, verify the model is accessible:

# Check model.json is accessible and has correct CORS headers.
# Send an Origin header: S3 and GCS return CORS headers only when the request includes one.
curl -I -H "Origin: https://myapp.com" https://example.com/models/v1/model.json
# Look for: Access-Control-Allow-Origin: * (or the origin you sent)
# Look for: Content-Type: application/json

# Check a weight shard is accessible (use a shard name listed in your model.json)
curl -I -H "Origin: https://myapp.com" https://example.com/models/v1/group1-shard1of1.bin
# Look for: Access-Control-Allow-Origin: * (or the origin you sent)
# Look for: Content-Type: application/octet-stream

# Verify model.json content
curl -s https://example.com/models/v1/model.json | python3 -m json.tool | head -20

// Verify from JavaScript (browser console or Node.js)
const model = await tf.loadGraphModel('https://example.com/models/v1/model.json');
console.log('Model loaded successfully');
console.log('Input shape:', model.inputs[0].shape);
console.log('Output shape:', model.outputs[0].shape);
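The header checks above can also be automated. The sketch below inspects an already-fetched set of response headers and reports anything that differs from the expectations listed in this section. checkModelFileHeaders is a hypothetical helper, and it assumes the headers have been collected into a plain object with lower-case names.

```javascript
// Sketch: compare response headers for a hosted model file against the
// expectations listed in the Verification section.
// checkModelFileHeaders is hypothetical; header names must be lower-case.
function checkModelFileHeaders(fileName, headers) {
  const problems = [];
  const expectedType = fileName.endsWith('.json')
    ? 'application/json'
    : 'application/octet-stream';
  const contentType = headers['content-type'] || '';
  // startsWith tolerates suffixes such as "; charset=utf-8".
  if (!contentType.startsWith(expectedType)) {
    problems.push(`Content-Type is "${contentType}", expected ${expectedType}`);
  }
  if (!headers['access-control-allow-origin']) {
    problems.push('Access-Control-Allow-Origin is missing');
  }
  return problems;
}

const goodResult = checkModelFileHeaders('model.json', {
  'content-type': 'application/json; charset=utf-8',
  'access-control-allow-origin': '*'
});
const badResult = checkModelFileHeaders('group1-shard1of1.bin', {
  'content-type': 'text/plain'
});
console.log(goodResult, badResult);
```

An empty array means the file passed both checks; each string in a non-empty result names one header to fix on the hosting platform.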
