Implementation:Tensorflow Tfjs Model Hosting Pattern
| Knowledge Sources | |
|---|---|
| Domains | Deployment, Infrastructure |
| Principle | Principle:Tensorflow_Tfjs_Model_Hosting |
| Type | Pattern Doc |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
This implementation documents the concrete patterns for hosting converted TensorFlow.js model files (model.json + .bin weight shards) so they are accessible via HTTP(S) for browser-based and Node.js model loading. It covers multiple hosting platforms, CORS configuration, caching strategies, and the file-serving interface contract.
Interface
The hosting pattern exposes a simple HTTP file server interface:
- GET <base_url>/model.json — Returns the model manifest (JSON)
- GET <base_url>/group1-shard1ofN.bin — Returns weight shard data (binary)
All files must be co-located at the same base URL path, as TF.js resolves shard filenames relative to the model.json URL.
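To make the co-location requirement concrete, here is a minimal sketch that derives the shard URLs a loader would request from a given model.json URL. It assumes the manifest lists shard filenames under weightsManifest[].paths, as converter output does, and it uses standard URL resolution as a stand-in for TF.js's internal path handling. The helper name resolveShardUrls and the sample manifest are hypothetical and not part of the TF.js API.

```javascript
// Hypothetical helper: list the shard URLs implied by a model.json manifest.
// Assumes shard filenames live under weightsManifest[].paths and are resolved
// relative to the model.json URL (approximated here with WHATWG URL rules).
function resolveShardUrls(modelJsonUrl, manifest) {
  const groups = manifest.weightsManifest || [];
  const paths = groups.flatMap((group) => group.paths || []);
  return paths.map((p) => new URL(p, modelJsonUrl).href);
}

// Made-up two-shard manifest for illustration.
const exampleManifest = {
  weightsManifest: [
    { paths: ['group1-shard1of2.bin', 'group1-shard2of2.bin'], weights: [] },
  ],
};

const shardUrls = resolveShardUrls(
  'https://example.com/models/my-model/model.json',
  exampleManifest
);
console.log(shardUrls);
```

If any of the resulting URLs would not be served by the host, loading fails even though model.json itself is reachable.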
Inputs and Outputs
Inputs
- Converted model artifacts from tensorflowjs_converter:
- model.json — The model manifest file
- One or more .bin weight shard files (e.g., group1-shard1of3.bin, group1-shard2of3.bin, group1-shard3of3.bin)
Outputs
- A publicly accessible URL to the model.json file (e.g., https://storage.googleapis.com/my-bucket/models/v1/model.json)
- All referenced weight shard files accessible at the same base URL
- Appropriate CORS headers configured for cross-origin browser access
Required File Structure
The hosting directory must preserve the flat file layout produced by the converter:
# Typical converter output
/path/to/converted_model/
    model.json              # Model manifest
    group1-shard1of3.bin    # Weight shard 1 (up to 4 MB each by default)
    group1-shard2of3.bin    # Weight shard 2
    group1-shard3of3.bin    # Weight shard 3
When hosted, this becomes:
https://example.com/models/my-model/model.json
https://example.com/models/my-model/group1-shard1of3.bin
https://example.com/models/my-model/group1-shard2of3.bin
https://example.com/models/my-model/group1-shard3of3.bin
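Before uploading, it can help to confirm that every shard named in model.json is actually present in the directory. The following is a sketch of such a check for Node.js. It assumes shard filenames are listed under weightsManifest[].paths. The function name findMissingShards and the temporary demo directory are illustrative only.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical pre-upload check: return shard files that model.json references
// but that are absent from the directory. Assumes weightsManifest[].paths.
function findMissingShards(modelDir) {
  const manifest = JSON.parse(
    fs.readFileSync(path.join(modelDir, 'model.json'), 'utf8')
  );
  const referenced = (manifest.weightsManifest || []).flatMap(
    (group) => group.paths || []
  );
  return referenced.filter((p) => !fs.existsSync(path.join(modelDir, p)));
}

// Demo on a throwaway directory that is deliberately missing one shard.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'tfjs-model-'));
fs.writeFileSync(
  path.join(demoDir, 'model.json'),
  JSON.stringify({
    weightsManifest: [
      { paths: ['group1-shard1of2.bin', 'group1-shard2of2.bin'], weights: [] },
    ],
  })
);
fs.writeFileSync(path.join(demoDir, 'group1-shard1of2.bin'), Buffer.alloc(4));

const missing = findMissingShards(demoDir);
console.log('Missing shards:', missing);
```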
CORS Configuration
Required Headers
| Header | Value | Purpose |
|---|---|---|
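| Access-Control-Allow-Origin | * or specific origin (e.g., https://myapp.com) | Permits the browser to read the response from a different origin; this is the only header strictly needed for a plain GET with no custom request headers |
| Access-Control-Allow-Methods | GET, HEAD, OPTIONS | Lists the methods allowed in a preflight response; a plain GET with no custom headers does not trigger a preflight |
| Access-Control-Allow-Headers | Content-Type, Range | Lists the request headers the server accepts in a preflighted request |
| Access-Control-Expose-Headers | Content-Length, Content-Range | Lets client code read Content-Range for progress tracking; Content-Length is already a CORS-safelisted response header, so listing it is optional |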
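As a rough illustration of how the Access-Control-Allow-Origin value is judged, the sketch below checks a set of response headers against a page origin. It covers only the basic match rule for requests without credentials. The function name corsProblems and the lower-cased header-object shape are assumptions made for this example.

```javascript
// Hypothetical checker: given response headers (lower-cased names) and the
// origin of the page loading the model, report why a browser would refuse to
// expose the response. Covers only non-credentialed requests.
function corsProblems(responseHeaders, pageOrigin) {
  const problems = [];
  const allowOrigin = responseHeaders['access-control-allow-origin'];
  if (allowOrigin === undefined) {
    problems.push('missing Access-Control-Allow-Origin');
  } else if (allowOrigin !== '*' && allowOrigin !== pageOrigin) {
    problems.push(`origin ${pageOrigin} not allowed (got ${allowOrigin})`);
  }
  return problems;
}

// Wildcard allows any origin.
const okResult = corsProblems(
  { 'access-control-allow-origin': '*' },
  'https://myapp.com'
);
// Header absent: the browser blocks the read.
const missingResult = corsProblems({}, 'https://myapp.com');
// Specific origin that does not match the page.
const mismatchResult = corsProblems(
  { 'access-control-allow-origin': 'https://other.example' },
  'https://myapp.com'
);
console.log(okResult, missingResult, mismatchResult);
```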
Hosting Platform Implementations
Google Cloud Storage (GCS)
# 1. Upload model files to a GCS bucket
gsutil cp -r /path/to/converted_model/* gs://my-bucket/models/v1/
# 2. Create a CORS configuration file
cat > cors-config.json << 'CORS_EOF'
[
  {
    "origin": ["*"],
    "method": ["GET", "HEAD"],
    "responseHeader": ["Content-Type", "Content-Length"],
    "maxAgeSeconds": 3600
  }
]
CORS_EOF
# 3. Apply the CORS configuration
gsutil cors set cors-config.json gs://my-bucket
# 4. Make files publicly readable
gsutil iam ch allUsers:objectViewer gs://my-bucket
# 5. The model URL is now:
# https://storage.googleapis.com/my-bucket/models/v1/model.json
Amazon S3 + CloudFront
# 1. Upload model files to S3
aws s3 sync /path/to/converted_model/ s3://my-bucket/models/v1/
# 2. Configure CORS on the S3 bucket (via AWS Console or CLI)
aws s3api put-bucket-cors --bucket my-bucket --cors-configuration '{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET", "HEAD"],
      "AllowedHeaders": ["*"],
      "ExposeHeaders": ["Content-Length"],
      "MaxAgeSeconds": 3600
    }
  ]
}'
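# 3. Make objects publicly readable. S3 Block Public Access must permit public
#    bucket policies for this call to succeed; alternatively keep the bucket
#    private and grant access only to CloudFront (e.g., via OAI)
aws s3api put-bucket-policy --bucket my-bucket --policy '{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/models/*"
  }]
}'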
# 4. Optionally create a CloudFront distribution for CDN caching
# The model URL is:
# https://my-bucket.s3.amazonaws.com/models/v1/model.json
# or via CloudFront: https://d123abc.cloudfront.net/models/v1/model.json
Firebase Hosting
# 1. Place model files in the Firebase public directory
mkdir -p public/models/v1
cp /path/to/converted_model/* public/models/v1/
# 2. Configure firebase.json for CORS and caching (file contents follow)
{
  "hosting": {
    "public": "public",
    "headers": [
      {
        "source": "/models/**",
        "headers": [
          { "key": "Access-Control-Allow-Origin", "value": "*" },
          { "key": "Cache-Control", "value": "public, max-age=86400" }
        ]
      },
      {
        "source": "/models/**/*.bin",
        "headers": [
          { "key": "Access-Control-Allow-Origin", "value": "*" },
          { "key": "Cache-Control", "value": "public, max-age=31536000, immutable" }
        ]
      }
    ]
  }
}
# 3. Deploy
firebase deploy --only hosting
# The model URL is:
# https://my-project.web.app/models/v1/model.json
Nginx Web Server
# nginx.conf or site configuration
server {
    listen 443 ssl;
    server_name models.example.com;
    # ssl_certificate and ssl_certificate_key must also be configured
    location /models/ {
        root /var/www/;  # files are read from /var/www/models/
        # CORS headers (used when no nested location below matches)
        add_header Access-Control-Allow-Origin * always;
        add_header Access-Control-Allow-Methods "GET, HEAD, OPTIONS" always;
        add_header Access-Control-Expose-Headers "Content-Length" always;
        # Handle preflight requests; a 204 response carries no body,
        # so no Content-Length header is added
        if ($request_method = OPTIONS) {
            add_header Access-Control-Allow-Origin *;
            add_header Access-Control-Allow-Methods "GET, HEAD, OPTIONS";
            add_header Access-Control-Max-Age 3600;
            return 204;
        }
        # add_header directives are not inherited by a block that defines
        # its own, so each nested location repeats the CORS header
        # Caching: long-lived for immutable weight files
        location ~ \.bin$ {
            add_header Access-Control-Allow-Origin * always;
            add_header Cache-Control "public, max-age=31536000, immutable";
        }
        # Caching: shorter for model.json (may be updated)
        location ~ \.json$ {
            add_header Access-Control-Allow-Origin * always;
            add_header Cache-Control "public, max-age=3600";
        }
        # MIME types
        types {
            application/json json;
            application/octet-stream bin;
        }
    }
}
Express.js Application Server
const express = require('express');
const cors = require('cors');
const path = require('path');

const app = express();

// Enable CORS for model routes
app.use('/models', cors({
  origin: '*',
  methods: ['GET', 'HEAD'],
  exposedHeaders: ['Content-Length']
}));

// Serve model files as static assets with caching
app.use('/models', express.static(path.join(__dirname, 'converted_models'), {
  maxAge: '1d',        // Default: 1 day cache
  immutable: false,
  setHeaders: (res, filePath) => {
    // Long cache for immutable weight files
    if (filePath.endsWith('.bin')) {
      res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
    }
  }
}));

app.listen(3000, () => {
  console.log('Model server running on http://localhost:3000');
  console.log('Model URL: http://localhost:3000/models/v1/model.json');
});
Bundled with Web Application
For small models, bundle the model files directly with the web application:
// If model files are in the application's public/static directory:
// public/models/model.json
// public/models/group1-shard1of1.bin
// Load from relative path (same origin, no CORS needed)
const model = await tf.loadGraphModel('/models/model.json');
// For Webpack/bundler environments, use file-loader or copy-webpack-plugin
// to include model files in the build output
Caching Strategy
| File Type | Cache-Control Header | Rationale |
|---|---|---|
| model.json | public, max-age=3600 (1 hour) | May change when model is updated; use versioned URLs for longer caching |
| .bin weight shards | public, max-age=31536000, immutable (1 year) | Weight data is immutable for a given model version; safe for aggressive caching |
| Versioned model.json (e.g., /models/v2/model.json) | public, max-age=31536000, immutable (1 year) | When using versioned URL paths, every file under a version is immutable, so model.json can use the same long-lived policy as the weight shards |
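The table above can be expressed as a small selection function, for example inside a custom static-file handler. This is a sketch only: the function name cacheControlFor and the /vN/ path convention used to detect versioned URLs are assumptions for illustration.

```javascript
// Hypothetical helper mirroring the caching table above.
const LONG_LIVED = 'public, max-age=31536000, immutable';
const SHORT_LIVED = 'public, max-age=3600';

function cacheControlFor(urlPath) {
  // Assumed convention: a /v<number>/ path segment marks a versioned model.
  const isVersioned = /\/v\d+\//.test(urlPath);
  if (urlPath.endsWith('.bin') || isVersioned) {
    return LONG_LIVED;
  }
  // Unversioned model.json may be replaced in place, so keep the cache short.
  return SHORT_LIVED;
}

console.log(cacheControlFor('/models/model.json'));
console.log(cacheControlFor('/models/v2/model.json'));
console.log(cacheControlFor('/models/group1-shard1of1.bin'));
```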
Model Versioning Pattern
Use versioned URL paths to manage model updates without cache invalidation issues:
# Version 1
https://example.com/models/v1/model.json
# Version 2 (updated model, new URL)
https://example.com/models/v2/model.json
// Application code references the current version
const MODEL_VERSION = 'v2';
const model = await tf.loadGraphModel(
`https://example.com/models/${MODEL_VERSION}/model.json`
);
Verification
After hosting, verify the model is accessible:
# Check model.json is accessible and has correct CORS headers
curl -I https://example.com/models/v1/model.json
# Look for: Access-Control-Allow-Origin: *
# Look for: Content-Type: application/json
# Check a weight shard is accessible
curl -I https://example.com/models/v1/group1-shard1of1.bin
# Look for: Access-Control-Allow-Origin: *
# Look for: Content-Type: application/octet-stream
# Verify model.json content
curl -s https://example.com/models/v1/model.json | python3 -m json.tool | head -20
// Verify from JavaScript (browser console or Node.js)
const model = await tf.loadGraphModel('https://example.com/models/v1/model.json');
console.log('Model loaded successfully');
console.log('Input shape:', model.inputs[0].shape);
console.log('Output shape:', model.outputs[0].shape);
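The manual checks above can also be scripted without loading the model into TF.js. The sketch below fetches model.json, then issues a HEAD request for each shard and reports any non-2xx response. It assumes shard names are listed under weightsManifest[].paths and that a fetch implementation is available (browsers or Node.js 18+). The name verifyHostedModel and the stubbed fetch used in the demo are hypothetical.

```javascript
// Hypothetical end-to-end check. fetchImpl defaults to the global fetch;
// a stub can be injected for offline testing, as in the demo below.
async function verifyHostedModel(modelJsonUrl, fetchImpl = fetch) {
  const manifestRes = await fetchImpl(modelJsonUrl);
  if (!manifestRes.ok) {
    return [`${modelJsonUrl}: HTTP ${manifestRes.status}`];
  }
  const manifest = await manifestRes.json();
  const paths = (manifest.weightsManifest || []).flatMap((g) => g.paths || []);
  const failures = [];
  for (const p of paths) {
    const shardUrl = new URL(p, modelJsonUrl).href;
    const res = await fetchImpl(shardUrl, { method: 'HEAD' });
    if (!res.ok) failures.push(`${shardUrl}: HTTP ${res.status}`);
  }
  return failures;
}

// Offline demo: a stub "server" where the second shard is missing.
const stubFetch = async (url) => {
  if (url.endsWith('model.json')) {
    return {
      ok: true,
      status: 200,
      json: async () => ({
        weightsManifest: [
          { paths: ['group1-shard1of2.bin', 'group1-shard2of2.bin'] },
        ],
      }),
    };
  }
  const found = url.endsWith('group1-shard1of2.bin');
  return { ok: found, status: found ? 200 : 404 };
};

const verification = verifyHostedModel(
  'https://example.com/models/v1/model.json',
  stubFetch
);
verification.then((failures) => console.log('Failures:', failures));
```

An empty result means every referenced file responded successfully. Because this runs outside a browser's CORS enforcement when used in Node.js, it complements rather than replaces the header checks done with curl.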
See Also
- Principle:Tensorflow_Tfjs_Model_Hosting — The principle this implementation fulfills
- Implementation:Tensorflow_Tfjs_Tensorflowjs_Converter_CLI — Previous step: converting the model
- Implementation:Tensorflow_Tfjs_Tf_LoadGraphModel — Next step: loading the hosted model in JavaScript