Workflow:Tensorflow Tfjs Pretrained Model Conversion And Inference

Knowledge Sources	TensorFlow.js, TensorFlow.js Converter, TensorFlow.js API
Domains	Model_Deployment, Model_Conversion, Browser_ML
Last Updated	2026-02-10 06:00 GMT

Overview

End-to-end process for converting a Python-trained TensorFlow or Keras model to TensorFlow.js format and running inference in the browser or Node.js.

Description

This workflow bridges the Python ML ecosystem with JavaScript deployment. It uses the tensorflowjs_converter Python tool to convert models from TensorFlow SavedModel, Keras HDF5, TF Hub module, or Flax/JAX formats into a web-oriented format: a model.json file plus binary weight files. The converted model is then loaded in JavaScript using either tf.loadGraphModel (for SavedModel/TF Hub conversions) or tf.loadLayersModel (for Keras conversions), and predictions run locally in the JavaScript runtime without a round-trip to an inference server.

Usage

Execute this workflow when you have a pre-existing model trained in Python (TensorFlow, Keras, or JAX) and need to deploy it for inference in a web browser, a React Native app, or a Node.js server. This is the standard path for bringing server-trained models to the client side.

Execution Steps

Step 1: Export the Python Model

Save the trained model from Python in one of the supported source formats. For TensorFlow 2.x, save as a SavedModel directory. For Keras, save as an HDF5 file or SavedModel. For TF Hub, identify the module URL. Ensure every operation the model uses, including any custom ones, is supported by TensorFlow.js.

Key considerations:

SavedModel format preserves the full computation graph and is the recommended source format
Keras HDF5 format preserves architecture and weights but may lose custom objects
Check the TensorFlow.js ops compatibility matrix for unsupported operations before converting

Step 2: Install the Converter

Install the tensorflowjs Python pip package, which provides the tensorflowjs_converter CLI tool and the optional tensorflowjs_wizard interactive tool. The converter requires a compatible Python environment with TensorFlow installed.

Key considerations:

The converter version should match the target TensorFlow.js runtime version
The interactive tensorflowjs_wizard guides through format selection and options

Step 3: Convert the Model

Run the converter CLI specifying the input format, output format, and paths. The converter reads the model graph and weights, applies optional transformations (quantization, weight sharding), and writes the output as a model.json topology file plus one or more .bin weight shard files.

Key considerations:

Use --input_format to specify the source format (tf_saved_model, keras, tf_hub, etc.)
Use --output_format to specify the target (tfjs_graph_model or tfjs_layers_model)
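--quantize_uint8 or --quantize_float16 stores weights in 1 or 2 bytes instead of 4, shrinking float32 weight files to roughly a quarter or a half of their original size at some cost in numerical precision
--weight_shard_size_bytes sets the maximum size of each weight shard file (4 MB by default), which affects how well the shards can be cached and downloaded in parallel over HTTP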
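
The flags above can be combined into a single converter invocation. The following is a minimal sketch, not an official wrapper: the helper buildConverterArgs and the ConvertOptions type are hypothetical names introduced for illustration, the paths are placeholders, and the "*" value passed to the quantization flag assumes the converter's wildcard syntax for "all weights".

```typescript
// Hypothetical helper: assembles an argument list for the tensorflowjs_converter CLI.
// Only flags named in this step are used; all paths are placeholders.
type SourceFormat = "tf_saved_model" | "keras" | "tf_hub";
type TargetFormat = "tfjs_graph_model" | "tfjs_layers_model";

interface ConvertOptions {
  inputFormat: SourceFormat;
  outputFormat: TargetFormat;
  inputPath: string; // SavedModel directory, HDF5 file, or TF Hub module URL
  outputDir: string; // directory that receives model.json and the .bin shards
  quantize?: "uint8" | "float16";
  weightShardSizeBytes?: number;
}

function buildConverterArgs(opts: ConvertOptions): string[] {
  const args: string[] = [
    `--input_format=${opts.inputFormat}`,
    `--output_format=${opts.outputFormat}`,
  ];
  if (opts.quantize !== undefined) {
    // Assumption: "*" selects every weight for quantization.
    args.push(`--quantize_${opts.quantize}=*`);
  }
  if (opts.weightShardSizeBytes !== undefined) {
    args.push(`--weight_shard_size_bytes=${opts.weightShardSizeBytes}`);
  }
  // Positional arguments go last: input path, then output directory.
  args.push(opts.inputPath, opts.outputDir);
  return args;
}

// Example: a SavedModel converted to a graph model with float16 weights.
const demoArgs = buildConverterArgs({
  inputFormat: "tf_saved_model",
  outputFormat: "tfjs_graph_model",
  inputPath: "./saved_model",
  outputDir: "./web_model",
  quantize: "float16",
});
console.log(["tensorflowjs_converter", ...demoArgs].join(" "));
```

Returning an array rather than a single string lets the arguments be handed to a process-spawning API without shell quoting; if the command is pasted into a shell instead, the "*" value should be quoted.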

Step 4: Host the Converted Model

Deploy the converted model files (model.json and .bin shards) to a web server, CDN, cloud storage bucket, or bundle them with the application. The files must be accessible via HTTP(S) with appropriate CORS headers for browser loading.

Key considerations:

All weight shard files must be co-located with model.json at the same base URL
Enable CORS headers if the model is served from a different origin than the web application
Consider using a CDN for low-latency model delivery to end users
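
To make the co-location requirement concrete, here is a sketch of how shard URLs can be derived from the model.json URL. It assumes the shard file names are listed, relative to model.json, under weightsManifest[].paths; the function name resolveShardUrls and the example host and file names are illustrative only. A list like this can be useful for checking that every shard is reachable after deployment.

```typescript
// Minimal view of the parts of model.json needed to locate the weight shards.
interface WeightsManifestGroup {
  paths: string[]; // shard file names, relative to model.json
}
interface ModelJsonManifest {
  weightsManifest: WeightsManifestGroup[];
}

// Hypothetical helper: resolves each shard path against the directory of model.json.
function resolveShardUrls(modelJsonUrl: string, model: ModelJsonManifest): string[] {
  // Drop any query string, then keep everything up to and including the last "/".
  const queryStart = modelJsonUrl.indexOf("?");
  const withoutQuery = queryStart === -1 ? modelJsonUrl : modelJsonUrl.slice(0, queryStart);
  const base = withoutQuery.slice(0, withoutQuery.lastIndexOf("/") + 1);

  const urls: string[] = [];
  for (const group of model.weightsManifest) {
    for (const path of group.paths) {
      urls.push(base + path);
    }
  }
  return urls;
}

// Example with placeholder names.
const exampleShardUrls = resolveShardUrls("https://example.com/models/demo/model.json", {
  weightsManifest: [{ paths: ["group1-shard1of2.bin", "group1-shard2of2.bin"] }],
});
console.log(exampleShardUrls);
```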

Step 5: Load the Model in JavaScript

Use tf.loadGraphModel for models converted from SavedModel or TF Hub (graph models), or tf.loadLayersModel for models converted from Keras (layers models). Pass the URL to the model.json file. The loader fetches the topology and weight shards, deserializes them, and constructs the in-memory model.

Key considerations:

Graph models support predict, plus execute and executeAsync, which can return specific named output nodes; executeAsync is required when the graph contains control-flow operations
Layers models expose the Keras-style Layers API, including predict, evaluate, and fit for fine-tuning
Loading can use custom IOHandler implementations for non-HTTP sources (IndexedDB, filesystem, etc.)
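
The choice between the two loaders follows directly from the output format chosen at conversion time. The sketch below illustrates that dispatch under stated assumptions: loadConverted, ModelLoaders, and ConvertedFormat are hypothetical names, and the loaders are passed in as a parameter so that the snippet stays self-contained and does not depend on the library being installed.

```typescript
// Structural type covering only the two loader functions named in this step.
interface ModelLoaders<G, L> {
  loadGraphModel(url: string): Promise<G>;
  loadLayersModel(url: string): Promise<L>;
}

// The value passed to --output_format during conversion.
type ConvertedFormat = "tfjs_graph_model" | "tfjs_layers_model";

// Hypothetical helper: picks the loader that matches the conversion output format.
async function loadConverted<G, L>(
  loaders: ModelLoaders<G, L>,
  format: ConvertedFormat,
  modelJsonUrl: string
): Promise<G | L> {
  if (format === "tfjs_graph_model") {
    // SavedModel and TF Hub conversions produce graph models.
    return loaders.loadGraphModel(modelJsonUrl);
  }
  // Keras conversions produce layers models.
  return loaders.loadLayersModel(modelJsonUrl);
}
```

In an application, the tf namespace imported from @tensorflow/tfjs would typically be passed as the first argument, since it provides both functions; a stub object with the same two methods works for unit tests.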

Step 6: Run Inference

Prepare input tensors matching the model's expected input signature, run prediction via predict (or executeAsync for graph models), and process the output tensors. Dispose of intermediate and output tensors after extracting results to prevent memory leaks.

Key considerations:

Input tensors must match the exact shape and dtype expected by the model
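Wrap synchronous tensor code in tf.tidy to dispose intermediate tensors automatically; a tensor returned from the callback is kept and must be disposed manually once its values have been read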
For image models, preprocess images to match the training pipeline (resize, normalize, channel order)
Graph models may have named inputs and outputs accessible via execute
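
As an illustration of the image preprocessing point, the sketch below converts RGBA pixel data (the layout a browser canvas returns) into a flat float array in height, width, channel order. It assumes a model trained on RGB inputs scaled to the range [0, 1]; a model trained with a different normalization or channel order would need different arithmetic. The function name rgbaToNormalizedRgb is hypothetical.

```typescript
// Hypothetical helper: drops the alpha channel and scales each RGB byte to [0, 1].
// Assumption: the model was trained on RGB inputs normalized by dividing by 255.
function rgbaToNormalizedRgb(
  rgba: Uint8ClampedArray,
  width: number,
  height: number
): Float32Array {
  if (rgba.length !== width * height * 4) {
    throw new Error("RGBA buffer length does not match width * height * 4");
  }
  const out = new Float32Array(width * height * 3);
  // i walks the RGBA input (4 values per pixel), j walks the RGB output (3 per pixel).
  for (let i = 0, j = 0; i < rgba.length; i += 4, j += 3) {
    out[j] = rgba[i] / 255;         // red
    out[j + 1] = rgba[i + 1] / 255; // green
    out[j + 2] = rgba[i + 2] / 255; // blue
  }
  return out;
}

// Example: a 2x1 image with one red pixel and one half-transparent green pixel.
const examplePixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 255, 0, 128]);
const exampleInput = rgbaToNormalizedRgb(examplePixels, 2, 1);
console.log(Array.from(exampleInput));
```

The resulting array can then be wrapped as a tensor of shape [1, height, width, 3], for example with tf.tensor4d, and passed to predict inside tf.tidy.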
