Workflow:Tensorflow Tfjs Pretrained Model Conversion And Inference

From Leeroopedia
Revision as of 11:05, 16 February 2026 by Admin (talk | contribs) (Auto-imported from workflows/Tensorflow_Tfjs_Pretrained_Model_Conversion_And_Inference.md)


Knowledge Sources
Domains: Model_Deployment, Model_Conversion, Browser_ML
Last Updated: 2026-02-10 06:00 GMT

Overview

End-to-end process for converting a Python-trained TensorFlow or Keras model to TensorFlow.js format and running inference in the browser or Node.js.

Description

This workflow bridges the Python ML ecosystem and JavaScript deployment. The tensorflowjs_converter command-line tool converts TensorFlow SavedModel, Keras HDF5, and TF Hub module sources into a web-optimized format: a JSON topology file plus binary weight files. Flax/JAX models are converted through the tensorflowjs Python API rather than the CLI. The converted model is then loaded in JavaScript with tf.loadGraphModel (for SavedModel, TF Hub, and JAX conversions) or tf.loadLayersModel (for Keras conversions), and inference runs locally in the browser or Node.js process, with no round-trip to a separate model server.

Usage

Execute this workflow when you have a pre-existing model trained in Python (TensorFlow, Keras, or JAX) and need to deploy it for inference in a web browser, a React Native app, or a Node.js server. This is the standard path for bringing server-trained models to the client side.

Execution Steps

Step 1: Export the Python Model

Save the trained model from Python in one of the supported source formats. For TensorFlow 2.x, save as a SavedModel directory. For Keras, save as an HDF5 file or SavedModel. For TF Hub, identify the module URL. Ensure all custom operations used in the model are supported by TensorFlow.js.

Key considerations:

  • SavedModel format preserves the full computation graph and is the recommended source format
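  • Keras HDF5 format preserves architecture and weights, but custom layers and other custom objects are not carried over and need equivalent implementations registered on the JavaScript side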
  • Check the TensorFlow.js ops compatibility matrix for unsupported operations before converting
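Before converting, it can help to confirm what kind of artifact was actually exported. The following is a minimal sketch, not part of the converter itself: the helper name guess_input_format is hypothetical, and it assumes that a SavedModel directory contains saved_model.pb and that an HDF5 file starts with the standard 8-byte HDF5 signature. A SavedModel written by Keras is also reported as tf_saved_model here.

```python
from pathlib import Path

# Standard HDF5 file signature, normally the first 8 bytes of the file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def guess_input_format(model_path: str) -> str:
    """Guess a value for the converter's --input_format flag (hypothetical helper)."""
    path = Path(model_path)
    if path.is_dir():
        # A SavedModel directory stores the serialized graph in saved_model.pb.
        if (path / "saved_model.pb").is_file():
            return "tf_saved_model"
        raise ValueError(f"{path} is a directory without saved_model.pb")
    if path.is_file():
        with path.open("rb") as f:
            if f.read(8) == HDF5_SIGNATURE:
                return "keras"
        raise ValueError(f"{path} does not look like an HDF5 file")
    raise FileNotFoundError(model_path)
```

The returned string is only a suggestion; the ops compatibility check mentioned above still has to be done separately.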

Step 2: Install the Converter

Install the tensorflowjs package with pip (pip install tensorflowjs). It provides the tensorflowjs_converter CLI tool and the optional interactive tensorflowjs_wizard tool. The converter needs a Python environment with a compatible TensorFlow version installed.

Key considerations:

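  • Keep the converter version close to the TensorFlow.js runtime version that will load the model; a model converted with a newer converter may rely on ops or format features that an older runtime does not support
  • The interactive tensorflowjs_wizard guides you through format selection and options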
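After installation, a quick check of the environment can catch a missing or shadowed install before a long conversion run. This is a small sketch using only the Python standard library; the function name converter_status is hypothetical, and it assumes the CLI tools are exposed on PATH under the names given in this step.

```python
import shutil
from importlib import metadata


def converter_status() -> dict:
    """Report whether the tensorflowjs package and its CLI tools are available."""
    try:
        version = metadata.version("tensorflowjs")
    except metadata.PackageNotFoundError:
        version = None  # package is not installed in this environment
    return {
        "tensorflowjs_version": version,
        # shutil.which returns None when the executable is not on PATH.
        "converter_path": shutil.which("tensorflowjs_converter"),
        "wizard_path": shutil.which("tensorflowjs_wizard"),
    }
```

A None value for tensorflowjs_version or converter_path suggests the package was installed into a different environment than the one currently active.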

Step 3: Convert the Model

Run the converter CLI specifying the input format, output format, and paths. The converter reads the model graph and weights, applies optional transformations (quantization, weight sharding), and writes the output as a model.json topology file plus one or more .bin weight shard files.

Key considerations:

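  • Use --input_format to specify the source format (tf_saved_model, keras, keras_saved_model, tf_hub, etc.)
  • Use --output_format to specify the target (tfjs_graph_model or tfjs_layers_model); SavedModel and TF Hub sources convert to tfjs_graph_model, while Keras sources typically convert to tfjs_layers_model
  • --quantize_uint8 or --quantize_float16 stores float32 weights in 1 or 2 bytes instead of 4, cutting weight size to roughly a quarter or a half, at some cost in accuracy that should be validated after conversion
  • --weight_shard_size_bytes sets the maximum size of each weight file (4 MB by default), so that browsers can fetch and cache shards efficiently over HTTP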
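The flags above can be assembled programmatically, which is convenient in build scripts. The sketch below only builds the argument list and does not run the converter; build_converter_command and its parameters are hypothetical names. It assumes the quantization flags may optionally take a weight-name pattern as a value, so the bare flag is placed before another option rather than directly before the input path; confirm this against tensorflowjs_converter --help for the installed version.

```python
from typing import List, Optional


def build_converter_command(
    input_path: str,
    output_dir: str,
    input_format: str = "tf_saved_model",
    output_format: str = "tfjs_graph_model",
    quantize: Optional[str] = None,          # "uint8", "float16", or None
    shard_size_bytes: Optional[int] = None,  # None keeps the converter default
) -> List[str]:
    """Build an argv list for tensorflowjs_converter (hypothetical helper)."""
    if quantize not in (None, "uint8", "float16"):
        raise ValueError(f"unsupported quantization: {quantize!r}")
    cmd = ["tensorflowjs_converter"]
    if quantize is not None:
        # Bare flag goes first so the next token is another option, not a path.
        cmd.append(f"--quantize_{quantize}")
    cmd.append(f"--input_format={input_format}")
    cmd.append(f"--output_format={output_format}")
    if shard_size_bytes is not None:
        cmd.append(f"--weight_shard_size_bytes={shard_size_bytes}")
    # Positional arguments: source model first, output directory second.
    cmd.extend([input_path, output_dir])
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True); the paths used in any call are placeholders for the actual export and output locations.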

Step 4: Host the Converted Model

Deploy the converted model files (model.json and .bin shards) to a web server, CDN, cloud storage bucket, or bundle them with the application. The files must be accessible via HTTP(S) with appropriate CORS headers for browser loading.

Key considerations:

  • By default, model.json references its weight shards by relative path, so keep the .bin files in the same directory (same base URL) as model.json
  • Enable CORS headers if the model is served from a different origin than the web application
  • Consider using a CDN for low-latency model delivery to end users
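For local testing, the converted files can be served with CORS enabled using only the Python standard library. This is a development sketch under stated assumptions, not a production setup: the names CORSRequestHandler and make_model_server are hypothetical, and the wildcard origin should be replaced with the real application origin on any public server.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds a CORS header to every response."""

    def end_headers(self):
        # Wildcard origin is for local testing only; restrict it in production.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_model_server(directory: str, port: int = 8080) -> ThreadingHTTPServer:
    """Create (but do not start) a server for the converted model directory."""
    handler = functools.partial(CORSRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_model_server("web_model").serve_forever() would expose the files at http://127.0.0.1:8080/, so model.json and its shards share one base URL; "web_model" stands in for the converter's output directory.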

Step 5: Load the Model in JavaScript

Use tf.loadGraphModel for models converted from SavedModel or TF Hub (graph models), or tf.loadLayersModel for models converted from Keras (layers models). Pass the URL to the model.json file. The loader fetches the topology and weight shards, deserializes them, and constructs the in-memory model.

Key considerations:

  • Graph models support predict, plus execute for requesting specific named outputs; models that contain control-flow ops must be run with executeAsync
  • Layers models expose the Keras-style Layers API, including predict, evaluate, and optionally fit for fine-tuning
  • Loading can use custom IOHandler implementations for non-HTTP sources (IndexedDB, filesystem, etc.)
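Which loader to call, and which files it will fetch, can be read off model.json ahead of time. The sketch below mirrors that lookup in Python for pre-deployment checks; it is not the TensorFlow.js loader. The function name describe_model is hypothetical, and it assumes model.json carries a top-level "format" field ("graph-model" or "layers-model") and a "weightsManifest" list whose entries hold relative shard "paths", as written by recent converter versions.

```python
import json
from typing import Dict, List
from urllib.parse import urljoin

# Value of the "format" field in model.json -> matching TensorFlow.js loader.
LOADER_BY_FORMAT = {
    "graph-model": "tf.loadGraphModel",
    "layers-model": "tf.loadLayersModel",
}


def describe_model(model_json_url: str, model_json_text: str) -> Dict[str, object]:
    """Summarize which loader applies and which shard URLs will be requested."""
    model = json.loads(model_json_text)
    fmt = model.get("format")
    # Shard paths are relative, so they resolve against the model.json URL.
    shard_urls: List[str] = [
        urljoin(model_json_url, shard_path)
        for group in model.get("weightsManifest", [])
        for shard_path in group.get("paths", [])
    ]
    return {
        "format": fmt,
        "loader": LOADER_BY_FORMAT.get(fmt),  # None if the format is unknown
        "shard_urls": shard_urls,
    }
```

Each URL in shard_urls should return the corresponding .bin file from the hosting location; a 404 on any of them would make loading fail in JavaScript.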

Step 6: Run Inference

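Prepare input tensors matching the model's expected input signature, run prediction via predict (or executeAsync for graph models that contain control-flow ops), and read the results from the output tensors. Dispose of input, intermediate, and output tensors once the results have been extracted: with the WebGL backend, tensor memory is not reclaimed by the JavaScript garbage collector, so undisposed tensors leak memory.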

Key considerations:

  • Input tensors must match the exact shape and dtype expected by the model
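  • Use tf.tidy to automatically clean up intermediate tensors created in a synchronous block; it cannot wrap async functions, so dispose of tensors manually around executeAsync and other awaited calls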
  • For image models, preprocess images to match the training pipeline (resize, normalize, channel order)
  • Graph models may have named inputs and outputs accessible via execute
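The shape-matching rule from the first consideration can be written down explicitly. The sketch below expresses it in Python as a reference for tests or for validating inputs before they are sent to the JavaScript side; the function name shape_matches is hypothetical, and it assumes that unknown dimensions (typically the batch size) are reported as -1 or as a null/None value.

```python
from typing import Optional, Sequence


def shape_matches(expected: Sequence[Optional[int]], actual: Sequence[int]) -> bool:
    """Return True if a concrete tensor shape satisfies a model input shape.

    Dimensions given as None or -1 in `expected` accept any size;
    every other dimension must match exactly, and the rank must be equal.
    """
    if len(expected) != len(actual):
        return False
    return all(e is None or e == -1 or e == a for e, a in zip(expected, actual))
```

For an image model declared as [-1, 224, 224, 3], a single image of shape [224, 224, 3] fails this check because the batch dimension is missing, which is a common cause of shape errors at inference time.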
