Implementation:Tensorflow_Tfjs_LayersModel_Predict
Overview
Tensorflow_Tfjs_LayersModel_Predict documents the TensorFlow.js API for generating predictions from a trained model. The predict() method executes the model's forward pass on new input data to produce output tensors, without computing loss or performing any weight updates.
Principle:Tensorflow_Tfjs_Model_Inference
Environment:Tensorflow_Tfjs_Browser_Runtime Environment:Tensorflow_Tfjs_Node_Native_Runtime Heuristic:Tensorflow_Tfjs_Memory_Management_With_Tidy Heuristic:Tensorflow_Tfjs_WebGL_Shader_Warmup Heuristic:Tensorflow_Tfjs_GPU_Pipeline_Data_Residency
Type: API Doc
External Dependencies: @tensorflow/tfjs (bundles @tensorflow/tfjs-core and @tensorflow/tfjs-layers)
API Signature
```ts
predict(
  x: Tensor | Tensor[],
  args?: ModelPredictArgs
): Tensor | Tensor[]
```
ModelPredictArgs
| Parameter | Type | Default | Description |
|---|---|---|---|
| batchSize | number | 32 | Number of samples per prediction batch. Controls how many inputs are processed in parallel during the forward pass. |
| verbose | boolean | false | Whether to show progress information during prediction. |
Code Reference
Source file: tfjs-layers/src/engine/training.ts (Lines 1105-1120)
The predict method delegates to the model's internal apply() mechanism, which executes the forward pass through the layer graph. The method processes the input in chunks of batchSize to manage memory consumption, especially important for large input tensors in browser environments. Each layer applies its learned weights in inference mode (dropout disabled, batch normalization uses running statistics).
Import
```js
import * as tf from '@tensorflow/tfjs';
```
I/O Contract
Inputs
| Input | Type | Description |
|---|---|---|
| Model | LayersModel | A trained model (compilation is not required for predict(), unlike evaluate()). |
| x | Tensor \| Tensor[] | Input data tensor(s). A single tensor for single-input models, or an array of tensors for multi-input models. The first dimension is the batch dimension. |
| args | ModelPredictArgs | Optional configuration for batch size and verbosity. |
Outputs
| Output | Type | Description |
|---|---|---|
| Single-output model | Tensor | A single tensor with shape matching the model's output layer. The first dimension corresponds to the batch dimension. |
| Multi-output model | Tensor[] | An array of tensors, one per model output head, each with its respective output shape. |
The output tensor shape depends on the model architecture:
| Model Type | Typical Output Shape | Example |
|---|---|---|
| Binary classifier | [batchSize, 1] | Sigmoid probability |
| Multi-class classifier | [batchSize, numClasses] | Softmax probabilities |
| Regression | [batchSize, outputDim] | Continuous values |
| Sequence model | [batchSize, seqLength, features] | Per-timestep predictions |
Usage Examples
Basic Classification Prediction
```js
const input = tf.tensor2d([[5.1, 3.5, 1.4, 0.2]]);
const prediction = model.predict(input);
prediction.print();

const probabilities = prediction.dataSync();
console.log('Predicted class:', probabilities.indexOf(Math.max(...probabilities)));

// Clean up
input.dispose();
prediction.dispose();
```
Batch Prediction
```js
// Predict on multiple samples at once
const batchInput = tf.tensor2d([
  [5.1, 3.5, 1.4, 0.2],
  [6.2, 2.9, 4.3, 1.3],
  [7.7, 3.0, 6.1, 2.3]
]);

const predictions = model.predict(batchInput, {batchSize: 2});
const results = predictions.arraySync();

results.forEach((probs, i) => {
  const classIdx = probs.indexOf(Math.max(...probs));
  console.log(`Sample ${i}: predicted class ${classIdx} (confidence: ${probs[classIdx].toFixed(4)})`);
});

batchInput.dispose();
predictions.dispose();
```
Using tf.tidy for Automatic Memory Management
```js
// tf.tidy automatically disposes intermediate tensors
const classIndex = tf.tidy(() => {
  const input = tf.tensor2d([[5.1, 3.5, 1.4, 0.2]]);
  const prediction = model.predict(input);
  return prediction.argMax(-1); // Returns the index of the max value
});

console.log('Predicted class:', classIndex.dataSync()[0]);
classIndex.dispose();
```
Regression Prediction
```js
// For a regression model (e.g., predicting house prices)
const features = tf.tensor2d([[1500, 3, 2, 2005]]); // sqft, beds, baths, year
const pricePrediction = model.predict(features);
console.log('Predicted price: $' + pricePrediction.dataSync()[0].toFixed(2));

features.dispose();
pricePrediction.dispose();
```
Multi-Input Model Prediction
```js
// For models with multiple input tensors
const imageInput = tf.randomNormal([1, 224, 224, 3]);
const metadataInput = tf.tensor2d([[25, 1, 0.8]]); // age, category, score

const output = model.predict([imageInput, metadataInput]);
console.log('Prediction shape:', output.shape);
output.print();

imageInput.dispose();
metadataInput.dispose();
output.dispose();
```
Top-K Predictions
```js
// Get the top 3 most likely classes
const input = tf.tensor2d([[5.1, 3.5, 1.4, 0.2]]);
const prediction = model.predict(input);
const {values, indices} = tf.topk(prediction, 3);

const topProbs = values.dataSync();
const topClasses = indices.dataSync();
for (let i = 0; i < 3; i++) {
  console.log(`Class ${topClasses[i]}: ${(topProbs[i] * 100).toFixed(2)}%`);
}

input.dispose();
prediction.dispose();
values.dispose();
indices.dispose();
```
Important Notes
- Unlike evaluate(), the predict() method does not require the model to be compiled. It only needs the model's layer graph and trained weights.
- The input tensor's first dimension is always the batch dimension. Even for a single prediction, the input must have a batch dimension (e.g., shape [1, 784], not [784]).
- Output tensors are not automatically disposed. Always call .dispose() on prediction results or wrap the prediction in tf.tidy() to prevent memory leaks.
- For large inputs, adjust batchSize to control memory usage. A smaller batch size uses less memory but may be slower.
- The first call to predict() may be slower due to shader compilation (WebGL backend) or graph optimization. Subsequent calls will be faster. Consider a warm-up call with dummy data if latency of the first real prediction is critical.
- For the highest throughput, use a batchSize that is a power of 2 (e.g., 32, 64, 128) to optimize GPU utilization.
Related Pages
- Principle:Tensorflow_Tfjs_Model_Inference -- The principle this implementation realizes
- Implementation:Tensorflow_Tfjs_LayersModel_Evaluate -- For computing loss and metrics in addition to predictions
- Implementation:Tensorflow_Tfjs_LayersModel_Save -- For persisting a model before or after inference workflows
Environments
- Environment:Tensorflow_Tfjs_Browser_Runtime -- Browser runtime (WebGL / WebGPU / WASM / CPU backends)
- Environment:Tensorflow_Tfjs_Node_Native_Runtime -- Node.js native runtime (TensorFlow C binding)
Heuristics
- Heuristic:Tensorflow_Tfjs_Memory_Management_With_Tidy -- Wrap predictions in tf.tidy() to prevent memory leaks
- Heuristic:Tensorflow_Tfjs_WebGL_Shader_Warmup -- Warm up WebGL shaders with a dummy predict call to avoid first-inference latency
- Heuristic:Tensorflow_Tfjs_GPU_Pipeline_Data_Residency -- Keep tensor data on GPU to avoid CPU round-trips