Implementation:Tensorflow_Tfjs_LayersModel_Predict
Overview
Tensorflow_Tfjs_LayersModel_Predict documents the TensorFlow.js API for generating predictions from a trained model. The predict() method executes the model's forward pass on new input data to produce output tensors, without computing loss or performing any weight updates.
Principle:Tensorflow_Tfjs_Model_Inference
Environment:Tensorflow_Tfjs_Browser_Runtime Environment:Tensorflow_Tfjs_Node_Native_Runtime Heuristic:Tensorflow_Tfjs_Memory_Management_With_Tidy Heuristic:Tensorflow_Tfjs_WebGL_Shader_Warmup Heuristic:Tensorflow_Tfjs_GPU_Pipeline_Data_Residency
Type: API Doc
External Dependencies: @tensorflow/tfjs (bundles @tensorflow/tfjs-core and @tensorflow/tfjs-layers)
API Signature
```ts
predict(
  x: Tensor | Tensor[],
  args?: ModelPredictArgs
): Tensor | Tensor[]
```
ModelPredictArgs
| Parameter | Type | Default | Description |
|---|---|---|---|
| batchSize | number | 32 | Number of samples per prediction batch. Controls how many inputs are processed in parallel during the forward pass. |
| verbose | boolean | false | Whether to show progress information during prediction. |
Code Reference
Source file: tfjs-layers/src/engine/training.ts (Lines 1105-1120)
The predict method delegates to the model's internal apply() mechanism, which executes the forward pass through the layer graph. The method processes the input in chunks of batchSize to manage memory consumption, especially important for large input tensors in browser environments. Each layer applies its learned weights in inference mode (dropout disabled, batch normalization uses running statistics).
Import
```js
import * as tf from '@tensorflow/tfjs';
```
I/O Contract
Inputs
| Input | Type | Description |
|---|---|---|
| Model | LayersModel | A trained model (compilation is not required for predict(), unlike evaluate()). |
| x | Tensor \| Tensor[] | Input data tensor(s). A single tensor for single-input models, or an array of tensors for multi-input models. The first dimension is the batch dimension. |
| args | ModelPredictArgs | Optional configuration for batch size and verbosity. |
Outputs
| Output | Type | Description |
|---|---|---|
| Single-output model | Tensor | A single tensor with shape matching the model's output layer. The first dimension corresponds to the batch dimension. |
| Multi-output model | Tensor[] | An array of tensors, one per model output head, each with its respective output shape. |
The output tensor shape depends on the model architecture:
| Model Type | Typical Output Shape | Example |
|---|---|---|
| Binary classifier | [batchSize, 1] | Sigmoid probability |
| Multi-class classifier | [batchSize, numClasses] | Softmax probabilities |
| Regression | [batchSize, outputDim] | Continuous values |
| Sequence model | [batchSize, seqLength, features] | Per-timestep predictions |
Usage Examples
Basic Classification Prediction
```js
const input = tf.tensor2d([[5.1, 3.5, 1.4, 0.2]]);
const prediction = model.predict(input);
prediction.print();

const probabilities = prediction.dataSync();
console.log('Predicted class:', probabilities.indexOf(Math.max(...probabilities)));

// Clean up
input.dispose();
prediction.dispose();
```
Batch Prediction
```js
// Predict on multiple samples at once
const batchInput = tf.tensor2d([
  [5.1, 3.5, 1.4, 0.2],
  [6.2, 2.9, 4.3, 1.3],
  [7.7, 3.0, 6.1, 2.3]
]);

const predictions = model.predict(batchInput, {batchSize: 2});
const results = predictions.arraySync();

results.forEach((probs, i) => {
  const classIdx = probs.indexOf(Math.max(...probs));
  console.log(`Sample ${i}: predicted class ${classIdx} (confidence: ${probs[classIdx].toFixed(4)})`);
});

batchInput.dispose();
predictions.dispose();
```
Using tf.tidy for Automatic Memory Management
```js
// tf.tidy automatically disposes intermediate tensors
const classIndex = tf.tidy(() => {
  const input = tf.tensor2d([[5.1, 3.5, 1.4, 0.2]]);
  const prediction = model.predict(input);
  return prediction.argMax(-1); // Returns the index of the max value
});

console.log('Predicted class:', classIndex.dataSync()[0]);
classIndex.dispose();
```
Regression Prediction
```js
// For a regression model (e.g., predicting house prices)
const features = tf.tensor2d([[1500, 3, 2, 2005]]); // sqft, beds, baths, year
const pricePrediction = model.predict(features);
console.log('Predicted price: $' + pricePrediction.dataSync()[0].toFixed(2));

features.dispose();
pricePrediction.dispose();
```
Multi-Input Model Prediction
```js
// For models with multiple input tensors
const imageInput = tf.randomNormal([1, 224, 224, 3]);
const metadataInput = tf.tensor2d([[25, 1, 0.8]]); // age, category, score

const output = model.predict([imageInput, metadataInput]);
console.log('Prediction shape:', output.shape);
output.print();

imageInput.dispose();
metadataInput.dispose();
output.dispose();
```
Top-K Predictions
```js
// Get the top 3 most likely classes
const input = tf.tensor2d([[5.1, 3.5, 1.4, 0.2]]);
const prediction = model.predict(input);
const {values, indices} = tf.topk(prediction, 3);

const topProbs = values.dataSync();
const topClasses = indices.dataSync();
for (let i = 0; i < 3; i++) {
  console.log(`Class ${topClasses[i]}: ${(topProbs[i] * 100).toFixed(2)}%`);
}

input.dispose();
prediction.dispose();
values.dispose();
indices.dispose();
```
Important Notes
- Unlike evaluate(), the predict() method does not require the model to be compiled. It only needs the model's layer graph and trained weights.
- The input tensor's first dimension is always the batch dimension. Even for a single prediction, the input must have a batch dimension (e.g., shape [1, 784], not [784]).
- Output tensors are not automatically disposed. Always call .dispose() on prediction results or wrap the prediction in tf.tidy() to prevent memory leaks.
- For large inputs, adjust batchSize to control memory usage. A smaller batch size uses less memory but may be slower.
- The first call to predict() may be slower due to shader compilation (WebGL backend) or graph optimization. Subsequent calls will be faster. Consider a warm-up call with dummy data if latency of the first real prediction is critical.
- For the highest throughput, use a batchSize that is a power of 2 (e.g., 32, 64, 128) to optimize GPU utilization.
Related Pages
- Principle:Tensorflow_Tfjs_Model_Inference -- The principle this implementation realizes
- Implementation:Tensorflow_Tfjs_LayersModel_Evaluate -- For computing loss and metrics in addition to predictions
- Implementation:Tensorflow_Tfjs_LayersModel_Save -- For persisting a model before or after inference workflows
Environments
- Environment:Tensorflow_Tfjs_Browser_Runtime -- Browser runtime (WebGL / WebGPU / WASM / CPU backends)
- Environment:Tensorflow_Tfjs_Node_Native_Runtime -- Node.js native runtime (TensorFlow C binding)
Heuristics
- Heuristic:Tensorflow_Tfjs_Memory_Management_With_Tidy -- Wrap predictions in tf.tidy() to prevent memory leaks
- Heuristic:Tensorflow_Tfjs_WebGL_Shader_Warmup -- Warm up WebGL shaders with a dummy predict call to avoid first-inference latency
- Heuristic:Tensorflow_Tfjs_GPU_Pipeline_Data_Residency -- Keep tensor data on GPU to avoid CPU round-trips