Principle:Microsoft Onnxruntime Nodejs Session Creation

From Leeroopedia


Field | Value
Principle Name | Nodejs_Session_Creation
Overview | Asynchronous creation of an inference session in Node.js from an ONNX model file or buffer.
Category | API Doc
Domains | ML_Inference, JavaScript_Integration
Source Repository | microsoft/onnxruntime
Last Updated | 2026-02-10

Overview

Asynchronous creation of an inference session in Node.js from an ONNX model file or buffer. InferenceSession.create() is the entry point for loading an ONNX model and preparing it for inference in the Node.js runtime environment.

Description

InferenceSession.create() is an async factory method that loads an ONNX model and prepares it for inference. It supports loading from three different source types:

  • File path (string): Loads the model directly from a file on disk.
  • Uint8Array buffer: Loads the model from a Node.js Buffer or Uint8Array containing the serialized model bytes.
  • ArrayBuffer with offset/length: Loads the model from a raw ArrayBuffer with explicit byte offset and byte length parameters, useful when the model is embedded within a larger buffer.
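The ArrayBuffer form is easiest to follow with a concrete buffer layout. The sketch below is plain TypeScript and does not call ONNX Runtime. The helpers packWithHeader and locateModel are hypothetical, and the 4-byte header is an invented container layout used only for illustration.

```typescript
// Hypothetical container layout: a 4-byte header followed by the model bytes.
const HEADER_BYTES = 4;

// Pack model bytes behind a header, as if the model were embedded in a larger file.
function packWithHeader(model: Uint8Array): ArrayBuffer {
  const container = new ArrayBuffer(HEADER_BYTES + model.byteLength);
  new Uint8Array(container).set(model, HEADER_BYTES); // header bytes stay zero here
  return container;
}

// Compute the (buffer, byteOffset, byteLength) triple that describes the model region.
function locateModel(container: ArrayBuffer): {
  buffer: ArrayBuffer;
  byteOffset: number;
  byteLength: number;
} {
  return {
    buffer: container,
    byteOffset: HEADER_BYTES,
    byteLength: container.byteLength - HEADER_BYTES,
  };
}

const fakeModel = new Uint8Array([10, 20, 30]); // stand-in bytes, not a real ONNX model
const region = locateModel(packWithHeader(fakeModel));

// A Uint8Array view over the same region sees exactly the model bytes, without copying.
const view = new Uint8Array(region.buffer, region.byteOffset, region.byteLength);
console.log(Array.from(view)); // [ 10, 20, 30 ]
```

With onnxruntime-node, the same triple would be passed as InferenceSession.create(region.buffer, region.byteOffset, region.byteLength), and the view would correspond to the Uint8Array form.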

Session options can be provided as an optional final argument (the second argument for the file path and Uint8Array forms) to configure runtime behavior. A frequently used option is intraOpNumThreads, which sets the number of threads used to parallelize work within an individual operator.
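As a rough illustration of choosing a value for intraOpNumThreads, the sketch below derives a thread count from a core count. The "leave one core free" policy, the helper optionsForCores, and the local SessionOptionsSketch type are assumptions made for this example; they are not recommendations from ONNX Runtime.

```typescript
// Local stand-in for the one option discussed here; the real options type has more fields.
interface SessionOptionsSketch {
  intraOpNumThreads?: number;
}

// Hypothetical policy: use all cores except a reserved few, but never fewer than one thread.
function optionsForCores(availableCores: number, reservedCores = 1): SessionOptionsSketch {
  const threads = Math.max(1, Math.floor(availableCores) - reservedCores);
  return { intraOpNumThreads: threads };
}

console.log(optionsForCores(8)); // { intraOpNumThreads: 7 }
console.log(optionsForCores(1)); // { intraOpNumThreads: 1 }
```

In a Node.js program the core count could come from os.cpus().length, and the resulting object would be passed as the options argument of InferenceSession.create().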

The factory method returns a Promise<InferenceSession> that resolves when the model has been fully loaded, parsed, and optimized. The resulting session object exposes the model's input and output metadata and provides the run() method for executing inference.

Theoretical Basis

The async creation pattern prevents blocking the Node.js event loop during model loading and graph optimization, which can be computationally expensive for large models. Node.js operates on a single-threaded event loop for JavaScript execution, so performing synchronous I/O or computation during model loading would block all other operations in the application.

By returning a Promise, the create() method allows the native ONNX Runtime to perform model loading, graph parsing, and graph optimization on background threads while the event loop remains responsive. This follows the established Node.js convention for async operations and integrates cleanly with async/await syntax.
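The effect of awaiting a Promise instead of blocking can be observed with a small simulation. The sketch below does not use ONNX Runtime and makes no claim about how onnxruntime-node schedules its work internally: simulatedLoad is a hypothetical stand-in that uses a timer in place of a real model load, and the interval counter shows that other callbacks keep running while the load is pending.

```typescript
// Stand-in for an expensive asynchronous load; a timer simulates work done elsewhere.
function simulatedLoad(durationMs: number): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => resolve("session-ready"), durationMs);
  });
}

async function loadWhileTicking(): Promise<{ result: string; ticks: number }> {
  let ticks = 0;
  // This interval can only fire if the event loop is free while the load is pending.
  const timer = setInterval(() => {
    ticks += 1;
  }, 5);
  const result = await simulatedLoad(60);
  clearInterval(timer);
  return { result, ticks };
}

loadWhileTicking().then(({ result, ticks }) => {
  console.log(result, "ticks while waiting:", ticks);
});
```

If the load were a synchronous call of the same duration, the interval callback would not run until the call returned, and the tick count would stay at zero.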

The factory method pattern (static create() rather than a constructor) is used because JavaScript constructors cannot be asynchronous. This is a common design pattern in JavaScript APIs that require async initialization.
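A minimal sketch of this async factory pattern follows. LoadedResource is a hypothetical class written for illustration, not part of ONNX Runtime; the awaited Promise.resolve() is a placeholder for real asynchronous initialization.

```typescript
class LoadedResource {
  // Private constructor: instances can only be obtained through create().
  private constructor(
    readonly name: string,
    readonly sizeBytes: number,
  ) {}

  // Static async factory: performs the asynchronous setup, then constructs the instance.
  static async create(name: string, bytes: Uint8Array): Promise<LoadedResource> {
    await Promise.resolve(); // placeholder for I/O, parsing, or optimization
    return new LoadedResource(name, bytes.byteLength);
  }
}

LoadedResource.create("demo", new Uint8Array(16)).then((resource) => {
  console.log(resource.name, resource.sizeBytes); // demo 16
});
```

Callers never see a half-initialized object: the Promise resolves only after setup has finished, which is the same guarantee InferenceSession.create() gives for a loaded session.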

Usage

Session creation requires the onnxruntime-node package to be installed. The created session is then used for all inference operations. The typical sequence is:

  1. Import the onnxruntime-node package.
  2. Optionally prepare session options such as intraOpNumThreads.
  3. Call InferenceSession.create() with a model path or buffer, passing the options if any.
  4. Await the returned Promise to get the initialized InferenceSession.
  5. Use the session for inference via session.run().
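The sequence above can be sketched as follows. So that the example runs without a model file or the native package, the factory is passed in as a parameter and a stub is used in its place; SessionLike, SessionFactoryLike, openSession, and stubFactory are hypothetical names for this sketch, and the interfaces cover only the members used here.

```typescript
// Minimal structural types for this sketch; the real onnxruntime-node types are richer.
interface SessionLike {
  readonly inputNames: readonly string[];
  readonly outputNames: readonly string[];
}

interface SessionFactoryLike {
  create(modelPath: string, options?: { intraOpNumThreads?: number }): Promise<SessionLike>;
}

// Call create() with a path and options, await the session, and report its metadata.
async function openSession(
  factory: SessionFactoryLike,
  modelPath: string,
  threads: number,
): Promise<SessionLike> {
  const session = await factory.create(modelPath, { intraOpNumThreads: threads });
  console.log("inputs:", session.inputNames, "outputs:", session.outputNames);
  return session;
}

// Stub factory standing in for the real one; it ignores its arguments.
const stubFactory: SessionFactoryLike = {
  async create() {
    return { inputNames: ["input"], outputNames: ["output"] };
  },
};

openSession(stubFactory, "./model.onnx", 2);
```

With the package installed, the InferenceSession export of onnxruntime-node should be usable in place of stubFactory, for example openSession(InferenceSession, "./model.onnx", 2); the model path is a placeholder. Inference would then go through session.run() on the real session object.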
