Principle:Microsoft Onnxruntime Nodejs Session Creation
| Field | Value |
|---|---|
| Principle Name | Nodejs_Session_Creation |
| Overview | Asynchronous creation of an inference session in Node.js from an ONNX model file or buffer. |
| Category | API Doc |
| Domains | ML_Inference, JavaScript_Integration |
| Source Repository | microsoft/onnxruntime |
| Last Updated | 2026-02-10 |
Overview
InferenceSession.create() is the entry point for loading an ONNX model, from either a file path or an in-memory buffer, and preparing it for inference in the Node.js runtime environment. The call is asynchronous and returns a Promise.
Description
InferenceSession.create() is an async factory method that loads an ONNX model and prepares it for inference. It supports loading from three different source types:
- File path (string): Loads the model directly from a file on disk.
- Uint8Array buffer: Loads the model from a Uint8Array containing the serialized model bytes. A Node.js Buffer qualifies, since Buffer is a subclass of Uint8Array.
- ArrayBuffer with offset/length: Loads the model from a raw ArrayBuffer with explicit byte offset and byte length parameters, useful when the model is embedded within a larger buffer.
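The three call shapes can be sketched as follows. This is a minimal sketch rather than reference usage: `arrayBufferArgs` and `createFromEachSource` are hypothetical helper names, and `ort` is assumed to be the module returned by `require('onnxruntime-node')`, passed in as a parameter.

```javascript
const fs = require('node:fs/promises');

// Hypothetical helper: returns [buffer, byteOffset, byteLength] for a typed-array
// view, so only the bytes the view covers are handed to the runtime.
function arrayBufferArgs(view) {
  return [view.buffer, view.byteOffset, view.byteLength];
}

// `ort` is assumed to be require('onnxruntime-node').
async function createFromEachSource(ort, modelPath) {
  // 1. File path: the runtime opens the file itself.
  const fromPath = await ort.InferenceSession.create(modelPath);

  // 2. Uint8Array: fs.readFile resolves to a Buffer, which is a Uint8Array subclass.
  const bytes = await fs.readFile(modelPath);
  const fromBytes = await ort.InferenceSession.create(bytes);

  // 3. ArrayBuffer with an explicit byte offset and byte length.
  const fromSlice = await ort.InferenceSession.create(...arrayBufferArgs(bytes));

  return { fromPath, fromBytes, fromSlice };
}
```

The offset and length matter because a typed-array view, including a Node.js Buffer, can cover only part of a larger underlying ArrayBuffer. Passing the underlying buffer alone could hand the runtime bytes that are not part of the model.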
Session options can be provided as an optional final argument to configure runtime behavior. A commonly used option is intraOpNumThreads, which controls the number of threads used for parallelism within individual operators.
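A minimal sketch of passing session options. `buildSessionOptions` and `createWithThreads` are hypothetical helpers, and clamping the thread count to the number of logical cores is an illustrative policy, not something the runtime requires. `ort` is again assumed to be `require('onnxruntime-node')`.

```javascript
const os = require('node:os');

// Hypothetical helper: clamps a requested thread count to [1, logical cores]
// and returns it as a session options object.
function buildSessionOptions(requestedThreads) {
  const cores = Math.max(1, os.cpus().length);
  const wanted = Number.isFinite(requestedThreads) ? Math.floor(requestedThreads) : 1;
  return { intraOpNumThreads: Math.min(Math.max(1, wanted), cores) };
}

// `ort` is assumed to be require('onnxruntime-node').
async function createWithThreads(ort, modelPath, requestedThreads) {
  // Options are passed as the argument after the model source.
  return ort.InferenceSession.create(modelPath, buildSessionOptions(requestedThreads));
}
```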
The factory method returns a Promise<InferenceSession> that resolves when the model has been fully loaded, parsed, and optimized. The resulting session object exposes the model's input and output metadata and provides the run() method for executing inference.
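The input metadata can be used to validate a feeds object before calling run(). The sketch below relies on the session's `inputNames` property and `run()` method; `missingInputs` and `runChecked` are hypothetical helper names introduced for illustration.

```javascript
// Hypothetical helper: lists the model inputs that `feeds` does not supply.
function missingInputs(session, feeds) {
  return session.inputNames.filter(
    (name) => !Object.prototype.hasOwnProperty.call(feeds, name)
  );
}

// Runs inference only if every declared model input has a value in `feeds`.
async function runChecked(session, feeds) {
  const missing = missingInputs(session, feeds);
  if (missing.length > 0) {
    throw new Error(`Missing model inputs: ${missing.join(', ')}`);
  }
  // Resolves to an object keyed by output name.
  return session.run(feeds);
}
```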
Theoretical Basis
The async creation pattern prevents blocking the Node.js event loop during model loading and graph optimization, which can be computationally expensive for large models. Node.js operates on a single-threaded event loop for JavaScript execution, so performing synchronous I/O or computation during model loading would block all other operations in the application.
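By returning a Promise, create() lets the calling code continue while the native ONNX Runtime performs model loading, graph parsing, and graph optimization, rather than waiting on a synchronous call. How much of that work runs off the main thread depends on the native binding, so responsiveness during large model loads is worth measuring rather than assuming. The Promise-based API follows the established Node.js convention for async operations and integrates cleanly with async/await syntax.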
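One way to check how responsive the event loop stays during creation is to count timer callbacks while the Promise is pending. The helper below, `countTicksWhile`, is a hypothetical name and works with any Promise-returning function. The tick count will vary with model size, machine, and package version, and a count of zero would suggest that the work occupied the main thread.

```javascript
// Counts how many interval callbacks the event loop services while the
// Promise returned by `loadFn` is pending.
async function countTicksWhile(loadFn, intervalMs = 5) {
  let ticks = 0;
  const timer = setInterval(() => { ticks += 1; }, intervalMs);
  try {
    const result = await loadFn();
    return { result, ticks };
  } finally {
    // Always stop the timer so it cannot keep the process alive.
    clearInterval(timer);
  }
}
```

With the package installed, a call might look like `countTicksWhile(() => ort.InferenceSession.create('./model.onnx'))`, where `ort` is `require('onnxruntime-node')` and the model path is a placeholder.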
The factory method pattern (static create() rather than a constructor) is used because JavaScript constructors cannot be asynchronous. This is a common design pattern in JavaScript APIs that require async initialization.
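The pattern can be illustrated with a generic class. `Resource` and `load` are hypothetical names used only for illustration and are not part of onnxruntime-node.

```javascript
class Resource {
  // Constructors run synchronously, so this one only stores data
  // that has already been loaded.
  constructor(data) {
    this.data = data;
  }

  // The static factory does the asynchronous work first,
  // then hands the finished data to the constructor.
  static async create(load) {
    const data = await load();
    return new Resource(data);
  }
}
```

Callers write `await Resource.create(...)`, mirroring `await InferenceSession.create(...)`, and never see a partially initialized instance.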
Usage
Session creation requires the onnxruntime-node package to be installed. Once created, the session is reused for all inference operations. The typical sequence is:
- Import the onnxruntime-node package.
- Call InferenceSession.create() with a model path or buffer.
- Optionally configure session options such as intraOpNumThreads.
- Await the returned Promise to get the initialized InferenceSession.
- Use the session for inference via session.run().
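The steps above can be sketched end to end as follows. This is a minimal sketch under stated assumptions: `runOnce` is a hypothetical helper, the model is assumed to take a single float32 input, and `ort` is assumed to be `require('onnxruntime-node')`, passed in as a parameter.

```javascript
// `ort` is assumed to be require('onnxruntime-node').
// Assumes the model has one float32 input whose shape matches `dims`.
async function runOnce(ort, modelPath, inputData, dims) {
  // Create the session; real code would create it once and reuse it.
  const session = await ort.InferenceSession.create(modelPath, {
    intraOpNumThreads: 1,
  });

  // Build the input tensor and key it by the model's first input name.
  const input = new ort.Tensor('float32', Float32Array.from(inputData), dims);
  const feeds = { [session.inputNames[0]]: input };

  // run() resolves to an object keyed by output name.
  const results = await session.run(feeds);
  return results[session.outputNames[0]];
}
```

A call might look like `runOnce(require('onnxruntime-node'), './model.onnx', [1, 2, 3, 4], [1, 4])`, where the path, data, and shape are placeholders that depend on the model.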