Implementation:Openai Openai node Beta Realtime Sessions

From Leeroopedia
Revision as of 13:35, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Openai_Openai_node_Beta_Realtime_Sessions.md)
Knowledge Sources
Domains: SDK, Realtime, Beta
Last Updated: 2026-02-15 12:00 GMT

Overview

The Beta Realtime Sessions resource class provides a method to create ephemeral API tokens for client-side Realtime API authentication, along with the full session configuration types.

Description

The Sessions class extends APIResource and exposes a single create method that posts to the /realtime/sessions endpoint. This method creates an ephemeral API token suitable for use in client-side (browser) applications with the Realtime API. It automatically injects the OpenAI-Beta: assistants=v2 header. The method accepts SessionCreateParams and returns a SessionCreateResponse.
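To make the description above concrete, the following is a minimal sketch of the request that create is described as sending. It is not the SDK's implementation: SessionRequestSketch and buildSessionRequest are hypothetical names introduced only for illustration, and the rule that caller-supplied headers are merged after the beta header is an assumption.

```typescript
// Illustrative only: a simplified model of the request described above.
// SessionRequestSketch and buildSessionRequest are hypothetical names and
// are not part of the openai package.
interface SessionRequestSketch {
  method: 'POST';
  path: string;
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

function buildSessionRequest(
  body: Record<string, unknown>,
  extraHeaders: Record<string, string> = {},
): SessionRequestSketch {
  return {
    method: 'POST',
    path: '/realtime/sessions',
    // The beta header is set first; letting caller headers follow (and
    // potentially override) it is an assumption of this sketch.
    headers: { 'OpenAI-Beta': 'assistants=v2', ...extraHeaders },
    body,
  };
}

const sketchRequest = buildSessionRequest(
  { model: 'gpt-4o-realtime-preview' },
  { 'X-Request-Source': 'docs-example' },
);
console.log(sketchRequest.method, sketchRequest.path, sketchRequest.headers);
```

In real code the SDK builds and sends this request itself, so callers only pass the body and optional request options to create.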

The Session interface defines the Realtime session configuration object, including audio format settings (input_audio_format, output_audio_format), input audio noise reduction, input audio transcription settings, system instructions, model selection (e.g., gpt-4o-realtime-preview), modalities (text and/or audio), voice selection (alloy, ash, ballad, coral, echo, sage, shimmer, verse), temperature, tool configuration, tracing options, and turn detection configuration supporting both Server VAD and Semantic VAD modes.
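The two turn detection modes can be pictured as a tagged union keyed on type. The sketch below uses locally defined, illustrative types (ServerVadSketch, SemanticVadSketch, TurnDetectionSketch); the SDK's own Session.TurnDetection type may be shaped differently, and the threshold and silence_duration_ms fields are assumptions rather than something this page documents.

```typescript
// Local, illustrative types only. Fields other than `type` and `eagerness`
// are assumptions and may not match the SDK's Session.TurnDetection.
type ServerVadSketch = {
  type: 'server_vad';
  threshold?: number;
  silence_duration_ms?: number;
};

type SemanticVadSketch = {
  type: 'semantic_vad';
  eagerness?: 'low' | 'medium' | 'high' | 'auto';
};

type TurnDetectionSketch = ServerVadSketch | SemanticVadSketch;

// Narrowing on `type` gives access to the mode-specific fields.
function describeTurnDetection(td: TurnDetectionSketch): string {
  switch (td.type) {
    case 'server_vad':
      return `server_vad (threshold=${td.threshold ?? 'default'})`;
    case 'semantic_vad':
      return `semantic_vad (eagerness=${td.eagerness ?? 'default'})`;
  }
}

console.log(describeTurnDetection({ type: 'semantic_vad', eagerness: 'medium' }));
console.log(describeTurnDetection({ type: 'server_vad', threshold: 0.5 }));
```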

The SessionCreateResponse interface extends the session configuration with a client_secret object containing an ephemeral key (value) and its expiration timestamp (expires_at). The SessionCreateParams interface mirrors the session configuration and adds a client_secret parameter for customizing token expiration (between 10 and 7200 seconds).
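Because the expiration must fall between 10 and 7200 seconds, it can help to clamp a requested lifetime before building the client_secret parameter. The helper below is a sketch: buildClientSecretParam is a hypothetical name, and the nested expires_after shape with an anchor of 'created_at' is an assumption that should be checked against the SDK's SessionCreateParams.ClientSecret type.

```typescript
// Bounds taken from the text above (10 to 7200 seconds).
const MIN_TTL_SECONDS = 10;
const MAX_TTL_SECONDS = 7200;

// Hypothetical helper. The `expires_after` / `anchor` field names are an
// assumption about the parameter shape, not confirmed by this page.
function buildClientSecretParam(requestedSeconds: number) {
  if (!Number.isFinite(requestedSeconds)) {
    throw new RangeError('requestedSeconds must be a finite number');
  }
  // Round to whole seconds, then clamp into the allowed range.
  const seconds = Math.min(
    MAX_TTL_SECONDS,
    Math.max(MIN_TTL_SECONDS, Math.round(requestedSeconds)),
  );
  return { expires_after: { anchor: 'created_at' as const, seconds } };
}

console.log(buildClientSecretParam(600));
```

Clamping silently is a design choice made for this sketch; an application that wants strict input handling could reject out-of-range values instead.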

Usage

Use this resource to create ephemeral tokens for client-side Realtime API connections. Access it via client.beta.realtime.sessions. Call create from a trusted server, then pass only the returned client_secret.value to the browser client, which uses it to authenticate its WebSocket connection until the key expires at client_secret.expires_at.
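As a rough illustration of the browser side, the sketch below builds the arguments for a WebSocket connection from an ephemeral key. Everything specific here is an assumption rather than something this page states: the wss://api.openai.com/v1/realtime URL, the model query parameter, and the subprotocol strings follow the pattern commonly shown for browser connections and should be verified against the current Realtime API documentation. buildRealtimeSocketArgs is a hypothetical helper name.

```typescript
// Hypothetical helper: the endpoint URL and subprotocol names below are
// assumptions and should be verified against current Realtime API docs.
function buildRealtimeSocketArgs(
  model: string,
  ephemeralKey: string,
): { url: string; protocols: string[] } {
  const url = new URL('wss://api.openai.com/v1/realtime');
  url.searchParams.set('model', model);
  return {
    url: url.toString(),
    protocols: [
      'realtime',
      // The ephemeral key travels in a subprotocol because browsers
      // cannot set custom headers on WebSocket connections.
      `openai-insecure-api-key.${ephemeralKey}`,
      'openai-beta.realtime-v1',
    ],
  };
}

const socketArgs = buildRealtimeSocketArgs('gpt-4o-realtime-preview', 'ek_example');
console.log(socketArgs.url);
// In a browser: new WebSocket(socketArgs.url, socketArgs.protocols)
```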

Code Reference

Source Location

Signature

export class Sessions extends APIResource {
  create(body: SessionCreateParams, options?: RequestOptions): APIPromise<SessionCreateResponse>;
}

export interface Session {
  id?: string;
  input_audio_format?: 'pcm16' | 'g711_ulaw' | 'g711_alaw';
  input_audio_noise_reduction?: Session.InputAudioNoiseReduction;
  input_audio_transcription?: Session.InputAudioTranscription;
  instructions?: string;
  max_response_output_tokens?: number | 'inf';
  modalities?: Array<'text' | 'audio'>;
  model?: string;
  output_audio_format?: 'pcm16' | 'g711_ulaw' | 'g711_alaw';
  speed?: number;
  temperature?: number;
  tool_choice?: string;
  tools?: Array<Session.Tool>;
  tracing?: 'auto' | Session.TracingConfiguration;
  turn_detection?: Session.TurnDetection;
  voice?: string | 'alloy' | 'ash' | 'ballad' | 'coral' | 'echo' | 'sage' | 'shimmer' | 'verse';
}

export interface SessionCreateResponse {
  client_secret: SessionCreateResponse.ClientSecret;
  // ... session configuration fields
}

Import

import OpenAI from 'openai';
// Access via client.beta.realtime.sessions

I/O Contract

Inputs

Name Type Required Description
model string No The Realtime model (e.g., 'gpt-4o-realtime-preview')
modalities Array<'text' | 'audio'> No Modalities the model can respond with
voice string No Voice for audio responses (alloy, ash, ballad, coral, echo, sage, shimmer, verse)
instructions string No Default system instructions prepended to model calls
input_audio_format 'pcm16' | 'g711_ulaw' | 'g711_alaw' No Format of input audio
output_audio_format 'pcm16' | 'g711_ulaw' | 'g711_alaw' No Format of output audio
input_audio_transcription InputAudioTranscription No Transcription configuration (model, language, prompt)
input_audio_noise_reduction InputAudioNoiseReduction No Noise reduction type (near_field or far_field)
turn_detection TurnDetection No Server VAD or Semantic VAD configuration
tools Array<Tool> No Function tools available to the model
tool_choice string No How the model chooses tools (auto, none, required)
temperature number No Sampling temperature (0.6 to 1.2)
speed number No Spoken response speed (0.25 to 1.5)
max_response_output_tokens number | 'inf' No Maximum output tokens per response
tracing 'auto' | TracingConfiguration No Tracing configuration for the session
client_secret ClientSecret No Configuration for ephemeral token expiration
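
The numeric ranges in the table above can be checked before a request is sent. This is a minimal sketch, assuming the stated ranges are inclusive; RangeCheckedParams and findRangeErrors are hypothetical names, and the SDK itself is not known to perform this validation on the client.

```typescript
// Hypothetical pre-flight check for the documented numeric ranges.
// Treating both bounds as inclusive is an assumption.
interface RangeCheckedParams {
  temperature?: number;
  speed?: number;
}

function findRangeErrors(params: RangeCheckedParams): string[] {
  const errors: string[] = [];
  const { temperature, speed } = params;
  if (temperature !== undefined && (temperature < 0.6 || temperature > 1.2)) {
    errors.push(`temperature ${temperature} is outside 0.6 to 1.2`);
  }
  if (speed !== undefined && (speed < 0.25 || speed > 1.5)) {
    errors.push(`speed ${speed} is outside 0.25 to 1.5`);
  }
  return errors;
}

console.log(findRangeErrors({ temperature: 0.8, speed: 1.0 }));
console.log(findRangeErrors({ temperature: 2.0, speed: 3.0 }));
```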

Outputs

Name Type Description
SessionCreateResponse SessionCreateResponse Session configuration plus ephemeral client_secret with value and expires_at
client_secret.value string Ephemeral API key for client-side authentication
client_secret.expires_at number Unix timestamp when the ephemeral key expires
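
Since the key is short-lived, a client may want to check it before connecting. The sketch below assumes expires_at is expressed in seconds since the Unix epoch; ClientSecretSketch and isSecretUsable are hypothetical names, and the 5-second safety margin is an arbitrary illustrative default.

```typescript
// Local illustrative type mirroring the two output fields listed above.
interface ClientSecretSketch {
  value: string;
  expires_at: number; // assumed to be seconds since the Unix epoch
}

// Returns true while the key still has more than `safetyMarginSeconds`
// of lifetime left. `nowMs` is injectable to keep the function testable.
function isSecretUsable(
  secret: ClientSecretSketch,
  nowMs: number = Date.now(),
  safetyMarginSeconds = 5,
): boolean {
  const nowSeconds = Math.floor(nowMs / 1000);
  return nowSeconds + safetyMarginSeconds < secret.expires_at;
}

const exampleSecret: ClientSecretSketch = { value: 'ek_example', expires_at: 1000 };
console.log(isSecretUsable(exampleSecret, 900_000));
console.log(isSecretUsable(exampleSecret, 999_000));
```

When the check fails, the client would request a fresh session from the server rather than reuse the old key.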

Usage Examples

Basic Usage

import OpenAI from 'openai';

const client = new OpenAI();

// Create a realtime session with ephemeral token
const session = await client.beta.realtime.sessions.create({
  model: 'gpt-4o-realtime-preview',
  voice: 'alloy',
  modalities: ['text', 'audio'],
  instructions: 'You are a friendly assistant.',
  input_audio_format: 'pcm16',
  output_audio_format: 'pcm16',
  turn_detection: {
    type: 'semantic_vad',
    eagerness: 'medium',
  },
});

// Use the ephemeral key for client-side WebSocket connection
console.log(session.client_secret.value);
console.log(session.client_secret.expires_at);
