Implementation:Openai Openai node Beta Realtime Sessions
| Knowledge Sources | |
|---|---|
| Domains | SDK, Realtime, Beta |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
The Beta Realtime Sessions resource class provides a method to create ephemeral API tokens for client-side Realtime API authentication, along with the full session configuration types.
Description
The Sessions class extends APIResource and exposes a single create method that posts to the /realtime/sessions endpoint. This method creates an ephemeral API token suitable for use in client-side (browser) applications with the Realtime API. It automatically injects the OpenAI-Beta: assistants=v2 header. The method accepts SessionCreateParams and returns a SessionCreateResponse.
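To make the request concrete, the following sketch builds by hand the HTTP request that a call to create is expected to produce. The helper name buildSessionRequest, the base URL https://api.openai.com/v1, and the Bearer authorization scheme are assumptions for illustration and are not taken from the SDK source; only the endpoint path and the OpenAI-Beta header come from the description above.

```typescript
// Hypothetical helper (not part of the SDK): describes the HTTP request
// that sessions.create() is expected to send.
interface SessionRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

function buildSessionRequest(
  apiKey: string,
  params: Record<string, unknown>,
  baseURL: string = 'https://api.openai.com/v1', // assumed default base URL
): SessionRequest {
  return {
    url: `${baseURL}/realtime/sessions`, // endpoint named in the description
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      'Content-Type': 'application/json',
      'OpenAI-Beta': 'assistants=v2', // header the create method injects
    },
    body: JSON.stringify(params),
  };
}

const sessionRequest = buildSessionRequest('sk-example', {
  model: 'gpt-4o-realtime-preview',
});
console.log(sessionRequest.method, sessionRequest.url);
```

In practice the SDK handles all of this; the sketch is only useful for debugging with a proxy or for reproducing the call without the SDK.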
The Session interface defines the Realtime session configuration object, including audio format settings (input_audio_format, output_audio_format), input audio noise reduction, input audio transcription settings, system instructions, model selection (e.g., gpt-4o-realtime-preview), modalities (text and/or audio), voice selection (alloy, ash, ballad, coral, echo, sage, shimmer, verse), temperature, tool configuration, tracing options, and turn detection configuration supporting both Server VAD and Semantic VAD modes.
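The two turn detection modes can be modelled as a discriminated union. The sketch below uses simplified local stand-ins rather than the SDK's own Session.TurnDetection type; the server_vad fields threshold and silence_duration_ms are assumptions, while type: 'semantic_vad' and eagerness follow the usage example on this page.

```typescript
// Simplified local stand-ins for the SDK's turn detection configuration.
// The server_vad field names are assumptions; check Session.TurnDetection.
type TurnDetectionSketch =
  | { type: 'server_vad'; threshold?: number; silence_duration_ms?: number }
  | { type: 'semantic_vad'; eagerness?: 'low' | 'medium' | 'high' | 'auto' };

// Produces a short human-readable summary of the chosen mode.
function describeTurnDetection(td: TurnDetectionSketch): string {
  if (td.type === 'server_vad') {
    return `server_vad, silence window ${td.silence_duration_ms ?? 'unset'} ms`;
  }
  return `semantic_vad, eagerness ${td.eagerness ?? 'unset'}`;
}

const semanticSummary = describeTurnDetection({ type: 'semantic_vad', eagerness: 'medium' });
const serverSummary = describeTurnDetection({ type: 'server_vad', silence_duration_ms: 500 });
console.log(semanticSummary);
console.log(serverSummary);
```

Narrowing on the type field keeps mode-specific options from being mixed, which is the main reason to model the configuration this way.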
The SessionCreateResponse interface extends the session configuration with a client_secret object containing an ephemeral key (value) and its expiration timestamp (expires_at). The SessionCreateParams interface mirrors the session configuration and adds a client_secret parameter for customizing token expiration (between 10 and 7200 seconds).
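As an illustration of the expiration setting, the sketch below builds a client_secret parameter and enforces the 10 to 7200 second range before a request is made. The nested shape expires_after: { anchor: 'created_at', seconds } is an assumption about SessionCreateParams.ClientSecret and should be checked against the SDK types; clientSecretWithTtl is a hypothetical helper.

```typescript
// Assumed shape of the client_secret create parameter; verify against the
// SDK's SessionCreateParams.ClientSecret type before relying on it.
interface ClientSecretParamSketch {
  expires_after: { anchor: 'created_at'; seconds: number };
}

const MIN_TTL_SECONDS = 10;
const MAX_TTL_SECONDS = 7200;

// Hypothetical helper: rejects lifetimes outside the documented range.
function clientSecretWithTtl(seconds: number): ClientSecretParamSketch {
  if (!Number.isFinite(seconds) || seconds < MIN_TTL_SECONDS || seconds > MAX_TTL_SECONDS) {
    throw new RangeError(
      `seconds must be between ${MIN_TTL_SECONDS} and ${MAX_TTL_SECONDS}, got ${seconds}`,
    );
  }
  return { expires_after: { anchor: 'created_at', seconds } };
}

const tenMinuteSecret = clientSecretWithTtl(600);
console.log(JSON.stringify(tenMinuteSecret));
```

The result would be passed as the client_secret field of the create parameters, assuming the shape above matches the installed SDK version.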
Usage
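Use this resource to create ephemeral tokens for client-side Realtime API connections. Access it via client.beta.realtime.sessions. Create the session on a server that holds the standard API key, then pass only client_secret.value to the browser, which uses it to authenticate its WebSocket connection until the time given by client_secret.expires_at.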
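On the browser side, the ephemeral key is typically supplied when the WebSocket is opened. The sketch below only computes the arguments for new WebSocket(url, protocols). The wss://api.openai.com/v1/realtime URL, the model query parameter, and the three subprotocol strings are assumptions drawn from general Realtime API guidance, not from this SDK file, so confirm them against the current API documentation. realtimeSocketArgs is a hypothetical helper.

```typescript
// Hypothetical helper: returns the arguments for `new WebSocket(url, protocols)`.
// The URL and subprotocol strings are assumptions; verify before use.
function realtimeSocketArgs(
  ephemeralKey: string,
  model: string,
): { url: string; protocols: string[] } {
  const url = `wss://api.openai.com/v1/realtime?model=${encodeURIComponent(model)}`;
  const protocols = [
    'realtime',
    `openai-insecure-api-key.${ephemeralKey}`, // carries client_secret.value
    'openai-beta.realtime-v1',
  ];
  return { url, protocols };
}

const socketArgs = realtimeSocketArgs('ek_example', 'gpt-4o-realtime-preview');
console.log(socketArgs.url);
// In a browser: const ws = new WebSocket(socketArgs.url, socketArgs.protocols);
```

Only the short-lived key ever reaches the browser here; the standard API key stays on the server that created the session.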
Code Reference
Source Location
- Repository: openai-node
- File: src/resources/beta/realtime/sessions.ts
Signature
export class Sessions extends APIResource {
  create(body: SessionCreateParams, options?: RequestOptions): APIPromise<SessionCreateResponse>;
}
export interface Session {
  id?: string;
  input_audio_format?: 'pcm16' | 'g711_ulaw' | 'g711_alaw';
  input_audio_noise_reduction?: Session.InputAudioNoiseReduction;
  input_audio_transcription?: Session.InputAudioTranscription;
  instructions?: string;
  max_response_output_tokens?: number | 'inf';
  modalities?: Array<'text' | 'audio'>;
  model?: string;
  output_audio_format?: 'pcm16' | 'g711_ulaw' | 'g711_alaw';
  speed?: number;
  temperature?: number;
  tool_choice?: string;
  tools?: Array<Session.Tool>;
  tracing?: 'auto' | Session.TracingConfiguration;
  turn_detection?: Session.TurnDetection;
  voice?: string | 'alloy' | 'ash' | 'ballad' | 'coral' | 'echo' | 'sage' | 'shimmer' | 'verse';
}
export interface SessionCreateResponse {
  client_secret: SessionCreateResponse.ClientSecret;
  // ... session configuration fields
}
Import
import OpenAI from 'openai';
// Access via client.beta.realtime.sessions
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | No | The Realtime model (e.g., 'gpt-4o-realtime-preview') |
| modalities | Array<'text' \| 'audio'> | No | Modalities the model can respond with |
| voice | string | No | Voice for audio responses (alloy, ash, ballad, coral, echo, sage, shimmer, verse) |
| instructions | string | No | Default system instructions prepended to model calls |
| input_audio_format | 'pcm16' \| 'g711_ulaw' \| 'g711_alaw' | No | Format of input audio |
| output_audio_format | 'pcm16' \| 'g711_ulaw' \| 'g711_alaw' | No | Format of output audio |
| input_audio_transcription | InputAudioTranscription | No | Transcription configuration (model, language, prompt) |
| input_audio_noise_reduction | InputAudioNoiseReduction | No | Noise reduction type (near_field or far_field) |
| turn_detection | TurnDetection | No | Server VAD or Semantic VAD configuration |
| tools | Array<Tool> | No | Function tools available to the model |
| tool_choice | string | No | How the model chooses tools (auto, none, required) |
| temperature | number | No | Sampling temperature (0.6 to 1.2) |
| speed | number | No | Spoken response speed (0.25 to 1.5) |
| max_response_output_tokens | number \| 'inf' | No | Maximum output tokens per response |
| tracing | 'auto' \| TracingConfiguration | No | Tracing configuration for the session |
| client_secret | ClientSecret | No | Configuration for ephemeral token expiration |
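The numeric ranges in the table can be checked before a request is sent. The following is a minimal sketch, assuming both bounds are inclusive; checkTunableRanges and its parameter type are hypothetical and not part of the SDK.

```typescript
// Local stand-in for the two range-limited numeric inputs listed above.
interface TunableParams {
  temperature?: number;
  speed?: number;
}

// Hypothetical pre-flight check; returns a list of problems (empty if none).
// Bounds are treated as inclusive, which is an assumption.
function checkTunableRanges(p: TunableParams): string[] {
  const problems: string[] = [];
  if (p.temperature !== undefined && (p.temperature < 0.6 || p.temperature > 1.2)) {
    problems.push(`temperature ${p.temperature} is outside 0.6 to 1.2`);
  }
  if (p.speed !== undefined && (p.speed < 0.25 || p.speed > 1.5)) {
    problems.push(`speed ${p.speed} is outside 0.25 to 1.5`);
  }
  return problems;
}

const rangeProblems = checkTunableRanges({ temperature: 2, speed: 1 });
console.log(rangeProblems);
```

Omitted fields are skipped rather than flagged, since every input in the table is optional.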
Outputs
| Name | Type | Description |
|---|---|---|
| SessionCreateResponse | SessionCreateResponse | Session configuration plus ephemeral client_secret with value and expires_at |
| client_secret.value | string | Ephemeral API key for client-side authentication |
| client_secret.expires_at | number | Unix timestamp when the ephemeral key expires |
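Because the key is short-lived, a client may want to check expires_at before connecting. The sketch below assumes expires_at is expressed in seconds since the Unix epoch (not milliseconds); the helper names and the five-second safety margin are hypothetical.

```typescript
// Local stand-in for the client_secret object in the response.
interface EphemeralSecret {
  value: string;
  expires_at: number; // assumed to be Unix time in seconds
}

// Seconds remaining before the key expires (negative if already expired).
function secondsUntilExpiry(secret: EphemeralSecret, nowMs: number = Date.now()): number {
  return secret.expires_at - Math.floor(nowMs / 1000);
}

// True if the key has more than `marginSeconds` of lifetime left.
function isSecretUsable(
  secret: EphemeralSecret,
  nowMs: number = Date.now(),
  marginSeconds: number = 5, // hypothetical safety margin
): boolean {
  return secondsUntilExpiry(secret, nowMs) > marginSeconds;
}

const sampleSecret: EphemeralSecret = { value: 'ek_example', expires_at: 1_700_000_060 };
const sampleNowMs = 1_700_000_000_000; // fixed clock for a reproducible example
console.log(secondsUntilExpiry(sampleSecret, sampleNowMs), isSecretUsable(sampleSecret, sampleNowMs));
```

If the check fails, the client should request a fresh session from the server rather than attempt to connect with a stale key.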
Usage Examples
Basic Usage
import OpenAI from 'openai';

const client = new OpenAI();

// Create a realtime session with ephemeral token
const session = await client.beta.realtime.sessions.create({
  model: 'gpt-4o-realtime-preview',
  voice: 'alloy',
  modalities: ['text', 'audio'],
  instructions: 'You are a friendly assistant.',
  input_audio_format: 'pcm16',
  output_audio_format: 'pcm16',
  turn_detection: {
    type: 'semantic_vad',
    eagerness: 'medium',
  },
});

// Use the ephemeral key for client-side WebSocket connection
console.log(session.client_secret.value);
console.log(session.client_secret.expires_at);