February 4th, 2025

Introducing Azure OpenAI Realtime API Support in JavaScript

Deyaaeldeen Almahallawi
Software Engineer

We’re excited to announce the release of Realtime API support in the OpenAI library for JavaScript (v4.81.0), enabling developers to send and receive messages instantly from Azure OpenAI models. In this blog post, we explore how to configure, connect, and utilize this new capability to create highly interactive and responsive applications.


Why Realtime API support matters

Realtime APIs allow you to receive immediate responses from Azure OpenAI models, making them especially valuable for applications where quick feedback is essential. Whether you’re building a speech-to-speech experience, a streaming data processor, or a live monitoring tool, this feature empowers you to deliver an engaging user experience with minimal delay.


Get started

JavaScript has numerous runtimes including Node.js, browsers, and more, each with its own requirements. To cater to these various environments, the JavaScript library provides two clients for Realtime connections:

  1. OpenAIRealtimeWebSocket: Uses the native WebSocket web API, commonly supported in browsers and other environments adhering to web standards.
  2. OpenAIRealtimeWS: Uses the ws library, well suited to Node.js and similar server-side JavaScript environments.

Before you begin, make sure you have:

  • Node.js installed (if you plan to work in a Node.js runtime)
  • An Azure subscription with access to the Azure OpenAI service

Installation

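Use the following command to install the required packages. The ws package is needed only for the OpenAIRealtimeWS client, which the Node.js sample in this post uses:

npm install openai ws @azure/identity dotenv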

Set up the environment

Create a .env file in the root of your project and add the endpoint of your Azure OpenAI resource:

AZURE_OPENAI_ENDPOINT="<The endpoint of the Azure OpenAI resource>"
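
A missing or malformed endpoint tends to surface later as a confusing connection failure, so it can help to validate the variable at startup. The following is a minimal sketch under stated assumptions: readAzureEndpoint is a hypothetical helper, not part of the openai library, and the HTTPS requirement is an assumption about how you want to constrain the value.

```javascript
// Hypothetical helper: validates the endpoint variable before any client is created.
function readAzureEndpoint(env) {
  const raw = env.AZURE_OPENAI_ENDPOINT;
  if (!raw || raw.trim() === '') {
    throw new Error('AZURE_OPENAI_ENDPOINT is not set');
  }
  let url;
  try {
    url = new URL(raw.trim());
  } catch {
    throw new Error(`AZURE_OPENAI_ENDPOINT is not a valid URL: ${raw}`);
  }
  if (url.protocol !== 'https:') {
    throw new Error('AZURE_OPENAI_ENDPOINT must use https');
  }
  // Return only the origin so a trailing slash or path does not leak into later code.
  return url.origin;
}

// Usage with a placeholder resource name (replace with your own endpoint).
const endpoint = readAzureEndpoint({
  AZURE_OPENAI_ENDPOINT: 'https://my-resource.openai.azure.com/',
});
console.log(endpoint);
```

In an application you would likely pass process.env after dotenv has loaded the file.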

Code sample

This section provides a step-by-step walkthrough of how to use the Realtime API in the JavaScript library. We break it down so you can easily replicate it in your own environment.

Import modules

Begin by importing the relevant modules:

import { OpenAIRealtimeWS } from 'openai/beta/realtime/ws';
import { AzureOpenAI } from 'openai';
import { DefaultAzureCredential, getBearerTokenProvider } from '@azure/identity';
import 'dotenv/config';

Configure credentials

You need proper credentials to authenticate with the Azure OpenAI service. We use DefaultAzureCredential, which streamlines the process by automatically selecting the appropriate credential type based on your environment:

const cred = new DefaultAzureCredential();
const scope = 'https://cognitiveservices.azure.com/.default';
const azureADTokenProvider = getBearerTokenProvider(cred, scope);

Create the client

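Next, initialize the Azure OpenAI client with your deployment name and API version. Because no endpoint is passed explicitly, the client reads it from the AZURE_OPENAI_ENDPOINT environment variable: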

const deploymentName = 'gpt-4o-realtime-preview-1001';
const client = new AzureOpenAI({
  azureADTokenProvider,
  apiVersion: '2024-10-01-preview',
  deployment: deploymentName,
});

Establish the WebSocket connection

Use the client to create a WebSocket connection. In a browser environment, you would typically use OpenAIRealtimeWebSocket.azure(). For a Node.js environment with the ws library, you can use OpenAIRealtimeWS.azure(). Here’s the Node.js example:

const rt = await OpenAIRealtimeWS.azure(client);
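
To make the connection step less opaque, here is a sketch of the kind of WebSocket URL a Realtime connection to Azure targets, following the shape described in the Azure OpenAI documentation (an /openai/realtime path with api-version and deployment query parameters). The buildRealtimeUrl helper is hypothetical and the resource name is a placeholder; the library builds the URL and attaches authentication for you, so this is only useful for illustration or debugging.

```javascript
// Hypothetical helper: derives the Realtime WebSocket URL from the resource endpoint.
function buildRealtimeUrl(endpoint, apiVersion, deployment) {
  const url = new URL('/openai/realtime', endpoint);
  // Switch from https to the secure WebSocket scheme.
  url.protocol = 'wss:';
  url.searchParams.set('api-version', apiVersion);
  url.searchParams.set('deployment', deployment);
  return url.toString();
}

const realtimeUrl = buildRealtimeUrl(
  'https://my-resource.openai.azure.com', // placeholder endpoint
  '2024-10-01-preview',
  'gpt-4o-realtime-preview-1001',
);
console.log(realtimeUrl);
```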

Handle events

Event handlers let you control how your application responds to each stage of the real-time interaction life cycle, including connection establishment, message exchange, and error handling. The following steps show how to implement and manage these events.


1. Listen for the open event

When the WebSocket connection with the server is successfully established, the open event is triggered. At this point, you can begin sending messages and commands to the Azure OpenAI model. In this example, we update the session parameters and start a text conversation with the model.

rt.socket.on('open', () => {
  console.log('Connection opened!');

  rt.send({
    type: 'session.update',
    session: {
      modalities: ['text'],
      model: 'gpt-4o-realtime-preview',
    },
  });

  rt.send({
    type: 'conversation.item.create',
    item: {
      type: 'message',
      role: 'user',
      content: [{ type: 'input_text', text: 'Say a couple paragraphs!' }],
    },
  });

  // Signal that we're ready to receive a response from the model
  rt.send({ type: 'response.create' });
});

In this snippet:

  • session.update informs the service about any configuration changes (for example, the chosen model and the modalities to use).
  • conversation.item.create sends a user prompt to the model.
  • response.create indicates you want the model to begin generating a response immediately.
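
Because these client events are plain JSON objects, one way to keep the open handler short is to build them in a small helper and send them in order. The sketch below assumes a text-only turn; buildTextTurn is a hypothetical name, not a library function, and the sending step is shown only as a comment because it needs a live connection.

```javascript
// Hypothetical helper: builds the client events for one text-only user turn.
function buildTextTurn(prompt) {
  return [
    {
      type: 'conversation.item.create',
      item: {
        type: 'message',
        role: 'user',
        content: [{ type: 'input_text', text: prompt }],
      },
    },
    // Ask the model to start generating a response to the item above.
    { type: 'response.create' },
  ];
}

const events = buildTextTurn('Say a couple paragraphs!');
for (const event of events) {
  // With a live connection this would be: rt.send(event);
  console.log(JSON.stringify(event));
}
```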

2. Subscribe to session and response events

After initializing the session and sending conversation items, you’ll want to capture the model’s responses. The JavaScript library provides event listeners for these activities:

rt.on('session.created', (event) => {
  console.log('session created!', event.session);
  console.log();
});

rt.on('response.text.delta', (event) => process.stdout.write(event.delta));
rt.on('response.text.done', () => console.log());

rt.on('response.done', () => rt.close());

rt.socket.on('close', () => console.log('\nConnection closed!'));

In this snippet:

  • session.created indicates that the session is successfully set up on the server.
  • response.text.delta streams partial text output as it is generated, allowing you to handle or display responses in real time.
  • response.text.done fires when text generation for that particular response completes.
  • response.done signals that the entire response cycle is finished. Here, we close the WebSocket connection as a simple example, though you may choose to keep it open for further interactions.
  • close is an event on the underlying WebSocket (rt.socket.on('close')), telling you that the connection has closed, whether deliberately or unexpectedly.
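
If you need the complete text rather than a live stream, the delta events can be accumulated until the done event arrives. The following is a minimal sketch: collectText is a hypothetical helper, and FakeRealtime is a stand-in that only mimics the on() method so the example runs without a connection. With a real client you would pass rt instead.

```javascript
// Minimal stand-in for the realtime client: supports only on() and emit().
class FakeRealtime {
  constructor() {
    this.handlers = new Map();
  }
  on(name, handler) {
    const list = this.handlers.get(name) ?? [];
    list.push(handler);
    this.handlers.set(name, list);
    return this;
  }
  emit(name, event) {
    for (const handler of this.handlers.get(name) ?? []) handler(event);
  }
}

// Hypothetical helper: collects streamed text and calls onDone with the full string.
function collectText(rt, onDone) {
  let buffer = '';
  rt.on('response.text.delta', (event) => {
    buffer += event.delta;
  });
  rt.on('response.text.done', () => {
    onDone(buffer);
    buffer = ''; // Reset so the next response starts clean.
  });
}

const fake = new FakeRealtime();
const completed = [];
collectText(fake, (text) => completed.push(text));

// Simulate the server streaming two chunks and then finishing.
fake.emit('response.text.delta', { delta: 'Hello, ' });
fake.emit('response.text.delta', { delta: 'world.' });
fake.emit('response.text.done', {});
console.log(completed);
```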

3. Handle errors

In any network or service interaction, errors may occur. Ensuring that your application logs and handles these errors is crucial for stability and a smooth user experience:

rt.on('error', (err) => {
  // Log the error or handle it based on your application needs
  console.error('An error occurred:', err);
});
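
Beyond logging, a short-lived script may want to release the connection when something goes wrong so the process does not hang. The sketch below shows one possible policy, closing at most once on the first error. It is an assumption about your application's needs rather than a recommendation: some errors are recoverable, and a long-running application may prefer to keep the connection open. closeOnError and FakeRealtime are hypothetical names used so the example runs without a connection.

```javascript
// Minimal stand-in for the realtime client (hypothetical, for illustration only).
class FakeRealtime {
  constructor() {
    this.handlers = new Map();
    this.closeCalls = 0;
  }
  on(name, handler) {
    this.handlers.set(name, handler);
    return this;
  }
  emit(name, payload) {
    const handler = this.handlers.get(name);
    if (handler) handler(payload);
  }
  close() {
    this.closeCalls += 1;
  }
}

// Hypothetical helper: logs every error and closes the connection at most once.
function closeOnError(rt, log) {
  let closed = false;
  rt.on('error', (err) => {
    const message = err instanceof Error ? err.message : String(err);
    log(`Realtime error: ${message}`);
    if (!closed) {
      closed = true;
      rt.close();
    }
  });
}

const fakeRt = new FakeRealtime();
const logged = [];
closeOnError(fakeRt, (line) => logged.push(line));

// Simulate two errors arriving in a row.
fakeRt.emit('error', new Error('socket hang up'));
fakeRt.emit('error', new Error('second failure'));
console.log(logged, fakeRt.closeCalls);
```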

Conclusion

The introduction of Realtime API support in the OpenAI library for JavaScript provides developers with a powerful new way to create interactive, low-latency applications. With these capabilities, you can deliver enriched user experiences—be it live chatbots, streaming analytics, or real-time data processing tools. We hope this detailed guide helps you get started with building and experimenting in your own environment.

Stay tuned for future updates and enhancements to the library, and feel free to share your innovative uses of the Realtime API in the comments!

Next steps

To further expand your Realtime integration with Azure OpenAI, explore the following resources for more guidance and practical examples:

Author

Deyaaeldeen Almahallawi
Software Engineer

Deyaa is a software engineer working on Azure SDKs ranging from Event Hubs to Schema Registry to Text Analytics. He is passionate about library design and developer tools.
