{"id":3297,"date":"2025-02-04T21:35:07","date_gmt":"2025-02-05T05:35:07","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/azure-sdk\/?p=3297"},"modified":"2025-02-05T07:59:18","modified_gmt":"2025-02-05T15:59:18","slug":"introducing-azure-openai-realtime-api-support-in-javascript","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/azure-sdk\/introducing-azure-openai-realtime-api-support-in-javascript\/","title":{"rendered":"Introducing Azure OpenAI Realtime API Support in JavaScript"},"content":{"rendered":"<p>We&#8217;re excited to announce the release of Realtime API support in the <a href=\"https:\/\/www.npmjs.com\/package\/openai\">OpenAI library for JavaScript<\/a> (v4.81.0), enabling developers to send and receive messages instantly from Azure OpenAI models. In this blog post, we explore how to configure, connect, and utilize this new capability to create highly interactive and responsive applications.<\/p>\n<hr \/>\n<h2>Why Realtime API support matters<\/h2>\n<p>Realtime APIs allow you to receive immediate responses from Azure OpenAI models, making them especially valuable for applications where quick feedback is essential. Whether you&#8217;re building a speech-to-speech experience, a streaming data processor, or a live monitoring tool, this feature empowers you to deliver an engaging user experience with minimal delay.<\/p>\n<hr \/>\n<h2>Get started<\/h2>\n<p>JavaScript has numerous runtimes including Node.js, browsers, and more, each with its own requirements. 
To cater to these various environments, the JavaScript library provides two clients for Realtime connections:<\/p>\n<ol>\n<li><strong>OpenAIRealtimeWebSocket<\/strong>\nUses the native WebSocket web API, commonly supported in browsers and other environments adhering to web standards.<\/li>\n<li><strong>OpenAIRealtimeWS<\/strong>\nUtilizes the <code>ws<\/code> library, well-suited for Node.js and similar server-side JavaScript environments.<\/li>\n<\/ol>\n<p>Before you begin, make sure you have:<\/p>\n<ul>\n<li><strong>Node.js<\/strong> installed (if you plan to work in a Node.js runtime)<\/li>\n<li>An <strong>Azure subscription<\/strong> with access to the Azure OpenAI service<\/li>\n<\/ul>\n<h3>Installation<\/h3>\n<p>Use the following command to install the required packages:<\/p>\n<pre><code class=\"language-bash\">npm install openai @azure\/identity dotenv<\/code><\/pre>\n<h3>Set up the environment<\/h3>\n<p>Create a .env file in the root of your project and add your Azure OpenAI endpoint (no API key is needed, because this walkthrough authenticates with Microsoft Entra ID):<\/p>\n<pre><code>AZURE_OPENAI_ENDPOINT=\"&lt;The endpoint of the Azure OpenAI resource&gt;\"<\/code><\/pre>\n<h3>Code sample<\/h3>\n<p>This section provides a step-by-step walkthrough of how to use the Realtime API in the JavaScript library. We break it down so you can easily replicate it in your own environment.<\/p>\n<h4>Import modules<\/h4>\n<p>Begin by importing the relevant modules:<\/p>\n<pre><code class=\"language-js\">import { OpenAIRealtimeWS } from 'openai\/beta\/realtime\/ws';\r\nimport { AzureOpenAI } from 'openai';\r\nimport { DefaultAzureCredential, getBearerTokenProvider } from '@azure\/identity';\r\nimport 'dotenv\/config';<\/code><\/pre>\n<h4>Configure credentials<\/h4>\n<p>You need proper credentials to authenticate with the Azure OpenAI service. 
We use <code>DefaultAzureCredential<\/code>, which streamlines the process by automatically selecting the appropriate credential type based on your environment:<\/p>\n<pre><code class=\"language-js\">const cred = new DefaultAzureCredential();\r\nconst scope = 'https:\/\/cognitiveservices.azure.com\/.default';\r\nconst azureADTokenProvider = getBearerTokenProvider(cred, scope);<\/code><\/pre>\n<h4>Create the client<\/h4>\n<p>Next, initialize the Azure OpenAI client with your desired deployment name and API version:<\/p>\n<pre><code class=\"language-js\">const deploymentName = 'gpt-4o-realtime-preview-1001';\r\nconst client = new AzureOpenAI({\r\n  azureADTokenProvider,\r\n  apiVersion: '2024-10-01-preview',\r\n  deployment: deploymentName,\r\n});<\/code><\/pre>\n<h4>Establish the WebSocket connection<\/h4>\n<p>Use the client to create a WebSocket connection. In a browser environment, you would typically use <code>OpenAIRealtimeWebSocket.azure()<\/code>. For a Node.js environment with the <code>ws<\/code> library, you can use <code>OpenAIRealtimeWS.azure()<\/code>. Here&#8217;s the Node.js example:<\/p>\n<pre><code class=\"language-js\">const rt = await OpenAIRealtimeWS.azure(client);<\/code><\/pre>\n<h4>Handle events<\/h4>\n<p>Event handlers allow you to orchestrate how your application responds to various stages of the real-time interaction life cycle, including connection establishment, message exchange, and error handling. A detailed explanation of how to implement and manage these events follows next.<\/p>\n<hr \/>\n<h5>1. Listen for the <code>open<\/code> event<\/h5>\n<p>When the WebSocket connection is successfully established by the server, the <code>open<\/code> event is triggered. At this point, you can begin sending messages and commands to the Azure OpenAI model immediately. 
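<\/p>\n<p>Each call to <code>rt.send<\/code> takes a plain JSON event object. To make the payload shape concrete, here is a small sketch of a helper that builds the user-message event; the <code>userTextEvent<\/code> name is our own invention, not part of the library, though the payload shape matches the example that follows:<\/p>

```javascript
// Hypothetical helper (not part of the openai library): builds a
// 'conversation.item.create' event carrying a single user text prompt.
function userTextEvent(text) {
  return {
    type: 'conversation.item.create',
    item: {
      type: 'message',
      role: 'user',
      content: [{ type: 'input_text', text }],
    },
  };
}

// The resulting object is what gets passed to rt.send(...).
const evt = userTextEvent('Say a couple paragraphs!');
console.log(evt.type); // → conversation.item.create
```

<p>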
In this example, we&#8217;re updating the session parameters and initiating a text conversation with the model.<\/p>\n<pre><code class=\"language-js\">rt.socket.on('open', () =&gt; {\r\n  console.log('Connection opened!');\r\n\r\n  rt.send({\r\n    type: 'session.update',\r\n    session: {\r\n      modalities: ['text'],\r\n      model: 'gpt-4o-realtime-preview',\r\n    },\r\n  });\r\n\r\n  rt.send({\r\n    type: 'conversation.item.create',\r\n    item: {\r\n      type: 'message',\r\n      role: 'user',\r\n      content: [{ type: 'input_text', text: 'Say a couple paragraphs!' }],\r\n    },\r\n  });\r\n\r\n  \/\/ Signal that we're ready to receive a response from the model\r\n  rt.send({ type: 'response.create' });\r\n});<\/code><\/pre>\n<p>In this snippet:<\/p>\n<ul>\n<li><code>session.update<\/code> informs the service about any configuration changes (for example, chosen model, input modalities).<\/li>\n<li><code>conversation.item.create<\/code> sends a user prompt to the model.<\/li>\n<li><code>response.create<\/code> indicates you want the model to begin generating a response immediately.<\/li>\n<\/ul>\n<h5>2. Subscribe to session and response events<\/h5>\n<p>After initializing the session and sending conversation items, you&#8217;ll want to capture the model&#8217;s responses. 
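<\/p>\n<p>Because the model&#8217;s text is delivered as a stream of incremental delta events rather than one final string, a common pattern is to accumulate the pieces into a full transcript. The following is a minimal sketch under that assumption; the <code>TextAccumulator<\/code> name and shape are our own, not part of the library:<\/p>

```javascript
// Hypothetical accumulator (not part of the openai library): collects
// partial text deltas and exposes the assembled transcript.
class TextAccumulator {
  constructor() {
    this.parts = [];
  }
  // Intended to be called from a delta handler, e.g.
  // rt.on('response.text.delta', (event) => acc.push(event.delta));
  push(delta) {
    this.parts.push(delta);
  }
  text() {
    return this.parts.join('');
  }
}

// Simulated stream of partial deltas:
const acc = new TextAccumulator();
for (const delta of ['Hello', ', ', 'world!']) acc.push(delta);
console.log(acc.text()); // → Hello, world!
```

<p>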
The JavaScript library provides event listeners for these activities:<\/p>\n<pre><code class=\"language-js\">rt.on('session.created', (event) =&gt; {\r\n  console.log('session created!', event.session);\r\n  console.log();\r\n});\r\n\r\nrt.on('response.text.delta', (event) =&gt; process.stdout.write(event.delta));\r\nrt.on('response.text.done', () =&gt; console.log());\r\n\r\nrt.on('response.done', () =&gt; rt.close());\r\n\r\nrt.socket.on('close', () =&gt; console.log('\\nConnection closed!'));<\/code><\/pre>\n<ul>\n<li><code>session.created<\/code> indicates that the session has been successfully set up on the server.<\/li>\n<li><code>response.text.delta<\/code> streams partial text output as it is generated, allowing you to handle or display responses in real time.<\/li>\n<li><code>response.text.done<\/code> fires when the text generation process for that particular response completes.<\/li>\n<li><code>response.done<\/code> signals that the entire response cycle is finished. Here, we close the WebSocket connection as a simple example, though you may choose to keep it open for further interactions.<\/li>\n<li><code>close<\/code> is an event on the underlying WebSocket (<code>rt.socket.on('close')<\/code>), telling you that the connection was closed, whether deliberately or unexpectedly.<\/li>\n<\/ul>\n<h5>3. Handle errors<\/h5>\n<p>In any network or service interaction, errors may occur. Ensuring that your application logs and handles these errors is crucial for stability and a smooth user experience:<\/p>\n<pre><code class=\"language-js\">rt.on('error', (err) =&gt; {\r\n  \/\/ Log the error or handle it based on your application needs\r\n  console.error('An error occurred:', err);\r\n});<\/code><\/pre>\n<h2>Conclusion<\/h2>\n<p>The introduction of Realtime API support in the OpenAI library for JavaScript provides developers with a powerful new way to create interactive, low-latency applications. 
With these capabilities, you can deliver enriched user experiences\u2014be it live chatbots, streaming analytics, or real-time data processing tools. We hope this detailed guide helps you get started with building and experimenting in your own environment.<\/p>\n<p>Stay tuned for future updates and enhancements to the library, and feel free to share your innovative uses of the Realtime API in the comments!<\/p>\n<h2>Next steps<\/h2>\n<p>To further expand your Realtime integration with Azure OpenAI, explore the following resources for more guidance and practical examples:<\/p>\n<ul>\n<li><strong>Samples<\/strong>\nGet hands-on experience by reviewing sample projects in the official OpenAI Node.js repository:\n<a href=\"https:\/\/github.com\/openai\/openai-node\/tree\/master\/examples\/azure\/realtime\">https:\/\/github.com\/openai\/openai-node\/tree\/master\/examples\/azure\/realtime<\/a><\/li>\n<li><strong>Use with Audio<\/strong>\nLearn how to incorporate Realtime audio capabilities with Azure OpenAI through streaming input and output audio data:\n<a href=\"https:\/\/learn.microsoft.com\/azure\/ai-services\/openai\/how-to\/realtime-audio\">https:\/\/learn.microsoft.com\/azure\/ai-services\/openai\/how-to\/realtime-audio<\/a><\/li>\n<li><strong>API Reference<\/strong>\nConsult the official documentation for detailed information about all available endpoints, parameters, and data structures:\n<a href=\"https:\/\/learn.microsoft.com\/azure\/ai-services\/openai\/realtime-audio-reference\">https:\/\/learn.microsoft.com\/azure\/ai-services\/openai\/realtime-audio-reference<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Introducing the new Realtime API support in the OpenAI JavaScript library, enabling developers to create highly interactive and responsive applications by instantly sending and receiving messages from Azure OpenAI 
models.<\/p>\n","protected":false},"author":32500,"featured_media":3299,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[934,750,159,923,931,733],"class_list":["post-3297","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-azure-sdk","tag-azure-openai","tag-azure-sdk","tag-javascript","tag-js","tag-openai","tag-typescript"],"acf":[],"blog_post_summary":"<p>Introducing the new Realtime API support in the OpenAI JavaScript library, enabling developers to create highly interactive and responsive applications by instantly sending and receiving messages from Azure OpenAI models.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/posts\/3297","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/users\/32500"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/comments?post=3297"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/posts\/3297\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/media\/3299"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/media?parent=3297"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/categories?post=3297"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sdk\/wp-json\/wp\/v2\/tags?post=3297"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel
}","templated":true}]}}