OpenAI Upgrades APIs for Real-Time and Voice AI

OpenAI has released several upgrades to its API, including a faster GPT Realtime 1.5 model to reduce latency in text-based agents. According to a podcast report, the update also includes WebSockets support that speeds up multi-tool AI agents by 20-40% and an audio model with 10% better transcription accuracy, enhancing capabilities for voice-enabled commerce.

- The new `gpt-realtime-1.5` audio model improves alphanumeric transcription accuracy by over 10%, boosts performance on audio-based reasoning tasks by 5%, and follows instructions 7% more accurately, according to OpenAI's internal tests. - WebSocket support creates a persistent connection to the API, eliminating the need to re-send the entire conversation history with each turn. This method is most effective for complex agentic workflows that involve 20 or more tool calls, reducing end-to-end latency in those scenarios by up to 40%. - The underlying GPT-4o architecture, on which the real-time models are built, has an average voice latency of 320 milliseconds, a significant reduction from GPT-4's 5.4 seconds, making AI conversations feel closer to human response times. - While the model is faster, pricing for gpt-realtime-1.5 remains the same as the previous version. The cost for audio processing is $32 per million input tokens and $64 per million output tokens, which is eight times higher than the text input pricing ($4 per million tokens). - The improved transcription accuracy is particularly beneficial for voice-enabled commerce, where understanding addresses, product codes, or payment details is critical. Early adopters have reported that the upgrades cut phone call errors in half and significantly improved the system's ability to handle user interruptions. - The WebSocket connection maintains state in a temporary, in-memory cache, which allows for faster follow-up responses. However, this state is volatile; if the connection drops, the context is lost unless developers build their own logic to reconstruct the conversation.

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.