Posted by Sean-Der 13 hours ago

How OpenAI delivers low-latency voice AI at scale (openai.com)
385 points | 120 comments | page 2
qrush 12 hours ago|
Am I reading this right that OpenAI is not using Livekit for WebRTC/audio anymore?
fidotron 11 hours ago|
It does appear that way. The LiveKit server is not what you would want for this architecture anyway (as they basically say with the SFU discussion), although it does have a lot of useful stuff in the client SDKs.
fuddle 11 hours ago|||
They do link to the Livekit docs in the footnotes: https://docs.livekit.io/transport/self-hosting/kubernetes/
zuzululu 8 hours ago|||
What's wrong with LiveKit?
whateveracct 2 hours ago||
why is the "How" included here? it is often removed
hiroakiaizawa 3 hours ago||
Interesting. What are the main latency bottlenecks in practice?
logickkk1 11 hours ago||
IMO this probably isn't just about latency. keeping people in voice gives them training data text never will. is that why they were fine going transceiver over sfu and mostly ignoring multi-party?
hnav 9 hours ago||
RFC 9297 support can't come quick enough in browsers. Would obviate having to deal with WebRTC in a client-server scenario.
charisma123 12 hours ago||
If a transceiver crashes during a stream, how is the active session recovered? Does the system automatically re-establish the context in a new WebRTC session?
Sean-Der 12 hours ago|
It doesn't today, but you could with something like this [0]. You can save/suspend all WebRTC state and bring it back with the next process.

[0] https://github.com/pion/webrtc-zero-downtime-restart

furyofantares 12 hours ago||
> Global reach for more than 900 million weekly active users

lol, definitely didn't need to know there's 900M weekly users for this post. I mean yeah, there's a lot of users and they serve globally, that's relevant. But this is just pulling out your biggest stat because you can. How many voice users you have would actually be relevant and interesting but, to baselessly speculate on motivation here, might be a number that doesn't add as much fuel to an upcoming IPO as reminding people that you're almost at a billion users does.

anzerarkin 13 hours ago||
I hate the voice ai though, it's so much dumber
brett-jackson 5 hours ago||
I used to use it all the time until about a year ago or so. Its responses are full of filler and the safeguards are really overbearing. It often will just give wrong answers in a way that GPT-5.x does not. I once asked it why a particular celebrity was canceled and it refused to tell me because it may harm me to know what they said!
NikolaNovak 12 hours ago||
Fwiw - I found the advanced AI voice feature to be actually detrimental. It's good if you just want a single sentence answer. I've turned it off though when I want a more detailed, structured, considered answer.
drusepth 12 hours ago||
Interestingly, that kind of parallels the real world too: if you want a quick and high level answer, talk to someone in person; if you want something detailed and info-dense, get them to write it down.
CrzyLngPwd 12 hours ago|
It's bad enough having to speed-read the waffle of its written answers; even when told to be concise, the thought of having to listen to it waffle on in its smarmy, sycophantic fashion makes me want to reach for the sick bag.