Posted by AndreSlavescu 14 hours ago
We at Hathora have recently released our ultra-low-latency deployment of Qwen/Qwen3-Omni-30B-A3B-Instruct, one of the leading open-source speech-to-speech-capable models.
Platform release:
https://models.hathora.dev/model/qwen3-omni
The release got us thinking: what actually happens when you record audio and get an audio response back? So we built a visualization website that gives a high-level overview of the individual pieces that make speech-to-speech inference possible in Qwen3-Omni.
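At a high level, the speech-to-speech path can be pictured as a pipeline of stages: encode the input audio into tokens, generate a response, predict discrete speech tokens, and decode them back into a waveform. Here is a minimal illustrative sketch in Python; the stage names loosely follow Qwen's published Thinker-Talker design, but every function below is a placeholder stand-in, not the real model:

```python
# Illustrative sketch of a speech-to-speech pipeline.
# Each stage is a placeholder; a real deployment runs neural networks here.

def encode_audio(waveform: list[float]) -> list[int]:
    # Audio encoder: raw samples -> audio feature tokens (placeholder).
    return [int(abs(s) * 100) for s in waveform]

def thinker(audio_tokens: list[int]) -> str:
    # "Thinker": multimodal LLM that produces the text response (placeholder).
    return f"response to {len(audio_tokens)} audio tokens"

def talker(text: str) -> list[int]:
    # "Talker": predicts discrete speech codec tokens from the text (placeholder).
    return [ord(c) % 64 for c in text]

def decode_to_waveform(codec_tokens: list[int]) -> list[float]:
    # Codec decoder: speech tokens -> output waveform samples (placeholder).
    return [t / 64.0 for t in codec_tokens]

def speech_to_speech(waveform: list[float]) -> list[float]:
    # Chain the stages: audio in -> audio out.
    return decode_to_waveform(talker(thinker(encode_audio(waveform))))

out = speech_to_speech([0.1, -0.2, 0.3])
```

The visualization breaks the flow down along roughly these lines, so you can see where time is spent between your microphone and the audio you hear back.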
Feel free to try it out, send us any feedback, and give the platform a spin!