
Posted by lcolucci 1/27/2026

Show HN: LemonSlice – Upgrade your voice agents to real-time video

Hey HN, we're the co-founders of LemonSlice (try our HN playground here: https://lemonslice.com/hn). We train interactive avatar video models. Our API lets you upload a photo and immediately jump into a FaceTime-style call with that character. Here's a demo: https://www.loom.com/share/941577113141418e80d2834c83a5a0a9

Chatbots are everywhere and voice AI has taken off, but we believe video avatars will be the most common form factor for conversational AI. Most people would rather watch something than read it. The problem is that generating video in real-time is hard, and overcoming the uncanny valley is even harder.

We haven’t broken the uncanny valley yet. Nobody has. But we’re getting close and our photorealistic avatars are currently best-in-class (judge for yourself: https://lemonslice.com/try/taylor). Plus, we're the only avatar model that can do animals and heavily stylized cartoons. Try it: https://lemonslice.com/try/alien. Warning! Talking to this little guy may improve your mood.

Today we're releasing our new model*, Lemon Slice 2, a 20B-parameter diffusion transformer that generates infinite-length video at 20 fps on a single GPU, and opening up our API.

How did we get a video diffusion model to run in real-time? There was no single trick, just a lot of them stacked together. The first big change was making our model causal. Standard video diffusion models are bidirectional (they look at frames both before and after the current one), which means you can't stream.
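The bidirectional-vs-causal distinction above comes down to the attention mask. A minimal sketch (not LemonSlice's actual code) of why a causal mask enables streaming:

```python
# Sketch: bidirectional vs. causal attention over video frames.
# With a causal mask, frame t attends only to frames <= t, so each frame
# can be emitted as soon as it is denoised. A bidirectional mask makes
# every frame depend on future frames, so nothing streams.
import numpy as np

def attention_mask(num_frames: int, causal: bool) -> np.ndarray:
    """Boolean mask: mask[i, j] is True if frame i may attend to frame j."""
    if causal:
        return np.tril(np.ones((num_frames, num_frames), dtype=bool))
    return np.ones((num_frames, num_frames), dtype=bool)

bi = attention_mask(4, causal=False)
ca = attention_mask(4, causal=True)
# Bidirectional: frame 0 attends to frame 3 (a future frame).
# Causal: it does not, so frame 0 can be shipped immediately.
assert bi[0, 3] and not ca[0, 3]
```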

From there it was about fitting everything on one GPU. We switched from full to sliding window attention, which killed our memory bottleneck. We distilled from 40 denoising steps down to just a few - quality degraded less than we feared, especially after using GAN-based distillation (though tuning that adversarial loss to avoid mode collapse was its own adventure).
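Sliding-window attention kills the memory bottleneck because each frame only keeps a fixed number of past frames in view. A rough illustration (assumptions mine, not the production mask):

```python
# Sketch: causal sliding-window attention mask. Frame i attends only to
# frames in (i - window, i], so the KV cache holds at most `window` frames
# regardless of how long the video runs -- memory is O(window), not O(T).
import numpy as np

def sliding_window_mask(num_frames: int, window: int) -> np.ndarray:
    """mask[i, j] is True if frame i may attend to frame j."""
    i = np.arange(num_frames)[:, None]
    j = np.arange(num_frames)[None, :]
    return (j <= i) & (j > i - window)

m = sliding_window_mask(6, window=3)
# Frame 5 sees only frames 3, 4, 5; frames 0-2 can be evicted from cache.
assert m[5].tolist() == [False, False, False, True, True, True]
```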

And the rest was inference work: modifying RoPE from complex to real (this one was cool!), precision tuning, fusing kernels, a special rolling KV cache, lots of other caching, and more. We kept shaving off milliseconds wherever we could and eventually got to real-time.
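The "complex to real" RoPE change is easy to sketch: rotary embeddings rotate pairs of features by a position-dependent angle, and that rotation can be written either as a complex multiply or as pure cos/sin arithmetic. The two are mathematically identical; the real form avoids complex dtypes, which tend to play better with fused kernels and low-precision inference. A toy demonstration of the equivalence (not their implementation):

```python
# Sketch: RoPE as a complex multiply vs. the equivalent real-valued form.
import numpy as np

def rope_complex(x: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """Rotate feature pairs (x[2k], x[2k+1]) by theta via complex multiply."""
    z = x[..., 0::2] + 1j * x[..., 1::2]
    z = z * np.exp(1j * theta)
    out = np.empty_like(x)
    out[..., 0::2], out[..., 1::2] = z.real, z.imag
    return out

def rope_real(x: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """Same rotation using only real ops (cos/sin)."""
    c, s = np.cos(theta), np.sin(theta)
    x0, x1 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x0 * c - x1 * s
    out[..., 1::2] = x0 * s + x1 * c
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))       # 4 positions, head dim 8
theta = rng.standard_normal((4, 4))   # one angle per feature pair
assert np.allclose(rope_complex(x, theta), rope_real(x, theta))
```

Rotating by theta = 0 is the identity, which is a quick sanity check on either form.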

We set up a guest playground for HN so you can create and talk to characters without logging in: https://lemonslice.com/hn. For those who want to build with our API (we have a new LiveKit integration that we’re pumped about!), grab a coupon code in the HN playground for your first Pro month free ($100 value). See the docs: https://lemonslice.com/docs. Pricing is usage-based at $0.12-0.20/min for video generation.

Looking forward to your feedback!

EDIT: Tell us what characters you want to see in the comments and we can make them for you to talk to (e.g. Max Headroom)

*We did a Show HN last year for our V1 model: https://news.ycombinator.com/item?id=43785044. It was technically impressive but so bad compared to what we have today.

133 points | 132 comments
pbhjpbhj 1/28/2026|
Sounds like an innovative approach, any IP protection on your tech?

Have your early versions made any sort of profit?

Absolutely amazing stuff to me. A teenager I very briefly showed it to was nonplussed - 'it's a talking head, isn't that really easy to do' ...

andrew-w 1/28/2026|
Haha, I kind of get that reaction. Convincing the world "this was hard to do" is generally not easy. Re: user uploads, we're operating in good faith at the moment (no built-in IP moderation). This hasn't been an issue so far. Current pricing reflects our operating costs. Each end-user gets a dedicated GPU for the duration of a call, which is expensive. Advancements on the model-side should eventually allow us to parallelize this.
bennyp101 1/27/2026||
Heads up, your privacy policy[0] does not work in dark mode - I was going to comment saying it made no sense, then I highlighted the page and more text appeared :)

[0] https://lemonslice.com/privacy

sid-the-kid 1/27/2026||
Fix deployed! This is why it's good to launch on Hacker News. Thanks for the tip.
bennyp101 1/27/2026||
Nice one - thanks :)
sid-the-kid 1/27/2026||
Good catch! Working on a fix now.
FatalLogic 1/28/2026||
Your demo video defaults to play at 1.5x speed

You probably didn't intend to do that

lcolucci 1/28/2026|
Whoops, I actually did set that on purpose. I guess I like watching things sped up and assumed others did too :) But you can change it.
mdrzn 1/28/2026||
"we're releasing our new model" is it downloadable and runnable in local? Could I create a "vTuber" persona with this model?
andrew-w 1/28/2026|
We have not released the weights, but it is fully available to use in your websites or applications. I can see how our wording there could be misconstrued -- sorry about that. You can absolutely create a vTuber persona. The link in the post is still live if you want to create one (as simple as uploading an image, selecting a voice, and defining the personality). We even have a prebuilt UI you can embed in a website, just like a YouTube video.
beast200 1/30/2026||
That's really impressive!
lcolucci 7 days ago|
Thank you!
slake 1/29/2026||
That's amazing. Feels like a major step ahead. No lag, very snappy. Outstanding work.

Feels like those sci-fi shows where you can talk to Hari Seldon even though he lived like 100 years ago.

My prediction, this will become really, really big.

korneelf1 1/27/2026||
Wow this is really cool, haven't seen real-time video generation that is this impressive yet!
lcolucci 1/27/2026|
Thank you so much! It's been a lot of fun to build
r0fl 1/27/2026||
Where’s the hn playground to grab a free month?

I have so many websites that would do well with this!

dang 1/27/2026||
(We've replaced the link to their homepage (https://lemonslice.com/) with the HN playground at the start of the text above)
lcolucci 1/27/2026||
Thanks Dan! The HN playground lets anyone try it out for free without logging in.
lcolucci 1/27/2026||
https://lemonslice.com/hn - There's a button for "Get 1st month free" in the Developer Quickstart
buddycorp 1/27/2026||
I'm curious if I can plug in my own OpenAI realtime voice agents into this.
lcolucci 1/27/2026||
Good question! Yes and to do this you'd want to use our "Self-Managed Pipeline": https://lemonslice.com/docs/self-managed/overview. You can combine any TTS, LLM and STT combination with LemonSlice as the avatar layer.
jfaat 1/27/2026|||
I'm using an OpenAI realtime voice with LiveKit, and they said they have a LiveKit integration, so it would probably be doable that way. I haven't used video in LiveKit though, and I don't know how the plugins are set up for it.
lcolucci 1/27/2026||
Yes this is exactly right. Using the LiveKit integration you can add LemonSlice as an avatar layer on top of any voice provider
tmshapland 1/27/2026||
Here's the link to the LiveKit LemonSlice plugin. It's very easy to get started. https://docs.livekit.io/agents/models/avatar/plugins/lemonsl...
sid-the-kid 1/27/2026||
Good question. When using the API, you can bring any voice agent (or LLM). Our API takes in what the agent will say, and then streams back the video of the agent saying it.

For the fully hosted version, we are currently partnered with ElevenLabs.

koakuma-chan 1/27/2026|
> You're probably thinking, how is this useful

I was wondering why the quality is so poor.

sid-the-kid 1/27/2026|
Curious which avatar you think is poor quality? Or what specifically looks poor to you. I want to know :)
koakuma-chan 1/27/2026||
Low res and low fps. Not sure if lipsync is poor, or if low fps makes it look poor. Voice sounds low quality, as if recorded on a bad mic, and doesn't feel like it matches the avatar.
sid-the-kid 1/27/2026||
Thanks for the feedback, that's helpful. Ya, some avatars have worse lip sync than others. It depends a little on how zoomed in you are.

I am double checking now to make 100% sure we return the original audio (and not the encoded/decoded audio).

We are working on high-res.

koakuma-chan 1/28/2026||
Good luck.