MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second

Posted by gainsurier 7 hours ago

MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second (mimo.xiaomi.com)

418 points | 289 comments

holoduke 6 hours ago

Speed is indeed the next big thing that should happen with frontier LLMs. Current models running 1000 times faster would be super useful. Earlier this week it took Claude at least a week, running full time on two Max subscriptions, to solve a complex issue where we wanted to mimic an occlusion mapping variant used in the game Crimson Desert. Pretty complex mathematical challenge. With an ultra-fast LLM and a proper self-verification process it would be awesome.

MaxikCZ 2 hours ago

I'd also be interested in more details, like the sibling comment. I find that when I try to build stuff, it's like building a skyscraper from straw. What methods are moving you forward the most?

astlouis44 3 hours ago

Interesting. For your occlusion mapping variant, what engine is the game you're implementing this for made with? Do you have Claude hooked up to Unity or Unreal?

trilogic 6 hours ago

Pfff, time wasting.

1. Password between 8-16 characters, and this and that... What???
2. Captcha after captcha, come on.
3. Service unavailable: "This service is not available in your region yet."

Are you kidding me? Come back when you are ready for the users. I was hoping to try it, what a frustration.

qsera 6 hours ago

Tokens per second is the "Megapixels" of AI marketing!

Octoth0rpe 6 hours ago

I mean, sure, in the sense that it's a real and meaningful number for most of the spectrum on offer, and only gets silly when the number gets too high? There's a pretty big usability difference between 10 t/s and 100 t/s, and I can imagine similarly for 100 -> 1000. I don't know about > 1000, but let's not pretend that the number is meaningless.

aburayhanalif 4 hours ago

it is good i think

desireco42 5 hours ago

I didn't use their Pro speed tier but regular Mimo-v2.5, not even Pro, and it seems really fast. I have plenty of tokens and subscriptions but this is really impressive. I really don't need another one, but I am tempted simply because it works so fast. I can't imagine how fast this fast service can be.

GaggiX 6 hours ago

If MiMo v2.5 Pro can run at >1000tk/s on GPUs then I will soon expect the same from OpenAI/Anthropic/Google.

slopinthebag 6 hours ago

I hope this is the next frontier AI labs push. Even the open models are smart enough, and they’re cheap enough. Now if they can be fast enough, they can make certain workflows possible and allow us to remain in a flow state while we use them.

m00dy 6 hours ago

boom!
