Kimi K2.5 Technical Report [pdf]

Posted by vinhnx 7 days ago

Kimi K2.5 Technical Report [pdf](github.com)

387 points | 141 comments | page 2

derac 7 days ago|

I really like the agent swarm thing. Is it possible to use that functionality with OpenCode, or is that a Kimi CLI-specific thing? Does the agent need to be aware of the capability?

zeroxfe 7 days ago||

It seems to work with OpenCode, but I can't tell exactly what's going on -- I was super impressed when OpenCode presented me with a UI to switch the view between different sub-agents. I don't know if OpenCode is aware of the capability, or the model is really good at telling the harness how to spawn sub-agents or execute parallel tool calls.

esafak 7 days ago||

Has anyone tried it and decided it's worth the cost? I've heard it's even more profligate with tokens.

swyx 7 days ago||

Yes: https://x.com/swyx/status/2016381014483075561?s=20 It's not crazy; they cap it at 3 credits. Also, YSK agent swarm is a closed-source product.

Would I use it again compared to Deep Research products elsewhere? Maybe, probably not, but only because it's hard to switch apps.

epolanski 7 days ago||

It's interesting to note that OpenAI is valued almost 400 times more than Moonshot AI, despite their models being surprisingly close.

famouswaffles 7 days ago||

OpenAI is a household name with nearly a billion weekly active users. Not sure there's any reality where they wouldn't be valued much more than Kimi regardless of how close the models may be.

m3kw9 7 days ago|||

Unless they can beat their capabilities by a clear, magical step up and have the infrastructure to capture the users.

moffkalast 7 days ago||

Well to be the devil's advocate: One is a household name that holds most of the world's silicon wafers for ransom, and the other sounds like a crypto scam. Also estimating valuation of Chinese companies is sort of nonsense when they're all effectively state owned.

epolanski 7 days ago||

There isn't a single % that is state owned in Moonshot AI.

And don't get me started on the "yeah but if the PRC" argument, because it's gross when the US can de facto ban and impose conditions even on European companies, let alone the control it has over US ones.

moffkalast 7 days ago|||

I'm not sure that is accurate: most of the funding they've got is from Tencent and Alibaba [0], and we know what happened to Jack Ma the second he went against the party line. These two are de facto state-owned enterprises. Moonshot is unlikely to be for sale in any meaningful way, so its valuation is moot.

[0] https://en.wikipedia.org/wiki/Moonshot_AI#Funding_and_invest...

swyx 7 days ago|||

Funny because that's how us Americans feel about your European cookie banner litter and unilateral demands on privacy

margorczynski 7 days ago||

I wonder how K2.5 + OpenCode compares to Opus with CC. If it is close, I would let go of my subscription, as probably would a lot of people.

eknkc 7 days ago||

It is not Opus. It is good, works really fast, and is surprisingly thorough about its decisions. However, I've seen it hallucinate things.

Just today I asked for a code review and it flagged a method that can be `static`. The problem is it was already static. That kind of stuff never happens with Opus 4.5 as far as I can tell.

Also, in OpenCode's Plan mode (read only), it generated a plan and, instead of presenting it and stopping, decided to implement it. It could not use the edit and write tools because the harness was in read-only mode, but it had bash and started using bash to edit stuff. It wouldn't just fucking stop even though the error messages it received from opencode stated why. Its plan and the resulting code were OK, so I let it go crazy though...

esafak 7 days ago|||

Some models have a mind of their own. I keep them on a leash with `permission` blocks in OC -- especially for rm/mv/git.
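For anyone wanting to try this, here is a minimal sketch of what such a `permission` block might look like, generated from Python so it can be checked before being saved as `opencode.json`. The glob patterns, the `ask`/`allow` values, and the ordering (catch-all first, specific rules after) are assumptions based on my reading of the OpenCode config format, not something confirmed in this thread, so check the current docs before relying on it.

```python
import json


def build_opencode_config():
    """Return a hypothetical OpenCode config that gates risky shell commands.

    Assumption: OpenCode reads a top-level "permission" object where "edit"
    and "bash" accept "ask" / "allow" / "deny", and "bash" may also map
    command patterns to one of those values.
    """
    return {
        "$schema": "https://opencode.ai/config.json",
        "permission": {
            # Ask before any file edit made through the edit/write tools.
            "edit": "ask",
            "bash": {
                # Catch-all first, more specific patterns afterwards.
                "*": "allow",
                # Prompt before destructive or history-changing commands.
                "rm *": "ask",
                "mv *": "ask",
                "git *": "ask",
            },
        },
    }


# Serialize with stable key order so the rule ordering above is preserved.
config_text = json.dumps(build_opencode_config(), indent=2)
print(config_text)
```

Gating `bash` as well as `edit` matters for the case described above, where the model fell back to shell commands once the edit tools were blocked.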

jauntywundrkind 7 days ago|||

I've been drafting plans/specs in parallel with Opus and Kimi, then asking each to review the other's plan.

I still find Opus is "sharper" technically, tackles problems more completely & gets the nuance.

But man, Kimi K2.5 can write. Even if I don't have a big problem description, just a bunch of specs, Kimi is there, writing good intro material, with good text that more than elaborates, that actually explains. Opus and GLM-4.7 have both complimented Kimi on its writing.

Still mainly using my z.ai glm-4.7 subscription for the work, so I don't know how capable it really is. But I do tend to go for some Opus in sticky spots, and especially given the 9x price difference, I should try some Kimi. I wish I was set up for better parallel evaluation; feels like such a pain to get started.

naragon 7 days ago|||

I've been using K2.5 with OpenCode to do code assessments/fixes and Opus 4.5 with CC to check the work, and so far so good. Very impressed with it so far, but I don't feel comfortable canceling my Claude subscription just yet. Haven't tried it on large feature implementations.

ithkuil 7 days ago||

I also wonder if CC can be used with k2.5 with the appropriate API adapter

tjuene 7 days ago||

yes, just use the base url https://api.moonshot.ai/anthropic

(https://platform.moonshot.ai/docs/guide/agent-support#config...)
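A rough sketch of what that looks like in practice, assuming Claude Code picks up its endpoint and token from the `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` environment variables and that the `claude` binary is on your PATH. The helper names and the `MOONSHOT_API_KEY` variable are made up for illustration; the linked docs are the authority on the actual setup.

```python
import os
import subprocess

# Base URL quoted in the comment above.
MOONSHOT_ANTHROPIC_BASE_URL = "https://api.moonshot.ai/anthropic"


def build_claude_env(api_key, base_env=None):
    """Return a copy of the environment pointing Claude Code at Moonshot.

    Assumption: Claude Code honours ANTHROPIC_BASE_URL and
    ANTHROPIC_AUTH_TOKEN when they are set in its environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = MOONSHOT_ANTHROPIC_BASE_URL
    env["ANTHROPIC_AUTH_TOKEN"] = api_key
    return env


def launch_claude(api_key):
    """Start the `claude` CLI with the overridden environment."""
    return subprocess.call(["claude"], env=build_claude_env(api_key))


# Example usage (MOONSHOT_API_KEY is a hypothetical variable name):
#   launch_claude(os.environ["MOONSHOT_API_KEY"])
```

Passing a copied environment to the child process keeps the override scoped to that one session instead of changing your shell profile.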

miroljub 7 days ago||

I've been quite satisfied lately with MiniMax M-2.1 in opencode.

How does Kimi 2.5 compare to it in real world scenarios?

viraptor 7 days ago|

A lot better in my experience. M2.1 to me feels between haiku and sonnet. K2.5 feels close to opus. That's based on my testing of removing some code and getting it to reimplement based on tests. Also the design/spec writing feels great. You can still test k2.5 for free in OpenCode today.

miroljub 7 days ago||

Well, MiniMax was the equivalent of Sonnet in my testing. If Kimi approaches Opus, that would be great.

samtheprogram 7 days ago||

Kimi K2.5 approaches Sonnet as well from what I can tell, it's just slower to get to the result.

throwaway12345t 6 days ago||

Is there a reasonable place to run the unquantized version of this for less than Claude or OpenAI?

It seems to be priced the same, and if it’s hosted somewhere rather than run locally, it’s still a worse model; the only advantage would be that it is not Anthropic or OpenAI.

oxqbldpxo 7 days ago||

This Kimi K2 is so far the best. Gemini is also great, but google is stock in the academic bias of Stanford and MIT and can't think outside the box. China is definitely ahead on AI. I wish someone here in the US would think differently.

dfsegoat 7 days ago|

> but google is stock in the academic bias of Stanford and MIT and can't think outside the box

Can you clarify what you mean? I am not sure I follow.

JSR_FDED 7 days ago||

s/stock/stuck/

niyikiza 6 days ago||

The Agent Swarm section is fascinating. I'm working on authorization for multi-agent systems so this is relevant to my interests. Lots of interesting parallels to capability-based security models.

cmrdporcupine 7 days ago||

DeepSeek is likely to release a new model soon, and judging from the past it's likely to be more cost effective and just as or more powerful than Kimi 2.5.

DeepSeek 3.2 was already quite compelling. I expect its successor will be competitive.

tallesborges92 7 days ago||

I’ve added API key support for Kimi to my agentic coding tool: https://github.com/tallesborges/zdx

sreekanth850 7 days ago|

Claude gives a 100% pass mark for code generated by Kimi, and sometimes it says it's better than what Claude proposed. Absolutely the best open-source model.