DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost

Posted by Alifatisk 7 hours ago

DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost(esengine.github.io)

240 points | 135 commentspage 2

storus 2 hours ago|

Can it instruct DeepSeek during an LLM call to start removing old tool calls from the context instead of waiting for the LLM call to finish if the context size approaches DeepSeek's dumb zone? Claude Code can't do that, /compact can only happen after the LLM call; it's often preferable to start cleaning up context during an LLM call, especially when tool calls are huge like reading markdown files; implementation-wise all that is needed is to start removing earliest <tool call start> ... <tool call end> and replacing them just with some log entry stating this tool call was already performed, then re-running KV cache prefill (so the "online" compaction would get 0.5s latency hit every time it's performed). That way one can read 1000 files in one LLM call.

danborn26 3 hours ago||

High caching rates for coding agents can drastically reduce latency and API costs. I am curious to see how the caching strategy handles context invalidation across multiple files.

xcjsam 3 hours ago|

[flagged]

imagetic 3 hours ago||

https://shittycodingagent.ai

peheje 54 minutes ago||

having issues with truncated output from deepseek v4 pro through openrouter via pi-harness on ptyxis-terminal using ubuntu

trying reasonix with direct api..

peheje 53 minutes ago||

first impression: the tui flickers a lot, unpleasent

mi_lk 2 hours ago|||

Not sure about the story but it would be funny if pi folks actually own this domain.

chuckadams 2 hours ago||

They do. That's Pi's old name.

chabes 3 hours ago||

Aka pi.dev

mmaunder 4 hours ago||

Unusable thanks to the top animation pushing the rest of the site down repeatedly as you’re trying to read.

nextaccountic 2 hours ago||

> Tool-call repair

> Tool arguments the model produces occasionally have JSON typos, unclosed quotes, or shape mismatches. Reasonix runs a schema-aware repair pass before dispatch so malformed args still execute.

So Deepseek API doesn't have a structured output option where you give a grammar and the model promises the output will follow this grammar?

Or it does, but it's buggy?

singiamtel 4 hours ago||

I would've liked benchmarks against other harnesses showing the caching performance

Alifatisk 3 hours ago|

Is there benchmarks and measurements that offers comparisons between different harnesses?

mmarcant 1 hour ago||

"byte-stable prefix cache" -- give us your codebase in a way that's even EASIER for us to train on.

hebetude 4 hours ago||

Wow the UI looks exactly what I vibe coded yesterday. What a coincidence

huqedato 3 hours ago|

It's obvious why...

hirako2000 5 hours ago||

Good timing given the cost spike across other frontier models.

notjes 5 hours ago|

Good thing DS just made their discount permanent. https://x.com/deepseek_ai/status/2057854261699195173

theanonymousone 5 hours ago|

Isn't caching a server-side thing? How does the agent affect it, significantly at least?

embedding-shape 5 hours ago|

Say you put the current time down to the second in the system prompt, which is the message that goes in front of the entire conversation, then basically nothing will be cached, every agent turn needs to ingest the entire session over and over. Contrast to not doing that, and the backend can leverage caching all the way up to the latest message, as nothing until then changed.

esperent 5 hours ago|||

Surely other agent CLIs are not dumb enough to invalidate cache on every turn over something so obvious?

chillfox 4 hours ago|||

I don't think any the agents breaks caching on every turn, but they might do things like current list of files, or available tools depending upon plan/build mode... or lots of other things that breaks caching multiple times during a session.

brookst 4 hours ago||||

Probably not that exactly, but there is a tradeoff between effectiveness of the prompt and cache hit rate. If putting the user’s datetime in the middle of the prompt scores higher on evals but worsens cache hits, versus at the end of the prompt where it’s cache friendly but may not be as effective, what do you do?

This is still art as much as science and the different harnesses take different approaches.

embedding-shape 4 hours ago|||

Obviously not, most agents properly keep previous messages unchanged, at least the major ones I've been digging into the source off. Also, everything would get so much slower, that even developers creating their own agents would notice quickly how much slower theirs is, if they fuck this up.

theanonymousone 4 hours ago|||

Yes, of course you can destroy it. But how far can you "improve", beyond decent "common sense" behaviour.

More comments...