Posted by fittingopposite 7 hours ago
It doesn't change the fact that the most important thing is verification/validation of their output, whether from tools or from a developer reviewing and making decisions. But even if you don't want that approach, diffusion models just seem to be a lot more efficient. I'm interested to see whether they're simply a better match for common developer tasks when paired with validation/verification systems, rather than just writing (likely wrong) code faster.
And in some sense, all of my Claude Code usage feels tok/s-bottlenecked. There's never really a time when I'm glad to wait for the tokens; I'd always prefer faster.
Hell, want to do syntax highlighting? Just throw buffer text into an ultra-fast LLM.
It's easy to overlook how many small day-to-day heuristic schemes can be replaced with AI. It's almost embarrassing to think about all the totally mundane uses to which we can put fast, modest intelligence.
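To make the syntax-highlighting idea concrete, here's a minimal sketch. The endpoint, prompt wording, and JSON span format are all assumptions (there's no real API here); the model call is stubbed out so the sketch runs standalone:

```python
import json

# Hypothetical sketch: lean on a very fast LLM for syntax highlighting.
# `call_llm` stands in for whatever low-latency completion endpoint you'd
# use; its name and the response shape are assumptions, not a real API.

HIGHLIGHT_PROMPT = (
    'Return a JSON list of {"start": int, "end": int, "class": str} '
    "spans classifying tokens in this buffer as keyword/string/comment/ident:\n\n"
)

def parse_spans(llm_reply: str) -> list:
    """Validate the model's JSON reply and drop malformed spans."""
    spans = json.loads(llm_reply)
    return [
        s for s in spans
        if isinstance(s, dict)
        and isinstance(s.get("start"), int)
        and isinstance(s.get("end"), int)
        and s["start"] < s["end"]
        and isinstance(s.get("class"), str)
    ]

def highlight(buffer_text: str, call_llm) -> list:
    """One round trip: prompt the fast model, then sanity-check its answer."""
    reply = call_llm(HIGHLIGHT_PROMPT + buffer_text)
    return parse_spans(reply)

# Stubbed model call so the sketch runs without a network:
fake_llm = lambda prompt: '[{"start": 0, "end": 3, "class": "keyword"}]'
print(highlight("def f(): pass", fake_llm))
# → [{'start': 0, 'end': 3, 'class': 'keyword'}]
```

The validation step matters more than usual here: a model fast enough to re-highlight on every keystroke will occasionally emit garbage, and the editor just needs to drop it and keep the previous spans.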
> no reasoning comparison
Benchmarks against reasoning models:
https://www.inceptionlabs.ai/blog/introducing-mercury-2
> no demo
https://chat.inceptionlabs.ai/
> no info on numbers of parameters for the model
This is a closed model. Do other providers publish the number of parameters for their models?
> testimonials that don't actually read like something used in production
Fair point.
Mercury v1 focused on autocomplete and next-edit prediction. Mercury 2 extends that into reasoning and agent-style workflows, and we have editor integrations available (docs linked from the blog). I’d encourage folks to try the models!
That being said, the chain is pretty basic. It's possible that they don't disclose the full follow-up prompt list.
> We optimize for speed users actually feel: responsiveness in the moments users experience — p95 latency under high concurrency, consistent turn-to-turn behavior, and stable throughput when systems get busy.
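For anyone fuzzy on the p95 jargon in that quote: it's just the 95th-percentile latency over some window of requests. A minimal sketch using the nearest-rank method (the sample timings are made up for illustration):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the value at ceil(p/100 * n) in sorted order."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

# Made-up per-request latencies in ms; one slow outlier dominates the tail.
latencies_ms = [120, 95, 110, 3050, 101, 99, 130, 105, 98, 102,
                97, 115, 108, 100, 103, 96, 125, 94, 107, 99]

print(percentile(latencies_ms, 95))  # → 130
```

Note that the mean here (~228 ms) is wrecked by the single 3050 ms outlier, while p95 stays at 130 ms, which is why tail percentiles are what users "actually feel".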
> The car wash is 100m away from my house. Should I walk or drive?
I didn't find a way to share the conversation, so here's the full response with the `medium` setting:
---
Walking 100 m is a very short distance—roughly a minute or two at a normal pace. In most situations, walking will be quicker overall because you avoid the time needed to get into the car, start it, and find parking. It also saves fuel and is better for the environment.
*Consider these factors before deciding:*
| Factor | Walking | Driving |
|--------|---------|---------|
| *Time* | ~1–2 min (plus a few seconds to step out) | ~1 min driving, plus ~1–2 min to get in/out and park |
| *Energy/Fuel* | None | Uses a small amount of fuel |
| *Weather* | Comfortable if dry & mild | Better if it’s raining, very hot, or icy |
| *Physical ability* | Easy for most people | Not needed for such a short trip |
| *Car availability* | Not required | Must have a car nearby and be willing to use it |
| *Environmental impact* | Zero emissions | Small emissions |
*Recommendation:* If the weather is pleasant and you don’t need the car for anything else right away, walking is the simplest, fastest, and most eco‑friendly choice. Drive only if you’re dealing with inclement weather, have heavy items to carry, or need the car immediately after the wash.
Do you have any specific constraints (e.g., rain, heavy bags, time pressure) that might affect the decision?
Is the goal behind evaluating models this way to incentivize training them to assume we're bad-faith tricksters even when asking benign questions like how best to traverse a particular 100m? I can't imagine why it would be desirable to optimize for that outcome.
(I'm not saying that's your goal personally - I mean the goal behind the test itself, which I'd heard of before this thread. Seems like a bad test.)
> Walking 100 m is generally faster, cheaper, and better for the environment than driving such a short distance. If you have a car that’s already running and you don’t mind a few extra seconds, walking also avoids the hassle of finding parking or worrying about traffic.