Top
Best
New

Posted by amarble 1 day ago

There is minimal downside to switching to open models(www.marble.onl)
377 points | 299 commentspage 4
blindriver 19 hours ago|
As someone that has pretty powerful desktop that I've been using with local open weight models, people are far exaggerating the quality of them. Some of them are now useful. They don't compare yet to the online models of ChatGPT, Claude, Gemini, etc. They are still about 18 months behind. I have accomplished useful work with them, like image classification on Gemma4, but they are much much slower, much much more expensive and they don't scale at all.

A $10,000 RTX 6000 Blackwell card will pay for 500 months of Claude or Codex, which is 40 years worth of compute. Obviously they are going to raise their prices, my prediction being to $200-500/month, but that still makes them at least years of compute and they scale very well with more traffic. Single GPUs do not, they are pegged at 100% and good luck getting it to answer multiple queries at the same time.

causality0 20 hours ago||
I know open models have gotten quite good in many tasks such as coding or composition, but are there any that can access the internet and retrieve data like ChatGPT, Claude, etc can?

I do have to admit I have recently begun wishing I could pay five dollars a month for a "just answer the fucking question" plan that would give me results without the guardrails and without the constant simpering and ego-stroking. I keep finding myself going a quick evaluation of "is it faster for me to skim search results myself or to construct an elaborate narrative to make an AI give me a real answer".

sleepyeldrazi 19 hours ago||
That's why I like qwen3.6 27B, it has 0 ego, it knows that it doesn't have complete world knowledge, so when it sees a web_search tool it searches all the time. Even qwen3.5 9B is mostly search-eager (but given the size, it's weaker on reasoning on the results if that's needed). I use a stock pi harness with only web_search and web_fetch (cleans up the html to only keep text) tools defined.

I have given up on making Opus actually retrieve online information for me. At this point I only query it side by side with qwen to laugh at how it didn't even attempt to search properly, and how a small local model is beating it every time. Gemini is very fast for searching, but somehow miss-sources all the time.

wilj 20 hours ago|||
> I know open models have gotten quite good in many tasks such as coding or composition, but are there any that can access the internet and retrieve data like ChatGPT, Claude, etc can?

The things you describe are just tool calling, they're a feature of whatever harness you use. Use OpenCode, pi.dev, or maki.sh with any of the open models.

> I do have to admit I have recently begun wishing I could pay five dollars a month for a "just answer the fucking question" plan that would give me results without the guardrails and without the constant simpering and ego-stroking. I keep finding myself going a quick evaluation of "is it faster for me to skim search results myself or to construct an elaborate narrative to make an AI give me a real answer".

You can do most of this with some system prompts added to whatever agent you're using. You can do it from the settings on the claude/chatgpt websites too. (minus the no-guardrails thing)

newwttbreak 18 hours ago||
What are good resources and forums where I can figure out these system prompts to bypass guardrails, atleast on agents?
JSR_FDED 20 hours ago|||
Just go to kimi.com and try for yourself (not affiliated, but happy user).

First time I did this I realized in 5 seconds that the big players weren’t going to be carving up the market between them.

linzhangrun 20 hours ago|||
You can let the AI solve it itself, and then it will provide two solutions: implement a local search service (easily blocked), or purchase a Web Search API service
flexagoon 13 hours ago|||
There are tons of existing Skills/MCPs for Google/Kagi/whatever search, and making your own is trivial. I gave DeepSeek in Pi a link to Kagi API docs and asked it to add a web search skill, and it did that easily.
tr_user 18 hours ago||
isn't that just in the harness?
epolanski 5 hours ago||
I unsubscribed from Anthropic and our (EU-based) team is moving to an "ai-server" running opencode + GLM 5.2 and DS4.

There are several benefits:

- we cut AI spending by thousands

- there is one AI server and starting different sessions for each user, one memory/skills/etc and everybody is involved into reviewing what went wrong and why. Harness finally makes sense and pays off more.

- we can trust that the models are those that we run and not black boxes

- no more money flowing to US narcissistic entrepeneurs and no more business being tied to US legislation

Not gonna lie, GPT 5.5 Pro and Fable 5 were a tiny bit ahead, especially on longer vibecode-style tasks, but it's just not worth it.

impartshadow 7 hours ago||
[flagged]
cws_ai_buddy 20 hours ago||
[flagged]
fabijanbajo 14 hours ago||
[dead]
Atom_Foundry 13 hours ago||
[flagged]
c_chenfeng 20 hours ago||
[dead]
codelong888 21 hours ago||
[dead]
root_axis 18 hours ago|
Imagine taking 6 months longer to release your cookie cutter CRUD app.