Posted by meander_water 11/2/2025
I set it up with qwen3 30b thinking. Very easy to get it up and running, but a lot of the default numbers are shockingly low; you need to boost iterations and context. Especially easy if you already run searxng locally.
I haven't finished tuning the actual settings, but for the detailed report it takes ~20 minutes and so far has given pretty good results, similar to OpenAI's Deep Research. Mine often pulls in ~100 sources.
But something I have noticed: the model didn't seem to be what mattered. The magic was more in the project itself, going deep with higher iterations and more results.
--n-cpu-moe in https://github.com/ggml-org/llama.cpp/blob/master/tools/serv...
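For context: --n-cpu-moe keeps the MoE expert weights of the first N layers on the CPU so the rest of the model fits in VRAM. A sketch of an invocation (the model file, layer count, and context size here are placeholders, not from this thread):

    llama-server -m Qwen3-30B-A3B-Thinking-Q4_K_M.gguf \
        --n-cpu-moe 20 -c 32768 -ngl 99

Raise --n-cpu-moe until the GPU allocation fits; more expert layers on CPU means less VRAM used but slower inference.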
https://openrouter.ai/alibaba/tongyi-deepresearch-30b-a3b
https://openrouter.ai/alibaba/tongyi-deepresearch-30b-a3b:fr...
You are better off asking it to write a script to invoke itself N times across the task list.
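A minimal sketch of that pattern, assuming a hypothetical "agent" CLI and a plain-text task list with one task per line:

    # driver.py - one focused invocation per task instead of asking
    # for everything in a single inference ("agent" is hypothetical).
    import subprocess

    with open("tasks.txt") as f:
        tasks = [line.strip() for line in f if line.strip()]

    for i, task in enumerate(tasks, 1):
        out = subprocess.run(
            ["agent", "--prompt", task],
            capture_output=True, text=True, check=True,
        )
        with open(f"result_{i:03}.txt", "w") as fh:
            fh.write(out.stdout)

Each run gets a full attention budget for a single task rather than splitting it across the whole list.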
LLMs are really bad at being comprehensive in general, and from one inference to the next their comprehensiveness varies wildly. Because LLMs are surprising the hell out of everyone with their abilities, less attention is paid to this; they can do a thing well, and for now that's good enough. As we scale usage, I expect this gap will become more obvious and problematic (unless it's solved in the model, like everything else).
A solution I've been toying with is something like a reasoning step, which could probably be done with mostly classical NLP, that identifies constraints up front and guides the inference to meet them. Like structured output, but at the session level.
I am currently doing what you suggest, though: I have the agent create a script which invokes … itself … until the constraints are met. But that obviously requires that I stay engaged; I think it could be done autonomously, with at least much better consistency (at the end of the day even that guiding hand is inference-based and therefore subject to the same challenges).
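As a sketch of what that autonomous loop might look like, with a naive regex source count standing in for the real constraint check (the "agent" CLI and the numbers are hypothetical):

    # loop.py - re-invoke the model until a session-level constraint
    # is met; here the constraint is a minimum number of sources.
    import re
    import subprocess

    MIN_SOURCES = 100  # constraint identified up front

    def count_sources(report: str) -> int:
        # naive proxy: count distinct cited URLs in the draft
        return len(set(re.findall(r"https?://\S+", report)))

    report = ""
    for attempt in range(10):  # hard cap so it can't loop forever
        prompt = (f"Extend this report to cite at least {MIN_SOURCES} "
                  f"distinct sources.\n\nCurrent draft:\n{report}")
        out = subprocess.run(
            ["agent", "--prompt", prompt],
            capture_output=True, text=True, check=True,
        )
        report = out.stdout
        if count_sources(report) >= MIN_SOURCES:
            break

The check itself is deterministic code rather than another inference, which is the point: the guiding hand stops being subject to the same variance.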
For most plans, Deep Research is capped at around 20 sources, which in many cases makes it the least useful research agent, in particular worse than a thinking-mode GPT-5 query.