
Posted by impact_sy 2 days ago

DeepSeek v4(api-docs.deepseek.com)
https://api-docs.deepseek.com/

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

2041 points | 1555 comments | page 2
latentframe 2 hours ago|
The 1.6T number is nice and eye-catching, but what matters most is how few parameters are active in practice; that’s where most of the efficiency comes from.
maxloh 1 day ago||
They published model weights on Hugging Face. Both of them are MIT-licensed.

DeepSeek-V4-Flash: https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash

DeepSeek-V4-Pro: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

primaprashant 2 days ago||
While SWE-bench Verified is not a perfect benchmark for coding, AFAIK, this is the first open-weights model that has crossed the threshold of 80% score on this by scoring 80.6%.

Back in Nov 2025, Opus 4.5 (80.9%) was the first proprietary model to do so.

stared 2 days ago|
SWE-bench Verified is, at this point, contaminated https://openai.com/index/why-we-no-longer-evaluate-swe-bench...

So it is hard to tell how much of a model's gain is due to skill and how much to overfitting.

seanobannon 2 days ago||
Weights available here: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
BoorishBears 2 days ago|
https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash-Base https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro-Base

And we got new base models, wonderful, truly wonderful

yanis_t 2 days ago||
Already on OpenRouter. The Pro version is $1.74/M input and $3.48/M output, while Flash is $0.14/M input and $0.28/M output.
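To put those per-million-token prices in perspective, here is a minimal sketch of a per-request cost calculation. The prices are the ones quoted in the comment above; the dictionary keys and the example workload (200k input tokens, 20k output tokens) are assumptions made purely for illustration.

```python
# Prices in USD per million tokens, as quoted in the comment above.
# The dictionary keys are illustrative labels, not official model IDs.
PRICES = {
    "deepseek-v4-pro": {"input": 1.74, "output": 3.48},
    "deepseek-v4-flash": {"input": 0.14, "output": 0.28},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted per-million-token rates."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 20k output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 200_000, 20_000):.4f}")
```

At these quoted rates Pro costs roughly 12x what Flash does for the same token counts, since both the input and output prices differ by about the same factor.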
esafak 2 days ago||
https://openrouter.ai/deepseek/deepseek-v4-pro

https://openrouter.ai/deepseek/deepseek-v4-flash

77ko 2 days ago||
It's on OR, but currently not available on their Anthropic endpoint. OR, if you read this, pls enable it there! I am using kimi-2.6 with Claude Code, which works well, but DeepSeek V4 gives an error:

With `https://openrouter.ai/api/messages` and `model=deepseek/deepseek-v4-pro`, OR returns an error because their Anthropic-compat translator doesn't cover V4 yet. The Claude CLI dutifully surfaces that error as "model...does not exist".

nl 2 days ago|||
The Pro model is giving 429 Overload errors
XCSme 1 day ago||
Yup, can't really be used in production atm.
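For the 429 overload errors mentioned above, one common client-side mitigation is retrying with exponential backoff and jitter. The sketch below is generic and not tied to any provider's SDK: `OverloadedError`, `call_with_backoff`, and the `send` callable are hypothetical names introduced here, and you would raise `OverloadedError` yourself when your HTTP client sees a 429.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from a provider."""


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on OverloadedError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return send()
        except OverloadedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay doubles each attempt; jitter spreads out simultaneous retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

This only smooths over short overload spikes. If the provider stays overloaded, the final error is re-raised, so it does not make an unavailable model usable in production.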
astrod 2 days ago||
Getting 'Api Error' here :( Every other model is working fine.
poglet 2 days ago||
Try interacting with it through the website, it will give an error and some explanation on the issue. I had to relax my guardrail settings.
vinhnx 2 days ago||
The king is back! I remember vividly being very amazed and having a deep appreciation reading DeepSeek's reasoning on Chat.DeepSeek.com, even before the DeepSeek moment in January later that year. I can't quite remember the date, but it's the most profound moment I have ever had. After OpenAI O1, no other model has “reasoning” capability yet. And DeepSeek opens the full trace for us. Seeing DeepSeek's “wait, aha…” moments is something hard to describe. I learned strategy and reasoning skills for myself also. I am always rooting for them.
buenolot 2 days ago|
Instead of King DeepSeek we got DeepShit Clown
mchusma 2 days ago||
For comparison on openrouter DeepSeek v4 Flash is slightly cheaper than Gemma 4 31b, more expensive than Gemma 4 26b, but it does support prompt caching, which means for some applications it will be the cheapest. Excited to see how it compares with Gemma 4.
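The caching argument can be made concrete with a small sketch: the effective input price is a blend of the normal price and the cached price, weighted by the share of input tokens served from cache. The $0.14/M base price is the Flash input price quoted elsewhere in this thread; the cached price and hit rate in the example are assumptions, since the thread does not state them.

```python
def effective_input_price(base_price: float, cached_price: float, hit_rate: float) -> float:
    """Blend normal and cached input prices by the fraction of tokens served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cached_price + (1.0 - hit_rate) * base_price


if __name__ == "__main__":
    base = 0.14     # USD per million input tokens, as quoted in this thread
    cached = 0.014  # hypothetical cached-token price, assumed for illustration
    hit = 0.8       # hypothetical share of input tokens that hit the cache
    print(f"effective input price: ${effective_input_price(base, cached, hit):.4f}/M")
```

With a long shared prefix (system prompt, repository context) the hit rate can be high, which is how a model with a higher list price can still come out cheapest for some applications.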
MillionOClock 2 days ago|
I wonder why there aren't more open-weights models with support for prompt caching on OpenRouter.
mzl 2 days ago||
It is tricky to build good infrastructure for prompt caching.
jatora 1 day ago||
It's as simple as telling your Claude Code to implement prompt caching!
sidcool 2 days ago||
Truly open source coming from China. This is heartwarming. I know of the potential ulterior motives.
b65e8bee43c2ed0 2 days ago||
American companies want a scan of your asshole for the privilege of paying to access their models, and unapologetically admit to storing, analyzing, training on, and freely giving your data to any authorities if requested. Chinese ulteriority is hypothetical, American is blatant.
elefanten 2 days ago|||
It’s not remotely hypothetical; you’d have to be living under a rock to believe that. And the fusion with a one-party state government that doesn’t tolerate huge swathes of thoughtspace being freely discussed is completely streamlined, not mediated by any guardrails or accountability.

This “no harm to me” meme about a foreign totalitarian government (with plenty of incentive to run influence ops on foreigners) hoovering your data is just so mind-bogglingly naive.

ben_w 2 days ago|||
As a non-American, everything you wrote other than "one party" applies to the current US regime.

Relatively speaking, DeepSeek is less untrustworthy than Grok.

When I try ChatGPT on current events from the White House it interprets them as strange hypotheticals rather than news, which is probably more a problem with DC than with GPT, but whatever.

eleventen 21 hours ago||
> When I try ChatGPT on current events from the White House it interprets them as strange hypotheticals rather than news

Any specific examples?

ben_w 14 hours ago||
The Greenland incident.
oceanplexian 2 days ago||||
> And the fusion with a one-party state government that doesn’t tolerate huge swathes of thoughtspace being freely discussed

That would be a great argument if the American models weren’t so heavily censored.

The Chinese model might dodge a question if I ask it about 1-2 specific Chinese cultural issues but then it also doesn’t moralize me at every turn because I asked it to use a piece of security software.

donbreo 2 days ago||
Just ask it to "name the states in india" or "what happened in 1989"
randomNumber7 2 days ago||||
The USA has one of the highest percentages of their population in prison.

Even for minor stuff like being addicted to drugs.

Looks pretty totalitarian to me.

bdamm 2 days ago|||
And in China the state can harvest your organs for political crimes or even just being the wrong religion.

Not quite the same.

GordonS 2 days ago||
I think you're going to need to provide sources for such an outrageous and unbelievable claim.
rhubarbtree 2 days ago|||
I was curious as this is something commonly mentioned in all sorts of western media.

Quick google top link

https://en.wikipedia.org/wiki/Forced_organ_harvesting_from_F...

MiiMe19 1 day ago|||
[flagged]
GordonS 1 day ago||
For asking questions, on Hacker News?

I think not.

thesmtsolver2 2 days ago||||
Do you really trust China’s stats on prison population?

Note: you can have this conversation criticizing the US on a US website. Try criticizing Xi or the CCP or calling him Pooh on a Chinese website.

You think China doesn’t imprison drug users?

China recently executed a low level drug trafficker

https://www.lemonde.fr/en/international/article/2026/04/05/c...

China is one of the top executioners. China executes more than the rest of the world combined

https://www.amnesty.org/en/latest/news/2017/04/china-must-co...

You think China is honest about political prisoners in Tibet and Xinjiang?

Criticize the US all you want but I can’t understand the whitewashing of a real totalitarian and genocidal state like mainland China.

randomNumber7 2 days ago|||
Both can be totalitarian. Both are shit imho. I just don't buy the argument that China is worse because of it.

But if we start nitpicking the US also executes people all over the world without trial and has secret prisons worldwide where they put people (guess what) without trial.

chronc6393 2 days ago|||
mic drop
FuckButtons 2 days ago|||
I’ll be sure to pick up my copy of the peoples daily to read about those statistics in the morning.
theshackleford 2 days ago||||
> This “no harm to me” meme about a foreign totalitarian government (with plenty of incentive to run influence ops on foreigners) hoovering your data is just so mind-bogglingly naive.

This is why I’ve been urging everyone I know to move away from American based services and providers. It’s slow but honest work.

b65e8bee43c2ed0 2 days ago||||
>This “no harm to me” meme about a foreign totalitarian government (with plenty of incentive to run influence ops on foreigners) hoovering your data is just so mind-bogglingly naive.

yes, this is exactly what I'm saying.

danny_codes 2 days ago||||
It’s an open model? So you can run it yourself if you want to
casey2 2 days ago||||
Thousands of years with no invasions, hundreds of years with thousands of invasions.

China is a nation built for peace, while western nations are built for war.

Paradigma11 1 day ago|||
The Dzungar would like to have a word with you, oh wait.
niek_pas 2 days ago||||
Hong Kong? Taiwan? Uyghurs? Tiananmen Square? Tibet?
varrakesh 2 days ago||
China hasn't done anything with Taiwan other than saber-rattling. Hong Kong, Xinjiang, etc. are all part of China.

The US is (mostly) protective of its citizens but (depending on administration) varyingly hostile to outsiders (immigrants, starting wars, etc.).

China is suppressive towards its own citizens, but has been largely peaceful with other countries and immigrants/visitors. (Granted, China has way fewer immigrants than the US, so this is not comparable).

resonancel 1 day ago|||
I believe China only got this huge because all its neighbours couldn't help joining the peaceful middle realm /s
michaelt 2 days ago||||
The oppression of people in China like Uyghurs and Hong Kong, the complete lack of free speech, the saber-rattling at neighbours, and the lack of respect for intellectual property are indeed all well documented.

But for folks on the opposite side of the world, the threats are more like "they're selling us electric cars and solar panels too cheaply" and the hypothetical "these super cheap CCTV cameras could be used for remote spying"

t0lo 2 days ago|||
And you're saying Americans aren't banned from criticising their elites?
resonancel 1 day ago|||
Come back when Americans are routinely jailed for rubbing their elites the wrong way (in some countries, criticisms aren't the only way to rub the leaders the wrong way)
rhubarbtree 2 days ago||||
Donald trump is a terrible president and looks like Winnie the Pooh. Keir Starmer is useless and a liar.

Feel free to go post similar on Chinese social media about their leaders.

jatora 1 day ago|||
This. America is an oligarchy. The political system is a joke facade with a revolving door to corporations. Your vote is meaningless; you don't actually have a choice. Media brainwashes the swaths.... but thought crime still isn't a reality here.
littlestymaar 1 day ago||||
This would have worked a few years back, but now you can be detained at the US border for posting what you just did so it's a terrible example to pick.

By the way, even with the current administration, there's no question about which is the more authoritarian with their own citizens between China and the US. But if you aren't American, then the US government is much more of a threat than the Chinese.

China cannot make the life of an official in Europe miserable for investigating their atrocities towards the Uighurs; meanwhile, ICC judges are now forcibly unbanked and cannot work with American software because they investigated a US ally's atrocities in Gaza.

rhubarbtree 1 day ago||
> China cannot make the life of an official in Europe miserable for investigating their atrocities towards the Uighurs

Sure. China and America are the same. Go try the social media experiment.

littlestymaar 1 day ago||
I literally wrote the opposite, but ok…
nibman 1 day ago|||
[dead]
tommica 2 days ago||||
Pretty sure you guys have strong laws about free speech, and criticizing elites is part of that. Though there are some groups that do not really want the 1st Amendment to be a thing.
ben_w 2 days ago||
> Though there are some groups that do not really want the 1st amendment to be a thing.

The executive branch?

tommica 2 days ago||
That would be a naïve perspective.
mjamesaustin 2 days ago||
Foreigners are literally being denied entry into the country due to opposing viewpoints expressed on social media. People have to disable FaceID on their phones prior to going through customs in case an agent decides to investigate whether their political views are in opposition to the current administration.
xienze 2 days ago|||
> And you're saying Americans aren't banned from criticising their elites?

Half the country would be locked up right now if they weren’t allowed to criticize Trump. Have you even paid attention to how much he’s shitted on, on a daily basis?

mwigdahl 1 day ago||||
I, personally, have never been asked for an asshole scan, but I'm interested in providing one if you can point me to a company that's offering.
simplesocieties 1 day ago|||
It's clear the OC was using hyperbole but we're honestly not too far off. Just a few examples:

- Sam Altman & Worldcoin collecting everyone's eyeball scan
- Discord attempting to roll out worldwide age & ID verification
- LinkedIn collecting data on your web browser extensions
- WhatsApp collecting browser data via a local server running on device

MiiMe19 1 day ago||
And the Chinese are somehow better here? You need id to play video games for more than an hour.
93po 1 day ago|||
GoatseAI - the type of open that OpenAI should have been from the start
surgical_fire 1 day ago||
Have my upvote and go away.
thesmtsolver2 2 days ago|||
As someone with Tibetan friends and as someone from India, Chinese ulterior motives are way more clear.
mordae 2 days ago||
Same as USA. Happy to see some competition.
Quothling 2 days ago|||
It's a little sad that tech now comes down to geopolitics, but if you're not in the USA then what is the difference? I'm Danish, would I rather give my data to China or to a country which recently threatened the kingdom I live in with military invasion? Ideally I'd give them to Mistral, but in reality we're probably going to continue building multi-model tools to make sure we share our data with everyone equally.
jatora 1 day ago||
Lol EU pats you on the head

It's sad to see how you have regulated yourselves into a position where Mistral is your only claim.

spaceman_2020 2 days ago|||
I don’t care about whatever “ulterior motives” they might have

My country’s per capita income is $2500 a year. We can’t pay perpetual rent to OAI/Anthropic

djyde 2 days ago||
Same
try-working 2 days ago|||
if you want to understand why labs open source their models: http://try.works/why-chinese-ai-labs-went-open-and-will-rema...
wraptile 2 days ago||
> Internet comments say that open sourcing is a national strategy, a loss maker subsidized by the government. On the contrary, it is a commercial strategy and the best strategy available in this industry.

This sounds a whole lot like potayto, potahto. I think the former argument is very much the correct one: China can undercut everyone and win, even at a loss. It happened with solar panels, steel, EVs, seafood. It's a well-tested strategy and it works really well despite the many flavors it comes in.

That being said, a job well done for the wrong reasons is still a job well done, so we should very much welcome these contributions, and maybe it's good to upset western big tech a bit so it remains competitive.

try-working 2 days ago||
It is not only that Chinese labs can undercut on price. It is that they must. They must give away their models for free by open sourcing them, and they must even give away free inference services for people to try them. That is the point of the post.
FuckButtons 2 days ago||
There is no ‘must’ here. They did not ‘have’ to undercut every other strategically and technologically important industry the rest of the world has, but they did, as a point of national policy.
vessenes 2 days ago|||
‘Have to’ and ‘every other’ are both doing so much work here that I think your worldview on this is likely just incorrect.

The decisions to mobilize a large rural base toward manufacturing and the central bank goals to keep the yuan cheap as a critical support of this project were absolutely national.

They were ultimately about bringing (or trying to bring) one of the most populous nations in the world out of extreme poverty; in particular the people of the country out of extreme poverty.

There are different policies in place today, and, crucially, bleeding edge tech is not gainful labor employment —- BYD has some factories with roughly 2 employees per acre of robotic production, for instance. Or datacenters where the revenue could scale but the labor will not.

So, these are different times, different goals, different political and labor outcomes. Reasoning about what China “must do”, or has as a matter of “national policy” should start with a clear look at history and circumstance, or you’re likely to read things incorrectly.

try-working 2 days ago||||
No. Read what I wrote. I have spent a decade in the Chinese tech industry.
Danox 2 days ago|||
American industry has been on a downward spiral since the early 1960s….
FuckButtons 2 days ago||
I’m not claiming it hasn’t been, but if you would look around, it’s not just the USA this has impacted.
I_am_tiberius 2 days ago|||
Open weight!
alecco 2 days ago|||
Please don't slander the most open AI company in the world. Even more open than some non-profit labs from universities. DeepSeek is famous for publishing everything. They might take a bit to publish source code but it's almost always there. And their papers are extremely pro-social to help the broader open AI community. This is why they struggle getting funded because investors hate openness. And in China they struggle against the political and hiring power of the big tech companies.

Just this week they published a serious foundational library for LLMs https://github.com/deepseek-ai/TileKernels

Others worth mentioning:

https://github.com/deepseek-ai/DeepGEMM a competitive foundational library

https://github.com/deepseek-ai/Engram

https://github.com/deepseek-ai/DeepSeek-V3

https://github.com/deepseek-ai/DeepSeek-R1

https://github.com/deepseek-ai/DeepSeek-OCR-2

They have 33 repos and counting: https://github.com/orgs/deepseek-ai/repositories?type=all

And DeepSeek often has very cool new approaches to AI copied by the rest. Many others copied their tech. And some of those have 10x or 100x the GPU training budget and that's their moat to stay competitive.

The models from Chinese Big Tech and some of the small ones are open weights only. (and allegedly benchmaxxed) (see https://xcancel.com/N8Programs/status/2044408755790508113). Not the same.

patshead 2 days ago|||
DeepSeek's models are indeed open weight. Why do you feel that pointing this out would be considered slander?
culi 2 days ago|||
I think they were reading GP's comment as a correction. Like "not open-source, just open weight". I'm not sure if their reading was accurate but I enjoyed their high effort comment nonetheless
alecco 1 day ago||
X is full of "open weights!" corrections as a dog whistle by the anti-China crowd. And they are right about models from the Chinese Big Tech, but completely wrong about DeepSeek.
alecco 1 day ago|||
>> Truly open source coming from China.

> Open weight!

They clearly were implying it's not open source.

patshead 1 day ago||
Correct. We have open-weight models from OpenAI, Facebook, Mistral, DeepSeek, Z.ai, MiniMax, and all sorts of other companies. Most of them have fantastic and open licensing terms.

If we can't build the weights, then we don't have the source. I'm not entirely sure what an open-source model would even look like, but I am confident that these binary blobs that we are loading into llama.cpp and vllm aren't the equivalent of source code. We have absolutely no idea what sort of data went into them.

This is fine. It isn't slanderous. It is what we have, and it is awesome. Just because it is awesome doesn't make it open source.

kortilla 2 days ago|||
It’s not slander to say something true. These are open weights, not open source. They don’t provide the training data or the methodology required to reproduce these weights.

So you can’t see what facts are pruned out, what biases were applied, etc. Even more importantly, you can’t make a slightly improved version.

This model is as open source as a windows XP installation ISO.

alecco 2 days ago||
> These are open weights, not open source.

Did you even read my comment?

jatora 1 day ago||
I did. Show me the source code.
alecco 1 day ago||
> DeepSeek is famous for publishing everything. They might take a bit to publish source code but it's almost always there.

they-might-take-a-bit-to-publish

0-_-0 2 days ago|||
Weights are the source, training data is the compiler
crazylogger 2 days ago|||
Training data == source code, training algorithm == compiler, model weights == compiled binary.
0-_-0 2 days ago||
Training algorithm is the programmer, weights are the code that you run in an interpreter
ngruhn 2 days ago|||
isn't it more like the data is the source, the training process is the compiler, and the weights are the binary output.
zerr 2 days ago|||
Do they also open-source censoring filter rules? Like, you can't ask what happened at Tiananmen Square in 1989.
harladsinsteden 2 days ago|||
> I know of the potential ulterior motives.

And you think the US tech giants don't have any ulterior motives?!

FuckButtons 2 days ago||
I think their motives are pretty transparent, as are China’s. As ever, you have to pick the lesser of two evils.
neonstatic 1 day ago||
How are the "ulterior motives" of Chinese companies any worse than "ulterior motives" of US companies or European ones?
yanis_t 1 day ago||
Assuming it is almost as good as Opus 4.6 (which benchmarks seem to give evidence for), and assuming we have a good enough harness (PI, OpenCode), it is now more than 5x cheaper.

I just want to remind you that this is happening at the same time as Anthropic A/B tests removal of Code from Pro Plan, and as OpenAI releases gpt-5.5 2x more expensive than gpt-5.4...

stingraycharles 1 day ago|
> Assuming it is almost as good as Opus 4.6 (which benchmarks seem to give evidence for)

That’s a big if. It’s my experience that models that perform very well on benchmarks do not necessarily perform well in real life.

I’ve mostly started ignoring the benchmarks and run my own evals.

ting0 1 day ago|||
> It’s my experience that models that perform very well on benchmarks do not necessarily perform well in real life

Well, yeah... Like Opus 4.5, 4.6, 4.7. Top of the benchmarks and yet it's a pile of crap at the moment and has been for months.

jatora 1 day ago|||
If benchmarks are all to be believed then gemini 3.1 and grok 4.2 are still in the lead pack. A laughable notion to anyone who has actually tried to use them and compared.
LZ_Khan 1 day ago|
It's easy to praise Deepseek for its results and generosity -- how they can keep up with frontier labs on Huawei chips for a fraction of the cost! -- but let's not forget a big part of their toolkit is heavy distillation of SoTA.
copypaper 1 day ago||
Let's also not forget SoTA models stole from us.
gordonhart 1 day ago|||
True, and they're being tried in a federal court of law for it. NYT v. OpenAI is still very much alive, these things just take a while. Can the same be said about DeepSeek or any other open-source model provider performing distillation?
copypaper 1 day ago|||
Pandora's box has already been opened and there is no going back. I doubt OpenAI, et al will get anything but a slap on the wrist in court because punishing AI companies would have a negative effect on the US economy.

>Can the same be said about DeepSeek or any other open-source model provider performing distillation?

Open source models that distill from SoTA reminds me of the story of Robin Hood -- robbing the rich and giving it to the poor. So to answer your question: yes, but it's better than the alternative where only a select few companies have SoTA models.

gordonhart 1 day ago||
Robin Hood, famous for spinning his acts into a $220M ARR SaaS business (as of mid 2025 [0], likely >$1B by now) and using charity as a marketing mechanism.

[0] https://sqmagazine.co.uk/deepseek-ai-statistics/

copypaper 1 day ago||
touché hahah. Are there any SoTA open-source models that don't have corporate interest?
riskd 1 day ago||||
You already know what the results of this “trial” will be. Let’s not pretend.
paweladamczuk 1 day ago|||
>these things just take a while

Oh, so people might be forced to give back the AI earnings? Should I be worried about the last year's capital gains on my portfolio?

vatsachak 1 day ago||||
Literally.

Altman and Amodei are so mad about muhh model when they steal our data and pollute the Internet with slop.

93po 1 day ago|||
let's not forget that calling copyright infringement theft is hyperbole, and the claim that AI is even infringing is also dubious at best, and that the concept of intellectual property at all is also ethically dubious
MiSeRyDeee 1 day ago|||
So they distill the SoTA models that OAI/Anthropic illegally stole from the public, and then open the weights to us or sell their API at 1/50th of the price? I'd say keep up the good work and distill more!
hamdingers 1 day ago|||
I could not possibly care less if I tried. Every LLM is a distillation of something else.
seydor 13 hours ago|||
All AI software is built on open source. They are just giving back what they should
orbital-decay 1 day ago|||
What's the evidence?
slopinthebag 1 day ago||
Who cares? Also Anthropic does the same thing - if you ask it who it is in Chinese it says it's DeepSeek LOL

https://x.com/teortaxesTex/status/2026130112685416881
