Vulnerability reports are not special anymore

Posted by goranmoomin 7 hours ago

Vulnerability reports are not special anymore (words.filippo.io)

211 points | 115 comments

themanmaran 6 hours ago|

I feel like it's also been overrun by a lot of spam. As someone running a company, I get 2-5 unsolicited "vulnerability reports" per week. Half of them are an LLM finding some bad CSS on our framer splash page. The other half I assume are an extortion attempt so we just mark as spam.

Occasionally I see real security researchers on HN complaining that no one takes the disclosure seriously, or that people reply immediately with a cease and desist. But from the receiving end it's just because the spam is unmanageable.

Gigachad 5 hours ago||

I'm getting CVE fatigue with all of these super ultra critical 10/10 vulnerabilities that turn out to be: some node package that compiles my frontend can get stuck if I give it a malicious regex.

It's hard to spot the stuff that actually matters.

jamesfinlayson 3 hours ago|||

Yep. I remember years ago seeing the website for some guy who proudly listed all the CVEs he'd discovered. Clearly he'd written some scanning tool to look at regexes in open source projects and was creating CVEs for anything that might result in exponential time execution or whatever.

gorgoiler 1 hour ago|||

It sounds like an interesting case study. Do these things get reported with a patch?:

(a) add a new function that does regular expressions searching / matching with a resource checker (eg a timer);

(b) write a local linter that reports an error for any use of the builtin regular expression tools;

(c) commit the linter.
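Step (b) above could be sketched roughly like this. It is a minimal illustration, not a real lint rule: the function name `find_builtin_regex_calls` is made up, and it only catches direct `re.<func>(...)` calls, so `from re import search` or an aliased import would slip through.

```python
import ast

# Functions on the stdlib `re` module that build or run a pattern.
FLAGGED = {
    "compile", "search", "match", "fullmatch",
    "findall", "finditer", "sub", "subn", "split",
}


def find_builtin_regex_calls(source: str, filename: str = "<string>"):
    """Return sorted (line, call) pairs for every direct `re.<func>(...)` call."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "re"
            and node.func.attr in FLAGGED
        ):
            findings.append((node.lineno, f"re.{node.func.attr}"))
    return sorted(findings)
```

A CI job could run this over each changed file and fail when the list is non-empty, pointing people at whatever wrapper was added in step (a).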

edelbitter 1 hour ago|||

This stuff has been brewing for years, but since technically you could fix all instances with minimal StackOverflow downtime [1] and a slightly different pattern, few people worked on either using engines with data structures less prone to the worst case or adding the generic workarounds for those that have them.

e.g. in CPython, until 3.11, there was no support for atomic grouping (rough translation: "never backtrack inside of this expression"). There is little useful advice a linter can give, if there is no predictable-runtime way to express what you want within a single match step, because you really do want to unwind the stack and check for repeats (just without any of the exponential runtime stuff, please).

[1]: https://meta.stackoverflow.com/questions/328376/why-does-sta...
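For anyone who hasn't met atomic groups, here is a small sketch of the "never backtrack inside of this expression" behaviour. The function name is made up for illustration, and the `(?>...)` syntax is only compiled when the interpreter is 3.11 or newer.

```python
import re
import sys


def atomic_vs_backtracking(text: str = "aaa"):
    """Compare an ordinary group with an atomic group on the same input.

    Returns (ordinary_group_matched, atomic_group_matched).
    """
    # Ordinary group: `a+` first grabs every "a", then gives one back so the
    # trailing literal "a" can still match.
    ordinary = re.fullmatch(r"(?:a+)a", text)

    atomic = None
    if sys.version_info >= (3, 11):
        # Atomic group: once `a+` has matched, the engine may not go back
        # into it, so the trailing "a" has nothing left to consume.
        atomic = re.fullmatch(r"(?>a+)a", text)

    return ordinary is not None, atomic is not None
```

The same "no giving back" rule is what stops the exponential retry behaviour in nested-quantifier patterns, at the cost of rejecting inputs that only match via backtracking.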

jamesfinlayson 1 hour ago|||

No I think he was just looking to raise his profile, not to help.

tryauuum 1 hour ago|||

That's a real issue, took cloudflare down once...

swiftcoder 21 minutes ago||

It's only a real issue if it is in runtime code that parses untrusted input. 99% of the regex lints/CVEs that get flagged our way are in build-time code.

themanmaran 4 hours ago||||

Seriously. We got 116 github dependabot alerts this week. Half of them for dev dependencies.

thomashabets2 16 minutes ago|||

I got reminded every week that my static site generator "Jekyll" is insecure.

Ok. Hacking me by changing the input to my Jekyll rather involves being on the other side of the airtight hatch.

jamesfinlayson 3 hours ago|||

I tried to raise that with my internal security team recently - don't clutter my vulnerability dashboard with issues in dev dependencies. They somewhat rightly pointed out that malware needs to be dealt with even if it's a dev dependency. So my suggestion went nowhere because I guess we can't filter by type of vulnerability.

Quothling 2 hours ago|||

Working in the EU energy sector where we have to work with NIS2 compliance, I'd argue that your security team rightly pointed it out. I suspect that's what you mean though, and the rightly is just there because you agree with it but don't like it. We work with even tighter dependency policies than just having alerts. We have a set of pre-approved and yearly vetted packages, like pandas or pyarrow for Python data work. Aside from that we have some isolated development environments where your pipeline can get access to something like SQLC for Go. Which is essentially where your dev dependency lives in its own environment where it can produce the code it needs to and then submit it for approval into your regular dev environment.

Ironically we'd probably need to run Dependabot itself in a mirrored environment since it too has external dependencies we'd probably not want to vet.

I do think external dependencies are among our biggest security threats though. It's so hard to vet them, and compliance basically comes down to "We trust the apache software foundation enough, and pyarrow is vital to our business, so we accept the risks", and then you lock versions and aren't the first to update except for vulnerabilities. Shadow AI is obviously the number one security threat right now, especially in enterprise with people who are very tech savvy. This makes dependencies so much worse though, because now everyone can (if their systems aren't locked down tight) do so many crazy things. Both with the "non-sanctioned" AI but also with the code it can generate for them.
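A pre-approved, version-locked package list like the one described above could be enforced with something as small as the sketch below. Everything in it is an assumption for illustration: the function name, the `name==version` line format, and the version numbers are all made up, not anyone's real policy.

```python
def check_requirements(lines, approved):
    """Return human-readable violations for unapproved or unpinned packages.

    `approved` maps a package name to the single vetted version
    (hypothetical data, not a real policy).
    """
    problems = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        name = name.strip().lower()
        version = version.strip()
        if not sep:
            problems.append(f"{name}: not pinned to an exact version")
        elif name not in approved:
            problems.append(f"{name}: not on the approved list")
        elif version != approved[name]:
            problems.append(
                f"{name}: pinned to {version}, approved is {approved[name]}"
            )
    return problems
```

Running it in the pipeline before install means an unvetted package fails the build instead of showing up later as an alert.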

jamesfinlayson 1 hour ago||

Yeah I completely understand their intent, but I might get 30 vulnerabilities across multiple repos flagged in a week. It is already tedious to check them all and assess if they're worth worrying about let alone having to update them. These are 99% JavaScript though - I suspect other ecosystems are much more manageable.

Gigachad 1 hour ago||

It's easier to keep stuff up to date these days. If you have a project with typescript, unit tests, and end to end tests like cypress you can just have dependabot create the PRs to update packages. If everything passes you just have to hit the merge button.

Just updating everything is probably easier than assessing if it's possible to trigger an exploit with the way you use the package.

froddd 1 hour ago|||

This is exactly how developers of malware want you to behave. Update without really thinking about it.

I do wonder how long it will take before an attack is developed by submitting a semi-genuine vulnerability, shortly followed by a ‘fix’ including malicious code.

ThreatSystems 20 minutes ago||||

In agreement with froddd above.

Dependencies and supply chain attacks are probably the greatest risk to a lot of software orgs, as they run them across all their environments: Development (with secrets and other valuable artefacts on developer VMs), CI/CD pipelines which may have access tokens to production (and other) environments, and production itself.

Notably even security companies are being impacted by this [0]. The scale of these attacks has grown quite significantly over the past three years, and they are not exclusive to the JavaScript ecosystem [1] or even just namesquatting/typosquatting [2].

The resolution is broader security awareness, "onion layered" security controls and implementing simple non-burden inducing processes and policies. Sometimes not updating (what was wrong with the previous version of a dependency if there was no immediate vulnerability or production issue caused by it?) or having a two week cool down for updates (which some supply chain tooling natively supports) can appease some security functions through clear communication of the supply chain risk etc.

If anyone has interest in courses aligned to your org on improving developer and broader engineering management awareness on this, e-mails in my profile :).

[0] - https://socket.dev/blog/ongoing-supply-chain-attack-targets-...

[1] - https://orca.security/resources/blog/hades-pypi-supply-chain...

[2] - https://checkmarx.com/zero-post/python-pypi-supply-chain-att...
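The two-week cool-down mentioned above boils down to a date comparison. A minimal sketch, assuming you can get a release timestamp from your registry or tooling; the function name and the 14-day default are illustrative, not any tool's actual setting.

```python
from datetime import datetime, timedelta, timezone

# Assumed policy value; tune to taste.
COOLDOWN = timedelta(days=14)


def ready_to_adopt(released_at, now, cooldown=COOLDOWN):
    """True once a release has been public for at least `cooldown`."""
    return now - released_at >= cooldown


if __name__ == "__main__":
    released = datetime(2026, 1, 10, tzinfo=timezone.utc)
    print(ready_to_adopt(released, datetime.now(timezone.utc)))
```

The idea is simply that a compromised release has time to be noticed and yanked by someone else before it reaches your lockfile.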

Maxion 59 minutes ago|||

Yep this is what has happened to small teams. You really only have time to approve the dependabot changes and go go go. Otherwise you'll never get anything productive done.

The other option is to simply ignore updates and do them on a schedule, e.g. once every 1-2 months.

SkiFire13 31 minutes ago||||

Everyone is talking about malware in dev dependencies as if dependabot only raises issues about that, but it does not. It raises warnings about all sorts of "vulnerabilities" irrespective of the threat model.

Even worse, it incentivizes randomly updating dependencies, which is what actually allows supply chain attacks.

funciton 21 minutes ago||||

Developers' machines and CI/CD systems are high value targets. They were absolutely right to point that out.

WD-42 1 hour ago||||

A lot of the recent npm attacks have been exfiltration from dev machines, which could just as likely come from dev dependencies.

zmgsabst 2 hours ago|||

Dev dependencies are how they compromised SolarWinds and thereby most of the US federal government.

> The attackers used a supply chain attack. The attackers accessed the build system belonging to the software company SolarWinds, possibly via SolarWinds's Microsoft Office 365 account, which had also been compromised at some point. SolarWinds was using build management and continuous integration server TeamCity provided by the Czech company JetBrains. In 2021 The New York Times stated that unknown parties apparently embedded malware in JetBrains' software and through this way compromised also SolarWinds.

https://en.wikipedia.org/wiki/2020_United_States_federal_gov...

I don’t know what kind of software you write, how valuable your company’s infrastructure is, etc. But supply chain and insider threat in security/infrastructure is a big topic — that I’m sure they’re concerned about because that’s their area of responsibility.

Even if I’m personally sympathetic to not wanting to deal with the churn of dev dependency updates.

tempay 2 hours ago|||

This is very real, but such CVEs are such a tiny fraction in relation to denial-of-service-due-to-regex that it’s hard to take the system seriously.

So far as I’m concerned the solution is to isolate everything as much as possible. I’d love to see something on the CVE classification side to also address the signal to noise problem but I don’t see it happening.

Maxion 58 minutes ago|||

These DoS Regex 10/10 CVEs in some minor helper function in some package that is used once in some random side code pathway are so damn annoying.

If I could filter out DoS CVEs, I would.

jamesfinlayson 1 hour ago|||

Pretty much - I don't know too much about the CVE process but if ReDoS stuff was flagged at the CVE level as "exploitable only with unconstrained inputs" then great - I know my tests have sane inputs, so I'll close thanks.
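The kind of filter being wished for here might look like the sketch below. The alert dictionaries and their `scope` / `cwes` keys are a made-up shape for illustration, not any scanner's real export format; CWE-1333 (inefficient regular expression complexity) and CWE-400 (uncontrolled resource consumption) are the real identifiers usually attached to ReDoS findings.

```python
# CWE ids treated as "denial of service only" for this sketch.
DOS_CWES = {"CWE-1333", "CWE-400"}


def needs_attention(alert):
    """Keep an alert unless it is a DoS-only finding in a dev-only dependency.

    `alert` is a hypothetical dict with "scope" ("development" or "runtime")
    and "cwes" (list of CWE id strings).
    """
    cwes = set(alert.get("cwes", []))
    dos_only = bool(cwes) and cwes <= DOS_CWES
    dev_only = alert.get("scope") == "development"
    return not (dos_only and dev_only)


def triage(alerts):
    """Split alerts into (keep, hide) lists."""
    keep = [a for a in alerts if needs_attention(a)]
    hide = [a for a in alerts if not needs_attention(a)]
    return keep, hide
```

Note that anything without a DoS-only classification, including malware in a dev dependency, still lands in the keep list, which is the point the security team was making.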

technion 54 minutes ago|||

Vulnerable dependencies are very different to compromised or backdoored dependencies though. No one's taking over SolarWinds because their build tools had a ReDoS involving input from their own config files.

WD-42 1 hour ago||||

Time to start banning those that submit fake or superficial reports. Maybe with enough bans these people will start actually reading their own vulnerabilities.

nikanj 1 hour ago||||

CVE 10 if you use your current version of Python to serve files over ftp, and parse the incoming files using the most obscure file type found in the forbidden libraries of the Vatican

And your ISO etc certificates make this CVE mandatory priority 1 action point

edelbitter 1 hour ago||

I think this one has more to do with excessive dependencies, and lack of splitting into individually installable packages and/or static linking.

I have already avoided having to evaluate whether I am affected by some issue because I added patches at startup that crash before certain unused-yet-installed modules are to be loaded. Also, for those Python packages that still have a pure version that defers to stdlib and a separate muh-performance binary option with statically linked dependencies, I can generally just install the former and skip the version bumps for dependencies. The performance advantage may be negligible or negative outside of benchmarking 100k calls.. of code actually called 11 times a day, on a non-critical path.
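One way such a startup patch could work in Python is a meta path finder that refuses to import named modules. This is a sketch of the general technique, not the setup described above; the class and function names are made up, and modules already present in `sys.modules` are not affected.

```python
import importlib.abc
import sys


class BlockedModuleFinder(importlib.abc.MetaPathFinder):
    """Meta path finder that refuses to import a fixed set of modules."""

    def __init__(self, blocked):
        self.blocked = frozenset(blocked)

    def find_spec(self, fullname, path, target=None):
        # Block the named module itself and any of its submodules.
        for name in self.blocked:
            if fullname == name or fullname.startswith(name + "."):
                raise ImportError(f"import of {fullname!r} is blocked by policy")
        return None  # not blocked: defer to the normal finders


def block_modules(names):
    """Install the blocker ahead of the standard finders and return it."""
    finder = BlockedModuleFinder(names)
    sys.meta_path.insert(0, finder)
    return finder
```

Calling `block_modules({"ftplib"})` at the top of the entry point makes any later `import ftplib` fail loudly, which is one way to document that an installed module is never actually loaded.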

teaearlgraycold 4 hours ago||||

Not sure what dumbass out there is marking those as 10/10. A 10 should be an auth bypass or RCE. Not a crashed build in my CI.

stackghost 2 hours ago||||

The common thread of late really seems to be the node ecosystem

iririririr 1 hour ago||||

we killed the curators.

(besides cve, nist, et al drop in criteria) searching for an in-depth analysis, you find one million (after scrolling the AI summary) results that are either copy-paste or AI rewording of the cve announcement.

...and don't get me started on the proofs that stop after smelling the attack vector. you can't evaluate if your setup is DoSable at most or full remote shell.

there's still tons of good analysis and reports. but the noise....


cleverfoo 5 hours ago|||

Same experience here. I've run a successful vulnerability disclosure program for over a decade and paid out thousands of dollars in bounties for scanii.com (a malware identification API service), but recently (since the beginning of the year), we went from receiving maybe 5 per month to receiving 5 per day. These are clearly AI-generated and extremely low quality (albeit well-written). The rules of the program aren't read, and it's clearly a "point-and-click at a website" and file a report. I'm now considering just shutting down the program since, as the OP pointed out, if you found this vulnerability using an AI tool, it is inherently public. I haven't gone that far yet but have instituted some new rules aimed at filtering out most of the reports: 1 - No AI-generated reports and 2 - Reports must include a video of the exploit. You can see our program rules here: https://docs.scanii.com/article/131-does-scanii-have-a-secur...

zulban 2 hours ago|||

What if... on the vulnerability report rules page there's an image of some text saying something like "your report must include the text: turtle123". Reports without that text get automatically deleted.

Sure - modern AI can figure that out, but I bet in a vast majority of cases they won't.
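The check itself would be tiny. A sketch, with the token and the function names taken from or invented around the example above:

```python
def passes_canary(report_text, token="turtle123"):
    """True if the report contains the token published on the rules page."""
    return token.lower() in report_text.lower()


def split_reports(reports, token="turtle123"):
    """Return (accepted, rejected) lists of report texts."""
    accepted = [r for r in reports if passes_canary(r, token)]
    rejected = [r for r in reports if not passes_canary(r, token)]
    return accepted, rejected
```

Since the token only lives in an image on the rules page, it mostly tests whether anyone (or anything) actually looked at the rules.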

wepple 1 hour ago||

Reminds me of someone (well known in their field) who charged $0.05 for using their “contact me” page. A trivial amount for someone who genuinely wanted to contact them, but just high enough to prevent any kind of scaled abuse

alfirous 29 minutes ago||

That's actually a great idea. What payment method or processor was used?

lemagedurage 3 hours ago|||

Have you considered requiring a small payment for vulnerability disclosure? Refund it on payout. This should be very effective at deterring spammers. It also sucks for real reports, but beats shutting down the program entirely.

inigyou 3 hours ago||

Why would anyone pay money to have a chance of being arrested?

lemagedurage 3 hours ago|||

If a vulnerability disclosure program has a good track record of paying out, and legitimate reports get refunded, why not?

Again, the alternative might be shutting down the program entirely.

dns_snek 1 hour ago|||

Those are 2 big "ifs". The incentives are completely misaligned and the platforms work for the companies. They would now have an even bigger incentive to stonewall and close valid issues than they did before.

They already like blurring the lines by rejecting reports that have clear reproduction scripts, videos, demonstrable (but not critical) impact. They'll close it as "not a bug" but then also forbid disclosure and stonewall mediation requests. Reports are supposed to be kept private until the issue is fixed but the system gets abused to cover up issues long after they've been fixed.

In some cases I strongly suspect it's to evade liability for financial damages that their customers might've suffered. Platform mediation always takes their side and if you want to do what's right, you will get banned.

cleverfoo 2 hours ago|||

It's not a horrible idea... the challenge there would be making that payment/refund flow totally transparent in order to build trust and be fair to the researchers.

ozim 1 hour ago||

Making a payment/refund setup is more complicated than "set and forget".

First question: Do you keep money for shit reports?

Well no, you have to pay it back like credit card validation. There is no pain for posting a shit report, just inconvenience. There is no legal way where you can keep the money.

MarkusQ 2 hours ago||||

Sure, it sounds dumb when you say it like that.

But do you know how many people are doing things that are even dumber right this very minute? I don't know either, but I'm sure it's larger than either of us would like to admit.

fouc 1 hour ago|||

why would anyone accept bounty money to have a chance of being arrested?

gucci-on-fleek 2 hours ago|||

Yeah, I help review security reports for a small FOSS organization, and someone reported a "critical" vulnerability about a publicly-accessible SVN server. Like yes, that is indeed the purpose of hosting open source software. But at least that report was obviously bogus; much worse are the ones that look legitimate at first, so you have to read through dozens of AI-generated paragraphs to make sure that there's nothing valid hidden in there.

saaspirant 2 hours ago||

I use AI to read such emails!

swiftcoder 23 minutes ago|||

We also get unsolicited vulnerability reports from companies trying to poach our annual pentest contract, which is... a tad grey ethically-speaking

abrookewood 4 hours ago|||

I believe the term is Beg Bounties and they are constant and annoying.

jacobgold 3 hours ago|||

I hated these low-effort reports, so I created a simple automation that checks my security inbox, mentions me in #security on Slack for things that look legitimate so I see them quickly, and marks things that seem entirely automated as spam.

I still check the spam folder for legitimate emails, but so far there haven't been any false positives.

wolfi1 2 hours ago|||

but why would you answer with a C&D if you are overwhelmed? provided, it's not always the same person?

spoaceman7777 4 hours ago|||

Have you considered having an agent, or just a model, classify/triage them for you? Modern problems require modern solutions.

ActorNightly 2 hours ago||

It's been like that for half a decade across all software. People act like finding a linux kernel bug is a big deal, completely ignoring the fact that in order to exploit that bug, the attacker has to be able to run code on your computer in the first place, which is extremely hard to do these days remotely.

Also people ironically just DGAF that much. The last actual bad exploit was log4shell in java, which given how it was introduced (i.e. someone purposefully at Apache made it so a log statement can execute code, and nobody questioned it before pushing it to prod), should have been the signal for everyone to completely remove all Apache libraries from their services, but yet all the software is still being used.

Tepix 2 hours ago||

These bugs are indeed important, you need them once you've found a bug in an application.

aetherspawn 37 minutes ago||

LLMs find more vulnerabilities than people because people time is heaps more expensive than LLM time, that’s it.

We’ve always been able to find heaps, we’ve just never had the right structures to put in the effort and remunerate people for looking (even if they don’t find anything).

socalgal2 4 hours ago||

I feel like the current situation is temporary. LLMs are finding all the bugs. LLMs are also helping fix most of the bugs. Once most of the bugs are fixed, LLMs should be good at finding bugs before shipping them, the stream of bug reports will die down, and we'll be back to vulnerability reports being special.

Further, the fact that bugs are so easy to find by LLMs means there are strong incentives to find ways to minimize creating bugs in the first place. That could be new or better languages, fewer 3rd party dependencies, more vetted code, better linters, better fuzzers, whatever. The point is that the new reality of bugs being easy to find will, actually must, lead to fewer bugs eventually because the world can't function with easy to find bugs.

bostik 59 minutes ago||

"Temporary" can be an awfully long time. There is ample evidence that discovery rate of bugs (many of which can be bucketed into vulnerabilities) in any non-trivial piece of software is more or less stable.[0] In a recent podcast episode the ex-CISO of Adobe commented that every now and then they'd take a sustained squeeze to find all occurrences of a given type of bug (ie. source of vulnerability) in a codebase. They'd find a good amount of them and fix them.

Then a year or two later they'd repeat the operation and they'd find about the same amount of same types of bugs. In many occasions in code that had been in place in the previous round and had remained essentially untouched.

Paraphrasing what the Grugq has quipped - a large piece of software has infinity bugs. Infinity minus N is still infinity.

0: Discovery rate with regards to the time spent looking for bugs. LLM-powered bug hunting has amped up the speed with which code bases can be investigated.

MattPalmer1086 26 minutes ago||

Ahhh - you are talking about Adobe. I always wondered, given the never ending stream of vulnerabilities in their products, what it was about their development process that produced such appalling code in the first place.

zemblanKing 2 hours ago|||

I feel this sentiment is wishful thinking, but I want to start by saying I hope it turns out to be correct.

I find that often bugs will be created when using an LLM, like others have said. Saying that this can then be fixed by identifying all the bugs created by an LLM with an LLM doesn't guarantee another bug is not introduced when the LLM is addressing the initial problem.

Also, what if the LLM has a blind spot? They certainly also could be incapable of finding or fixing a bug. They don't pass any benchmark at 100% right now. Also also, guaranteeing there are no bugs in your code is like saying you have 100% test coverage, all of the tests pass, and they are written perfectly. Saying that you can simply identify and fix the bugs also assumes there is enough time and energy to find all of the bugs that exist within a project and then to address them. Even LLMs use time and energy. In a sufficiently complex system that is certainly wishful thinking.

Considering the size and complexity of a lot of modern software (like web browsers, 3d modelling software, game engines, etc.) software is just too complex to not have bugs even when created and managed by LLMs.

There will continue to be bugs in code and we will simply have to live with the fact that LLMs make it easier to exploit computer systems. I mean consider a hardware bug like Spectre [0]. If bugs like this become easier to find does that mean our existing hardware will just become obsolete more quickly? that type of problem can be addressed, but at quite a high cost.

Not sure what all of this means for the future.

0. https://en.wikipedia.org/wiki/Spectre_%28security_vulnerabil...

socalgal2 1 hour ago||

If LLMs can trivially find bugs, then they can trivially find bugs. If they can't find any bugs that doesn't mean there are no bugs but it suggests that others can't easily find them either. So the LLMs find all the bugs problem is fixed by asking the LLMs to find them before you ship.

Read what I wrote, I didn't say your program will be bug free. I said, if the LLM can trivially find the bug it will. If it can't then we're at worst, back to the state of before LLMs could find bugs, but likely much better since we fixed so many of them

So, the fact that LLMs can trivially find bugs is enough to get the bugs fixed.

You, and several others, seemed to think I was saying LLMs would fix all the bugs. I never said that. I said they'd help. Finding them is help. Writing a possible fix is often help. Writing a possible fix and seeing if they can detect a bug after the fix is applied is also help. Automating the entire thing and letting LLMs fix them without review is likely not help.

zemblanKing 1 hour ago||

Ok, I get a better idea of what you're saying from this reply than your original comment. It wasn't helpful to me that you suggested I reread your original comment.

I agree that LLMs make finding bugs take far less time and energy. I also agree that this should mean in the long run there are less trivial to find bugs IF everyone adopts the usage of LLMs while writing and reviewing code.

It does also seem possible that LLMs are better at finding bugs than fixing them.

mackenney 3 hours ago|||

That supposes that LLMs can write secure software. Also, if we assume that finding bugs is easier than not creating them (reasonable I would say), the supply of bugs will never be exhausted.

jeremyjh 2 hours ago|||

How can it be easier to find them than to not create them? Whatever you do to find them, you could do before you release.

xboxnolifes 2 hours ago||||

What's the difference between finding bugs and not making them? Just run the bug finding during CI/CD.

socalgal2 3 hours ago||||

It does not suppose that LLMs can write secure software

zulban 2 hours ago|||

> That supposes that LLMs can write secure software.

I think we're at the point that the best LLMs can indeed write software that's far more secure than your average programmer. Partly because the average is so terrible.

iririririr 56 minutes ago|||

that's definitely NOT what the article says.

Are you making a counterpoint that the reports are so good and must all be addressed, but the problem is "llm finding all the bugs" so fast us poor slow humans cannot keep up?

because if so, i suggest you write a new article.

fajmccain 4 hours ago||

Lol you think LLMs are generating bug free code?

socalgal2 3 hours ago||

I never said that. I said they are good at helping fix them. Go read the bug reports on firefox, or Safari, or Chrome. Most of them have a fix. It might be wrong but it usually points in the right direction, which is a 1000x more than nearly all human bug reports. So, the LLM helps. which is all I stated.

cadamsdotcom 5 hours ago||

Security through obscurity was never a great strategy.. and now it’s not a strategy at all..

Hopefully at the end of this decade, a ton of software practices have been overhauled to eliminate classes of problems. Memory-safe language use is a great start - but it’d be great to see innovation in checking for TOCTOU problems, improper/missing authn & authz, and many others.

This is an engineering problem. It won’t be solved by models that “only do dumb shit 1/10th as often, only 0.01% of the time now not 0.1%!” It won’t be solved by adding more models to do even more double-checking before and after the work. It won’t be solved by hoping humans catch it in review. It isn’t solvable by adding outer loops of any sort - though we may get close. To truly solve this will take serious CS research.

whimblepop 4 hours ago||

Almost never do software companies even attempt to design secure systems. I'm not sure this requires new fundamental research so much as slightly giving a shit.

inigyou 3 hours ago||

There is a reason Mythos only found one bug in curl and it wasn't very bad.

user3939382 5 hours ago||

Verifying correctness of an implementation is P NP, not serious CS research.

adrianN 5 hours ago|||

Most verification is undecidable, lots of it is pspace complete. That doesn’t mean very much in practice since those are worst case bounds. People regularly solve problems that are undecidable for all practical instances that they care about.

bawolff 5 hours ago|||

Verifying behaviour of an arbitrary program is uncomputable. However that doesn't mean you can't have proofs of behaviour of specific programs you create.

Personally I have some doubts, a lot of research has gone into the idea without much to show for it, but it's a very reasonable research area.

codebje 2 hours ago|||

There's lots of things to show for the research!

Part of what the research shows is that correctness-by-proof has a cost in developer effort.

If there really is a vulnerability-apocalypse due to AI, and it's not just a different flavour of AI hype, the cost of having insecure software will rise to the point that the cost of dealing with insecure or incorrect code at time of creation becomes less than the cost of ignoring it until it blows up.

I doubt it'll rise so much that we'll want to face the cost of behaviour proofs for much code at all, but it's quite possible it'll rise enough that we want to do things like prove that indices are in bounds, at compile time, so vector accesses can skip checks without compromising safety.
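As a rough illustration of what "prove that indices are in bounds, at compile time" can look like, here is a small Lean 4 sketch. The name `getChecked` is made up; the point is only that the caller has to hand over a proof `h` before the access type-checks.

```lean
-- The caller must supply a proof `h` that the index is in range, so an
-- out-of-bounds access is rejected by the type checker, not at runtime.
def getChecked (xs : Array Nat) (i : Nat) (h : i < xs.size) : Nat :=
  xs[i]'h

-- For a literal array the proof obligation can be discharged automatically.
#eval getChecked #[10, 20, 30] 1 (by decide)
```

The developer-effort cost shows up as soon as `i` is computed rather than a literal: then someone has to actually construct `h`.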

crote 3 hours ago|||

I fear it'll just move the problem one layer up. Sure, you've now proven that the code matches the specification - but how do you ensure the specification is watertight?

jopsen 1 hour ago||

The specification doesn't have to be.

But yeah, writing specs is usually harder than reviewing the code 4 times :)

walnut_water 14 minutes ago||

It kinda does.

See WPA2 KRACK, you could've had a formally verified WPA2 implementation and it still would've been exploitable because the flaw was the specification itself.

rakel_rakel 1 hour ago||

I read every piece like this one as: Money is moving in the vulnerability space now, whereas before the LLM hype incentivized that, your best bet was that someone skilled enough would accept living with the financial insecurity of being a gig worker to hopefully stumble upon your project's bug bounty program. Is the bet here that the hype lasts, and that people will willingly keep on paying Dario to be able to contribute?

> But give it 1-3 months and the open models will catch up.

I wish that this would stop being thrown around, what is this timeline based on? How good is your open model from between March and May?

Also, having read "Gödel, Escher, Bach" I know that the hare never catches up with the turtle.

david_shaw 5 hours ago||

At risk of quoting too much of the article, it opens with this:

> A requirement for staying sane while working in public as an open source maintainer is realizing that every issue, PR, and piece of feedback is a present, not an obligation. You can accept it, ignore it, and use it partially or not at all.

> Except…

> For years, as lead of the Go Security team at the time, I’ve told new team members that it doesn’t apply to vulnerability reports. No, vulnerability reports are special. Security researchers are doing us a favor by reporting things confidentially instead of doing full disclosure, so we owe them something, which is not true of regular issues opened on the issue tracker.

[...]

> It’s 2026 and none of the premises are true anymore.

I respectfully disagree.

The premise is absolutely still true: if someone discovers a critical, exploitable vulnerability in your software, the impact and tradeoffs are exactly the same as they were before LLMs started finding bugs. There are just more of them now, so they're easier to come by.

But that won't last forever, either. As LLMs find increasingly difficult-to-find vulnerabilities, there will be fewer of them to report. This is just chugging through the backlog.

All of that said, I don't think finding vulnerabilities has really been the difficult security problem for most companies (or open source projects). The difficult problem is dedicating resources to fixing those vulnerabilities instead of building software, products, and/or infrastructure that people want. That problem is absolutely still here today, but I'm optimistic that agentic security developers will be able to take the burden off of development teams in the near future.

For tokens, of course.

CJefferson 4 hours ago||

The problem is there used to be a fairly high correlation between ‘security report’ and ‘real vulnerability’. Not perfect but good enough. Now the two are almost entirely disconnected.

appplication 4 hours ago|||

> But that won't last forever, either. As LLMs find increasingly difficult-to-find vulnerabilities, there will be fewer of them to report. This is just chugging through the backlog.

I think your logic is partly correct but the fact that the same LLMs are allowing an exponential increase in insecure code generated is a counterbalancing point. I do not think this phenomenon will slow down.

sneak 3 hours ago||

Nah, those same LLMs, if prompted correctly, will be able to do an audit pass and a fix pass on that LLM-generated code. It’s a tooling issue that will get fixed in time.

shakna 3 hours ago|||

> But that won't last forever, either. As LLMs find increasingly difficult-to-find vulnerabilities, there will be fewer of them to report.

That is not my experience at all. People will continue to high-volume spam intended behaviour as if it is a bug.

There will be fewer reports that matter as you fix things - but the volume of reports will either stay steady or go up. Making it harder to even notice the ones that matter.

jcgrillo 3 hours ago||

The problem always existed, but nobody amassed a sufficiently large army of trolls to exploit it until now. So it wasn't a priority to solve it before, but now it is. We're going to have to learn to differentiate reports that matter from those that don't. Classifying reports might actually be something you could productively use an LLM for..

shakna 2 hours ago||

When we solved this problem for email... We just dumped everything similar to untrusted into the rubbish bin. Important things can vanish too, and that's an acceptable price.

cpuguy83 4 hours ago||

It's not (just) more of them, it's the same ones reported by multiple people.

I think the point is those issues are now easily discoverable and are nearly public because of it.

bawolff 5 hours ago||

There are some problems with incentives in the vuln report space. People report trivial vulns and expect the same treatment as people reporting critical vulns. But this isn't new with AI. Look at all the ReDoS vulns in npm ecosystem. It's questionable if it's a vuln in general but half of them aren't even triggerable.

dirkc 29 minutes ago||

These two bits stand out to me:

> The security researchers are not special, the insight and confidentiality are

> The bottleneck now is not finding potential issues but assessing which ones are real. Unless there’s already a trust relationship, external researchers can’t meaningfully contribute

My take-away from this is that the researchers were special all along and you should probably be building a trust relationship with them.

Despite what I want to believe about tech being a meritocracy, the reality is that trust plays an extremely important role and without it we risk a collapse of our open source software ecosystem.

One of my biggest criticisms of AI is the trust vacuum within which it operates

maxignol 30 minutes ago||

In the end, sorting PRs and vulnerabilities has been the same for open source maintainers. How about adding a credibility score to every GitHub account? Couldn’t that cut sorting times?

fastball 4 hours ago|

They weren't special even before LLMs. Drive-by script-kiddies would run some basic scripts against your platform and send generally-not-actually-a-vulnerability reports, claiming that these were big problems, and requesting to be paid bug bounties.
