Posted by voctor 5 hours ago
This is the biggest takeaway for me for AI. It's not even that nobody wants to do these things, its that by the time you finish your tasks, you have no time to do these things, because your manage / scrum master / powers that be want you to work on the next task.
Note aside, OpenJS executive director mentioned it's ok to use AI assistance on Node.js contributions:
I checked with legal and the foundation is fine with the DCO on AI-assisted contributions. We’ll work on getting this documented.
[1]: https://github.com/nodejs/node/pull/61478#issuecomment-40772...It is great to have a legal perspective on compliance of LLM generated code with DCO terms, and I feel safer knowing that at least it doesn't expose Node.js to legal risk. However it doesn't address the well known unresolved ethical concerns over the sourcing of the code produced by LLM tooling.
It's not an AI issue. Node.js itself is lots of legacy code and many projects depend on that code. When Deno and Bun were in early development, AI wasn't involved.
Yes, you can speed up the development a bit but it will never reach the quality of newer runtimes.
It's like comparing C to C++. Those languages are from different eras (relatively to each other).
Is it slop if it is carefully calculated? I tire of hearing people use slop to mean anything AI, even when it is carefully reviewed.
While the large code changes were maintained, they were often split up into a set of semantically meaningful commits for purposes of review and maintenance.
With AI blowing up the line counts on PRs, it's a skill set that more developers need to mature. It's good for their own review to take the mass changes, ask themselves how would they want to systematically review it in parts, then split the PR up into meaningful commits: e.g. interfaces, docs, subsets of changed implementations, etc.
Like, why on earth would I spent hours reviewing your PR that you/Claude took 5 minutes to write? I couldn't care less if it improves (best case scenario) my open source codebase, I simply don't enjoy the imbalance.
Well, the process you’re describing is mature and intentionally slows things down. The LLM push has almost the opposite philosophy. Everyone talks about going faster and no one believes it is about higher quality.
If there is some bug that slips by review, having the PR broken down semantically allows quicker analysis and recovery later for one case. Even if you have AI reviewing new Node.js releases for if you want to take in the new version - the commit log will be more analyzable by the AI with semantic commits.
Treating the code as throwaway is valid in a few small contexts, but that is not the case for PRs going into maintained projects like Node.js.
The fact is, it's useful as a tool, but you still should review what's going on/in. That isn't always easy though, and I get that. I've been working on a TS/JS driver for MS-SQL so I can use some features not in other libraries, mostly bridging a Rust driver (first Tiberious, then mssql-client), the clean abstraction made the switch pretty quick... a fairly thorough test suite for Deno/Node/Bun kapt the sanity in check. Rust C-style library with FFI access in TS/JS server environment.
My hardest part, is actually having to setup a Windows Server to test the passswordless auth path (basically a connection string with integrated windows auth). I've got about 80 hours of real time into this project so far. And I'll probably be doing 2 followups.. one with be a generic ODBC adapter with a similar set of interfaces. And a final third adapter that will privide the same methods, but using the native SQLite underneath but smothing over the differences.
I'm leveraging using/dispose (async) instead of explicit close/rollback patterns, similar to .Net as well as Dapper-like methods for "Typed" results, though no actual type validation... I'd considered trying to adapt Zod to check at least the first record or all records, and may still add the option.
All said though, I wouldn't have been able to do so much with so relatively little time without the use of AI. You don't have to sacrifice quality to gain efficiency with AI, but you do need to take the time to do it.
> Everyone talks about going faster and no one believes it is about higher quality.
Go Fast And Break Things was considered a virtue in the JavaScript community long before LLMs became widely available.On a more serious note, I think that this will be thoroughly reviewed before it gets merged and Node has an entire security team that overviews these.
Oh I'd use an llm to generate large amounts of feedback and request changes!
If submitter picks (a) they assert that they wrote the code themselves and have right to submit it under project's license. If (b) the code was taken from another place with clear license terms compatible with the project's license. If (c) contribution was written by someone else who asserted (a) or (b) and is submitted without changes.
Since LLM generated output is based on public code, but lacks attribution and the license of the original it is not possible to pick (b). (a) and (c) cannot be picked based on the submitter disclaimer in the PR body.
(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or
If there isn't, then (b) works fine, the code is taken from the LLM with no preexisting license. And it would be very strange if a mix of (a) and (b) is a problem; almost any (b) code will need some (a) code to adapt it.
Whether AI output can fall under copyright at all is still up for debate - with some early rulings indicating that the fact that you prompted the AI does not automatically grant you authorship.
Even if it does, it hasn't been settled yet what the impact of your AI having been trained on copyrighted material is on its output. You can make a not-completely-unreasonable argument that AI inference output is a derivative work of AI training input.
Fact is, the matter isn't settled yet, which means any open-source project should assume the worst possible outcome - which in practice means a massive AI-generated PR like this should be treated like a nuke which could go off at any moment.
1. Copyright cannot be assigned to an AI agent.
2. Copyrighted works require human creativity to be applied in order to be copyrighted.
For point 2 this would apply to times were AI one shots a generic prompt. But for these large PRs where multiple prompts are used and a human has decided what the design should be and how the API should look you get the human creativity required for copyright.
In regards to being a derivative work I think it would be hard to argue that an LLM is copying or modifying an existing original work. Even if it came up with an exact duplicate of a piece of code it would be hard to prove that it was a copy and not an independent recreation from scratch.
>the worst possible outcome
The worst possible outcome is they get sued and Anthropic defends them from the copyright infringement claim due to Anthopic's indemnity clause when using Claude Code.
Also the commercial version is limited to “…Customer and its personnel, successors, and assigns…”. I am very much not a lawyer and couldn’t find definitions of these in the agreement but I am not sure how transferable this indemnity would be to an open source project.
Just my opinion, probably not a popular one. But I will be avoiding an upgrade to Node.js after 24.14 for a while if this is becoming an acceptable precedent.
I like the idea of it mocking the file system for tests, but I feel like that should probably be part of the test suite, not Node.
The example towards the end that stores data in a sqlite provider and then saves it as a JSON file is mind-boggling to me. Especially for a system that's supposed to be about not saving to the disk. Perhaps it's just a bad example, but I'm really trying to figure out how this isn't just adding complexity.
node -e "new Function('console.log(\"hi\")')()"
or more to the point node -e "fetch('https://unpkg.com/cowsay/build/cowsay.umd.js').then((r) => r.text()).then(c => new Function(c + 'console.log(exports.say({ text: \"like this\"}))')())"
that one is particularly bad, because umd messes with the global object - so this works node -e "fetch('https://unpkg.com/cowsay/build/cowsay.umd.js').then((r) => r.text()).then(c => new Function(c)()).then(() => console.log(exports.say({ text: 'oh no'})))"I had to laugh, because the post you're replying to STRONGLY reminds me of this story, https://news.ycombinator.com/item?id=31778490 , in which some people on the GNOME project objected to thumbnails in the file-open dialog box because it might be a "Security issue" (even though thumbnails were available in the normal file browser, something those commenters probably should have known about, but didn't, but they just had to chime in anyway).
Look, most of us realized around 2004 or so that if you had a choice between Norton and the virus you would pick the virus. In the Windows world we standardized around Defender because there is some bound on how much Defender degrades the performance of your machine which was not the case with competitive antivirus software.
I've done a few projects which involved getting container file formats like ZIP and PDF (e.g. you know it's a graph of resources in which some of those resources are containers that contain more resources, right?) and now that I think of it you ought to be able to virus scan ZIP files quickly and intelligently but the whole problem with the antivirus industry is that nobody ever considers the cost.
Oh, wait...
See https://pnpm.io/motivation
Also, while popularity isn't necessarily a great indicator of quality, a quick comparison shows that the community has decided on pnpm:
Firstly - with yarn pnp zero-installs, you don't have to run an `install` every time you switch branch, just in case a dep changed. So much dev time is wasted due to this.
Secondly - "it worked on my machine" is eliminated. CI and deploy use the exact same files - this is particularly important for deeply nested range satisfied dependencies.
Thirdly - packages committed to the repo allows for meaningful retrospectives and automated security reviews. When working in ops, packages changing is hell.
All of this is facilitated by the zip files that the comment you replied to was discussing, that you tangented away from.
The graph you have linked is fundamentally odd. Firstly - there is no good explanation of what it is actually showing. I've had claude spin on it and it reckons its npm download counts. This leads to it being a completely flawed graph! Yarn berry is typically installed either via corepack or bootstrapped via package.json and the system yarn binary. Yarn even saves itself into your repo. pnpm is never (I believe) bundled with the system node, wheras yarn and npm typically are.
Your graph doesn't show what you claim it does.
Combined with a hackable IDE like Atom (Pulsar) made with the same tech it’s a pretty great dev exp for web devs
https://web.archive.org/web/20161003115800/https://blog.mozi...
There's Docker, OverlayFS, FUSE, ZFS or Btrfs snapshots?
Do you not trust your OS to do this correctly, or do you think you can do better?
A lot of this stuff existed 5, 10, 15 years ago...
Somehow there's been a trend for every effing program to grow and absorb the features and responsibilities of every other program.
Actually, I have a brilliant idea, what if we used nodejs, and added html display capabilities, and browser features? After all Cursor has already proven you can vibecode a browser, why not just do it?
I'm just tired at this point
You can’t import or require() a module
that only exists in memory.
You can convert it into a data url and import that, can't you?I do see some original benefits to a VFS though, bad application decisions aside, but they are exceedingly minor.
As an aside I think JavaScript would benefit from an in-memory database. This would be more of language enhancement than a Node.js enhancement. Imagine the extended application capabilities of an object/array store native to the language that takes queries using JS logic to return one or more objects/records. No SQL language and no third party databases for stuff that you don't want to keep in offline storage on a disk.
> I think JavaScript would benefit from an in-memory database.
That database would probably look a lot like a JSON object. What are you suggesting, that a global JSON object does not solve?The more structures you have in a given application and the larger those structures become in their schemas the more valuable a uniform storage and retrieval solution becomes.
isn't that just global state, or do you mean you want that to be persistent?