Posted by r00k 5 days ago

Formatting a 25M-line codebase overnight (stripe.dev)
212 points | 107 comments | page 2
ryanisnan 5 days ago|
Cool story. The treat at the end was fun as well, thank you!
hokkos 5 days ago||
Now it makes me wonder: are those 45M LoC untyped?
c3ab8ff137 5 days ago||
No, Stripe has its own Ruby typechecker - https://sorbet.org/
m12k 5 days ago||
https://brandur.org/nanoglyphs/015-ruby-typing#ruby-typing
andrewstuart 5 days ago||
[flagged]
mbStavola 5 days ago||
Considering that it's been doing so successfully at volume for just over 15 years, I think their language choice was fine.
sixo 5 days ago|||
This ought to change your mind about Ruby!
skinfaxi 5 days ago|||
Why is that terrifying?
mikedelago 5 days ago|||
Some folks don't like shipping
fantasizr 5 days ago||||
I've yet to see a compelling elitist programming language opinion, especially about languages used at big successful companies; these companies don't function in spite of their technology choices.
NetOpWibby 5 days ago|||
The only one that worked on me wasn't even elitist in its framing.

Try TypeScript! It makes your JavaScript better!

That was enough for me.

lstodd 5 days ago|||
> these companies don't function in spite of their technology choices.

Shows you've never worked at "big successful companies".

Jtsummers 5 days ago|||
It's not particularly terrifying. Some people really just don't like Ruby.
sikozu 5 days ago|||
The systems have to be written in some kind of programming language, and I think Ruby is a perfectly fine choice.
Imustaskforhelp 5 days ago||
Not denying that Ruby is a perfectly fine choice, but the article itself says that Stripe runs the world's largest Ruby codebase, so it might well be testing the constraints of the language.

What interests me is that I don't suppose Stripe always had this many LOC. I'd be curious to know whether, as the codebase grew, they looked at newer languages such as Golang or Rust that might have been better suited to their work, and what their thinking was in deciding to continue with Ruby.

clintonb 5 days ago|||
LOC doesn’t have much to do with the “constraints of the language”.

Stripe has dabbled in Golang. There is also a growing Java monorepo.

throwaway041207 5 days ago|||
Stripe uses Sorbet which, in my experience, increases LOC.
sunrunner 5 days ago|||
Things can always be worse. It could be PHP, for example.
burnte 5 days ago||
Facebook runs on it, so I think the language itself is probably a fine choice.
Twirrim 5 days ago||
It's almost like other factors than language choice are more important :)
msla 5 days ago|||
If you think that's terrifying, imagine all of the essential code written in COBOL and FORTRAN.

Skippy the Intern, now retired these thirty years...

semiquaver 5 days ago|||
I’d hardly call Sorbet Ruby :)
CrzyLngPwd 5 days ago||
Surely, it no longer needs to be human-readable, and the era of write-only code is finally upon us with the dawn of AI writing our meal tickets.

Why bother formatting 25m lines of slop, and why is AI wasting tokens on making code look human-readable anyway?

sgc 5 days ago|
Every LLM I have ever asked about this says they perform better when they receive pretty-printed code because it is easier to see structure and priorities. It has been an almost universal recommendation for me, and it makes sense since LLMs are just mimicking human expression.
CrzyLngPwd 2 days ago|||
That doesn't make sense.

A compiler doesn't need pretty code to compile; in my tests, when I ask an LLM to deobfuscate code, it doesn't skip a beat.

throawayonthe 4 days ago|||
you asked the llm? i'm confused

you do understand it can't "know" how it performs right?

sgc 4 days ago||
You actually think that LLMs are not fed docs on how they work in order to help users interact with them better? Asking an LLM how to use it is based on the reasonable presumption that the company making it will prioritize making it useful for users and work on programming it with its own best practices.

Again, it makes perfect sense as well based on how they are trained in the first place. Look at how they tokenize whitespace and you will see why it's useful. Each number of repeating white spaces gets a unique token (so 2 whitespaces = token1, 3 whitespaces = token2) - so it actually does make a very clear reinforcing hierarchy readily available. And we all know if there is anything an LLM needs, it is reinforcement of important points.

throwatdem12311 5 days ago||
What is even the point of formatting code anymore?
voidUpdate 4 days ago||
To make it look nice and readable
throwatdem12311 4 days ago||
For an agent?
voidUpdate 4 days ago||
For you, the person reading and writing the code
throwatdem12311 4 days ago||
People are still reading the code?
Twirrim 1 day ago|||
Why are you not reading the code?
voidUpdate 3 days ago|||
I do... I guess people not reading their own code is why products are so buggy and crap these days
throawayonthe 4 days ago|||
clean diffs for one
cadamsdotcom 5 days ago|
An insight about code is that, compared to the scale at which we operate on data, code as text is tiny. Instantaneous git operations and “run this tool over all the code” are the norm, even while we wait for LLMs to stream their tokens back so tool calls can operate on them.

That insight might seem obvious - but if you stay cognizant of it as you work, you can invent some pretty amazing tooling for yourself & your team.