Formatting code should be unnecessary

Posted by MaxLeiter 5 days ago

Formatting code should be unnecessary (maxleiter.com)

345 points | 469 comments

automatoney 5 days ago|

I've never understood why people care so much about the linter settings. It's so obviously bikeshedding, just make a choice, run the linter automatically and be done with it. I'm too busy doing actual software engineering to care about where exactly everything goes - I promise after a week you'll just get used to whatever format your team lands on.

AdieuToLogic 5 days ago||

> I've never understood why people care so much about the linter settings.

Source code formatting programs are not the same as lint[0] programs. The former rewrites source code files such that the output is conformant with a set of layout rules without altering existing logic. The latter is a category of idempotent source code analysis programs typically used to identify potential implementation errors within otherwise valid constructs.

Some language tools support both formatting and source code analysis, but this is an implementation detail.

0 - https://en.wikipedia.org/wiki/Lint_(software)
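To make the distinction concrete, here is a minimal sketch in Python. Both `format_source` and `lint_source` are hypothetical toy functions invented for illustration, not part of any real tool: the first rewrites layout only, the second only reports a suspicious construct and never touches the file.

```python
import ast
from typing import List


def format_source(source: str) -> str:
    """Toy formatter (hypothetical): strip trailing whitespace and end the
    file with exactly one newline. It rewrites text but not logic.
    A real formatter must also leave multi-line string literals alone."""
    lines = [line.rstrip() for line in source.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"


def lint_source(source: str) -> List[str]:
    """Toy linter (hypothetical): report '== None' / '!= None' comparisons.
    It analyzes the code and returns warnings; the source is not modified."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        for op, comparator in zip(node.ops, node.comparators):
            is_equality = isinstance(op, (ast.Eq, ast.NotEq))
            is_none = isinstance(comparator, ast.Constant) and comparator.value is None
            if is_equality and is_none:
                warnings.append(f"line {node.lineno}: compare to None with 'is'")
    return warnings
```

Under these assumptions, formatting leaves the parsed syntax tree unchanged, and linting gives the same warnings before and after formatting.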

ryandrake 5 days ago|||

Thank you! I am almost going out of my mind reading this ~200 comment thread with everyone just casually saying "linter" when they mean "formatter". Do people really not distinguish between these two very different programs?

Narushia 4 days ago|||

I guess the line might feel fuzzy to some people, since nowadays many tools bundle both linting and formatting. And with modern IDE integrations, you might not even run them explicitly — the editor just does both automatically in one go.

beaugunderson 4 days ago||||

many people use the prettier plugin for eslint, for example, so in that case they're one and the same!

trallnag 5 days ago|||

Lately I've switched to the terms "check" and "fix" because more and more formatters (at least in my bubble of Python and Go) are incorporating fixes. So not just rearranging code and maybe adding a comma here and there.

stavros 5 days ago||||

Right, but it's obvious they meant "formatter".

_mu 5 days ago||

Why build understanding when you could be pedantic?

kristopolous 5 days ago||||

Formatters, if you want to be specific, are even worse.

They slyly add git noise and pollute your audit trails by just going through and moving shit around whenever you save a file.

And sometimes, they actually insert bugs - string formatting errors are my favorite example.

It's for people who think good code is about adhering to aesthetic ideologies instead of making things documented and accountable.

This is most noticeable in open source contributions. Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

You think I accept that?

It's not a good idea

zahlman 5 days ago|||

> just going through and moving shit around whenever you save a file.

This only happens because the file doesn't already adhere to the rules the tool implements. These are normally highly configurable, and once your code complies with a standard, the tool prevents future code from pulling you away from that standard.

> And sometimes, they actually insert bugs - string formatting errors are my favorite example.

Do you have a concrete example?

> Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

Is your existing code formatting at least consistent?

> You think I accept that?

This is a social issue rather than a technical one. You can tell people in your development readme to use specific style rules, or even a project-wide precommit hook. If your own code is formatted with one of these tools, you can even (to my understanding) set up automated checks on GitHub's side.

But of course you are free to reject any PR you want.

kristopolous 4 days ago||

I don't keep examples around to defend my stance on this, sorry.

I left out my largest critique - spacing is semantic both for the compiler and the human.

Often I police the whitespace very thoughtfully, usually in code that also requires long comments for clarity.

I care deeply about maintainability and legibility of code and try to consider future human readers everywhere.

Then the formatter says "haha, fuck that!"

That's my biggest personal gripe with it. It's consistency over clarity, conformity over craft.

This all depends on what kind of code you're writing. Standard backend crud code in python with sqlalchemy? ok, pydantic with linters and formatters. But that should be written by llms these days anyways, if you're still doing it by hand you're doing it wrong.

Honestly the jobs I sign up for demand a kind of care - mostly experimental and frontier work, so I am really frustrated when I'm prevented from exercising my professional judgement and doing what I think is best due to some bureaucratic red tape.

I don't want the Wild West, I want disciplined senior engineers using professional consideration and judgement and not a bunch of onerous restricting tools assuming I don't know what I'm doing or why I'm doing it

The language designers, compiler engineers, they are the actually competent people in the room and if they allowed for the flexibility I'll side with them over some hacked out set of rewriting regexes by some kid vibe coding on GitHub.

bogomog 4 days ago||

+1 to this. Why throw away an extra dimension of expression available by choices in formatting in the name of consistency? I find it helpful to use blank lines to separate code into "paragraphs". A complex conditional may be made more readable by extra space in some places and not others. A series of single line if statements might be clearer formatted lined up vertically, like a table.

thfuran 5 days ago||||

You have it entirely backwards. Enforcing a consistent format is useful precisely because it avoids pointless git noise from different people changing formatting differently as they go.

rcxdude 5 days ago||||

Running random formatters on random subsets of your code is not a good idea. If you want code in a repo to be formatted a certain way, you need to have one set of settings and enforce it, and yeah, reject anything that just has spurious formatting changes that someone else has run.

mvieira38 5 days ago||||

> This is most noticeable in open source contributions. Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

This wouldn't happen nearly as much if you had a defined set of formatting rules plugged into CI instead of chaos

bluGill 5 days ago||

If the rule is not enforced in CI it isn't a rule. I've made that a mantra for a long time now and it helps. For a while I did verify formatting in CI, but eventually we decided that formatting wasn't as important as getting builds done fast. We still run other tests and linters in CI, and if they break we fix the build, but formatting isn't really that important so we don't care. Yes, our formatting is somewhat of a mess, but it isn't really that bad even after a decade of agreeing not to care.

misiek08 5 days ago||||

I’ve seen multiple repos with pre-hook and just CI running formatter on _modified_ code only. Those repos were the cleanest to date.
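A rough sketch of the "format modified code only" idea in Python, using the standard `difflib` module. `format_changed_lines` and `strip_trailing` are hypothetical names, and the line-level "formatter" is a toy stand-in; a real setup would usually hand changed line ranges to an actual formatter instead.

```python
import difflib
from typing import Callable, List


def strip_trailing(line: str) -> str:
    """Toy per-line formatter (hypothetical): remove trailing whitespace."""
    return line.rstrip()


def format_changed_lines(
    old: List[str], new: List[str], format_line: Callable[[str], str]
) -> List[str]:
    """Apply format_line only to lines that were added or replaced
    relative to `old`; untouched lines are returned exactly as they were."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    result = list(new)
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            for j in range(j1, j2):
                result[j] = format_line(new[j])
    return result
```

The design choice here is that pre-existing formatting problems on unchanged lines are deliberately left alone, so the diff stays limited to what the author actually touched.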

jchw 5 days ago||||

Formatters breaking code is not something that happens in all language ecosystems; I think it's mostly a C++ and occasionally JS issue, but gofmt and many other formatters just don't break code. It's also not really that common anyways.

You can solve the Git noise issue by enforcing formatting in CI and keeping formatter configuration in repo. This is what most high quality open source projects will do. The purpose of this is not about "adhering to aesthetic ideologies", it's about not bothering people with the minutiae of yet another pointless set of formatting conventions. Most developers couldn't give a shit less where you think braces should go, or whether you like tabs or spaces, or whatever else, they care about more important things like data structures and writing more correct code. Having auto formatting enables them to effortlessly follow project norms without needing to, for every single repo they work in, carefully try to adhere to the documented formatting (which usually winds up being inconsistent eventually anyways, in projects without auto formatting, because humans are fallible.)

The reason why people submit code with a huge formatting diff is usually because your project didn't ship a formatter manifest but their editor is configured to format on save. That's because probably most of the projects people work on now do actually use some form of automatic formatting, be it clang-format, gofmt, prettier, black, etc. so it winds up being necessary to special case your project to not try to run a formatter. It's still a beginner's mistake to actually commit and PR a huge reformatting, but it definitely happens by accident to even experienced devs when working on projects that have weird manual formatting.
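The CI enforcement described above usually comes down to a "check" mode: run the formatter, compare its output with what was committed, and fail if anything differs. A minimal Python sketch follows, where `toy_format`, `files_needing_format`, and `ci_exit_code` are all hypothetical names and the formatter is a toy stand-in for whatever the project's in-repo configuration selects.

```python
from typing import Callable, Dict, List


def toy_format(source: str) -> str:
    """Toy formatter (hypothetical): strip trailing whitespace per line
    and end the file with a newline."""
    lines = [line.rstrip() for line in source.splitlines()]
    return "\n".join(lines) + "\n"


def files_needing_format(
    files: Dict[str, str], formatter: Callable[[str], str]
) -> List[str]:
    """Return the paths whose contents would change if formatted."""
    return sorted(
        path for path, source in files.items() if formatter(source) != source
    )


def ci_exit_code(files: Dict[str, str], formatter: Callable[[str], str]) -> int:
    """Non-zero when any file is unformatted, so the CI job fails."""
    return 1 if files_needing_format(files, formatter) else 0
```

Because the check never rewrites anything, it can run on every pull request without adding commits of its own.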

galangalalgol 5 days ago||||

Avoiding that situation is what I like about formatters. As long as the language has an obvious standard like rust or go.

kristopolous 4 days ago||||

I'll sound off more on this.

Formatters also value consistency over clarity.

I break formatting all the time for the sake of clarity.

Sometimes my comments are paragraphs long with citations and things are carefully broken down with interstitial comments and references and then the formatter fucks it all up and the linter says "wah this oblivious pedant rule isn't followed"

The problem is it doesn't treat me like an adult. And I'm not in this industry for dumb nanny tools that scold me because they don't understand things.

craftkiller 5 days ago||||

> Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

The reformatting tools should be CI-enforced so you'll only end up with sudden massive changes like this once when you start using auto-formatters.

Regardless, tell your teammates to separate out formatting changes vs logic changes into separate commits (preferably separate PRs). Since they're auto-formatters it wouldn't even be any additional work, just:

  git fetch origin
  git checkout origin/main
  git checkout -b formatting
  ./run_the_autoformatter.bash
  git commit -a -m "Ran the auto-formatter, which should have been enforced by the CI."
  git push -u origin formatting

shepherdjerred 4 days ago||

And then set up git-ignore-revs

https://github.com/orgs/community/discussions/5033

craftkiller 4 days ago||

Awesome, TIL about this feature. Thanks!

WalterBright 5 days ago||||

What I do is make two separate PRs - one for the coding change, the other for reformatting only.

johnnypangs 5 days ago||||

I’ve used this before, it helps when you format the entire repo and remove the one commit from the history https://docs.github.com/en/repositories/working-with-files/u...

triknomeister 5 days ago|||

formatting on git diffs is a concept which should be embraced.

kristopolous 4 days ago||

Can you go more into this?

closeparen 5 days ago||||

Before picking up Go with its formatter and format-on-save norm, I mostly worked in contexts where a program would scan your source code and complain about style violations, but not actually fix them. We called those linters.

kolme 5 days ago|||

Yes, you are technically correct and yet absolutely irrelevant to the conversation, just adding meaningless noise.

Also, there are many linters that also do formatting, blurring the "line" you're pointing at.

vidarh 5 days ago|||

Because I spend the vast majority of my time on code reading it, and the layout matters to me in terms of how much time it takes for me to read code.

Yes, I can get used to other layouts, but that by no means means all layouts are equal to me in terms of how readable they are, and how well things stand out when they should, or blend in when they should.

I recognise this isn't the case for everyone - some people read code beginning to end and it doesn't matter how it's laid out. But I pattern match visually, and read fragments based on layout, and I remember code based on visual patterns.

Ironically, because I have aphantasia, and don't visualise things with my "mind's eye", but I still remember things by visual appearance and spatial cues better than by text.

godshatter 4 days ago||

That's interesting. I also have aphantasia and I seem to be the only one around where I work that cares one bit about visual presentation. I suspect it's because people with aphantasia have to rely on waiting for visual recognition to happen so often that we get good at it and rely on it more. I'm also known for figuring out a problem and jumping to the exact file and area in the code that's causing it immediately where others apparently have to read through the code again to find it. I just remember it's maybe 2/3 the way through the file and just past that big switch statement and has a one-liner comment above it and has a distinctive shape.

For me, indenting with tabs and aligning with spaces helps me find the code that I'm looking for as does adequate whitespace and a color syntax highlighting editor. Aligning things with distinct columns where it makes sense helps a lot, too.

vidarh 4 days ago||

Yeah, all of this matches my experience too.

socalgal2 5 days ago|||

some settings have advantages. For example, trailing commas on tables

    [
      'apple',
      'banana',
      'orange',
    ]

has an advantage over

    [
      'apple',
      'banana',
      'orange'
    ]

Because adding a new line at the end of the table (1) requires editing 1 line instead of 2, and (2) makes the diffs in code review smaller and easier to read and review. So a bad choice makes my life harder. The same applies to local variable declarations.
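The diff-size claim can be checked with a small Python sketch built on the standard `difflib` module. `touched_lines` is a hypothetical helper that counts removed plus added lines, which is roughly what a reviewer sees highlighted in a diff.

```python
import difflib
from typing import List


def touched_lines(before: List[str], after: List[str]) -> int:
    """Count lines a diff would show as removed or added."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    touched = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            touched += (i2 - i1) + (j2 - j1)
    return touched


# Style 1: trailing comma after the last item.
with_comma_before = ["[", "  'apple',", "  'banana',", "  'orange',", "]"]
with_comma_after = ["[", "  'apple',", "  'banana',", "  'orange',", "  'peach',", "]"]

# Style 2: no trailing comma, so 'orange' must gain a comma when 'peach' is added.
no_comma_before = ["[", "  'apple',", "  'banana',", "  'orange'", "]"]
no_comma_after = ["[", "  'apple',", "  'banana',", "  'orange',", "  'peach'", "]"]
```

Calling `touched_lines` on each pair should give a smaller count for the trailing-comma style, since only the new line appears in the diff.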

Sorted lists (or sorted includes) are also something that makes my life easier. If they're not sorted then everyone adds their new things to the end, which means there are many times more merge conflicts. Sorted doesn't mean there are zero, but it does mean there are fewer than with "append to the end". So, just like an auto-formatter is there to save time, don't waste my time by not sorting where possible.
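A deliberately simplified Python model of why sorting helps: it only compares where two contributors would insert their new line, on the assumption that edits landing at the same position are the ones likely to conflict. `append_position` and `sorted_position` are hypothetical helpers, and real merge tools are more subtle than this.

```python
import bisect
from typing import List


def append_position(items: List[str], new_item: str) -> int:
    """'Everyone adds to the end': the insert position ignores the item."""
    return len(items)


def sorted_position(items: List[str], new_item: str) -> int:
    """Sorted list: the insert position depends on the item itself."""
    return bisect.bisect_left(items, new_item)


includes = ["apple", "banana", "orange"]

# Two contributors, working on separate branches, each add one item.
alice_item, bob_item = "avocado", "peach"
```

With appending, both contributors edit the same spot at the end of the list; with sorting, their edits usually land in different places, though two items that sort next to each other can still collide.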

Also, my OCD hates inconsistency. So

    [1, 2, 3]
    {a, b, c}

Is ok and

    [ 1, 2, 3 ]
    [ a, b, c ]

Is ok but

    [1, 2, 3]
    { a, b, c }

Is not. I don't care which but pick ONE style, not two styles!

austin-cheney 5 days ago|||

Yes, everyone has personal opinions about code vanity. When this becomes a holy war I really start to question the maturity of people on the project. I find that people worry about trivial nonsense to mask their inability to address more valid concerns.

All that really matters is consistency. Let a team make some decisions and then just move forward.

socalgal2 4 days ago|||

> All that really matters is consistency

And this is my problem. My last example, the 2 styles are inconsistent. So when the guideline is "all that matters is consistency" then I take that at face value. You though apparently pull back and believe something more messy. Effectively "All that really matters is a consistently applied style even if that style itself is full of inconsistency"

The same applies to the trailing commas. With them, every line is consistent. Without, the last line is inconsistent with the other lines. So are we applying this rule "All that really matters is consistency" or only applying it sometimes? I would argue whatever heuristic made you say "All that really matters is consistency" should apply to both cases. Consistently apply a style guide and, the style guide itself should be consistent.

godshatter 4 days ago||

> Without, the last line is inconsistent with the other lines.

The last line without a comma imparts the information that this line is expected to be the last line in the list which is something you don't get from it with that trailing comma. It's a small thing, but I don't like putting in that last comma because I'm effectively misrepresenting things for "syntactic sugar".

2muchcoffeeman 5 days ago||||

Don’t bother making decisions. Steal a standard. Vote on it once if you want to be democratic. Done forever.

gorgoiler 5 days ago||

Democracy, strictly speaking, would be to periodically elect the most popular formatting policy once every sensible-time-period.

I’ve seen companies with such a large amount of developer churn that literally one person was left defending the status quo saying “we do X here, we voted on it once in 2019 and we’re not changing it just for new people”. 90% of the team were newcomers.

(The better teams I’ve worked on maintain a core set of leaders who are capable of building consensus through being very agreeable and smart. Gregarious Technocracy >> Popular Democracy!)

plusplusungood 4 days ago||

Ah, the infamous, but fictional, Monkey Ladder Experiment.

ParetoOptimal 5 days ago||||

> All that really matters is consistency. Let a team make some decisions and then just move forward.

Not so! The number of tokens correlates with perceived code complexity for some. One example is how some people can't unsee or look past Lisp's parentheses.

Another example is how some people get used to longDescriptiveVariableNames but others find that overwhelming (me for instance) when you have something like:

    userSignup = do
        let fullName = userFirstNameInput ++ userLastNameInput
            userName = take 1 userFirstNameInput ++ take 10 userLastNameInput
        saveToDB userName

Above isn't bad, but imagine variables named that verbosely used over and over, esp in same line.

Compare it to:

    userSignup = do
        let fullName = firstName ++ lastName
            userName = take 1 firstName ++ take 10 lastName
        saveToDB userName

The second example loses some information, but I'd argue it doesn't matter too much given the context one would typically have in a function named `userSignup`.

I've had codebases where consistency required naming all variables like `firstNameInputField` rather than just `firstName` and it made functions unreadable because it made the unimportant parts seem more important than they were simply by taking up more space.

parthdesai 5 days ago|||

It's one of my pet peeves when some senior engineers are bothered more by these coding semantics in a PR when there are bigger data model/code architectural issues, and don't call that out.

yes_man 5 days ago||||

The problem is when 2 people with same level of enthusiasm for linter rules but opposing views collide. If there’s nothing more impactful you could be solving and spending energy and time on than arguing those linter rules, then it’s time to question where the project is at and where is it going.

And if there is something more important, then instead of micro-optimizing the rules when there is strong disagreement it’s probably best if one of the parties takes the high road and lives with it so you can all focus on what matters.

vbezhenar 5 days ago|||

I guess that's one reason why opinionated tools like prettier or gofmt are popular. They made all the choices for you, they don't have configurable knobs, so you just learn to live with it.

lsaferite 4 days ago|||

FWIW, there are formatting decisions that gofmt doesn't make for you, so it's not as simple as just using gofmt.

megamalloc 5 days ago|||

The bad thing about these sorts of tools is when you work in a shop where multiple platforms are used for development and one of the platforms doesn't support the tool, or the tool fights with other tooling on that platform. You should for example never use pre-commit to enforce line ending style because git has brain dead defaults (which is to say, unless you have a .gitattributes file in your repo to prevent it, it will change line endings itself, fighting pre-commit). This just creates confusion and wasted time. In aid of what? So some anal so-and-so can get their way about code formatting as though it makes everyone else more productive to format code THEIR way. When in fact it makes others LESS productive as they fight "computer says no" format-nazi jobs in CI that don't even report what is "wrong" with the formatting and rely on tooling that they don't have installed to run locally.

Not to mention the overhead of running these worthless inefficient tools on every commit (even locally).

Tools like this just raise the debate from different opinions about formatting to different opinions about workflows. Workflows impact productivity a lot more than formatting.

bluGill 5 days ago|||

You should force them to choose from someone else's style. Don't let them tweak individual settings, choose a complete standard and apply it with both the thing they like and things they don't. A style is useful, the details do not matter that much.

rapind 5 days ago||||

Let this sink in though:

    [ 'apple'
    , 'banana'
    , 'orange'
    ]

maest 5 days ago|||

That makes prepending an element a special case.

ParetoOptimal 5 days ago||

It makes it easier to read though because the least important parts are most easily ignored. The reader can focus on the contents of the list.

3pt14159 5 days ago|||

I don't really know why we even need commas for lists of things. Just use the white space.

bogomog 4 days ago|||

I find it jarring compared to commas after the words, making the commas unnecessarily prominent.

JBiserkov 5 days ago||||

In Clojure, commas are treated as whitespace and are thus completely optional.

Nevermark 5 days ago|||

This is so clearly superior. Delimiters are prefixes.

But the scale of technical debt this insight has revealed is depressing.

citizenkeen 5 days ago||

Saying this is clearly superior means you don’t keep your lists sorted. A sorted list is as likely to add something to the beginning as the end, where this solution has the same problem.

Nevermark 5 days ago|||

I just reverse the sort order when that case happens.

setr 5 days ago|||

The only correct syntax/format

If only there existed a language designer intelligent enough to support it

maccard 5 days ago|||

You want yaml

    key:
      - a
      - b
      - c

MonkeyClub 5 days ago||||

Lisp:

crazygringo 5 days ago|||

Thank you for the humor!

I'm just suddenly slightly terrified someone's going to see this and think it's genuinely a good idea and make it part of the next popular scripting language, where lists are defined by starting commas or something :S

jrochkind1 5 days ago||||

You'll be annoyed to know that your last "not okay" style is what's considered standard in ruby (although the curly braces have different semantics, they are, well, either a hash or a code block (which is kind of annoying to me that they're used for two entirely different things) never a list/array).

zahlman 5 days ago||||

> Also, my OCD hates inconsistency

Mine hates trailing commas :)

More seriously, I don't like having lists like that in the code in the first place. I don't want multiple lines taken up for just constant values, and if it turns out to require maintenance then the data should be in a config file instead anyway.

whizzter 5 days ago|||

Rule of 3: write/change it once or twice (or seldom enough, with no possible negative impact) and it doesn't need any complexity. More than that... yeah, it probably goes into a config.

Revisional_Sin 5 days ago|||

Constants in the code are easier to navigate to than config files.

muzani 5 days ago||||

I agree with you on all these points. If you were to argue the opposite point, I'd agree as well.

aleph_minus_one 5 days ago||||

> Because adding a new line at the end of the table (1) requires editing 1 line, instead of 2 (2) makes the diffs in code review smaller and easier to read and review.

This judgement is rather based on a strong personal opinion (which I don't claim to be wrong, but also not as god-given) on what is one, and what are two changes in the code:

- If you consider adding an additional item to the end of the list to be one code change, I agree that a trailing comma makes sense

- On the other hand, it is also a sensible judgment to consider this to be a code change of two lines:

1. an item (say 'peach') is added to the end of the list

2. 'orange' has been turned from the last element of the list to a non-last element of the list

If you are a proponent of the second interpretation, the version that you consider to be non-advantageous is the one that does make sense.

dghf 5 days ago|||

> 2. 'orange' has been turned from the last element of the list to a non-last element of the list

Then why not consider it four changes?

3. 'banana' has been turned from the last-but-one element of the list to the last-but-two element of the list

4. 'apple' has been turned from the last-but-two element of the list to the last-but-three element of the list

Skeime 5 days ago||||

But the second interpretation only makes sense if the last item somehow deserves special treatment (over, say, the second-to-last item). Otherwise, you should similarly argue that the previous second-to-last item should also show up in the changes as it has now turned into the third-to-last item. (So maybe every item in the list should be preceded by as many spaces as are items before it and succeeded by as many commas as are items following it. Then, every change to the list will be a diff of the entire list.)

    first item,,,
     second item,,
      third item,
       fourth item

In my experience, special treatment for the last item is rarely warranted, so a trailing comma is a good default. If you want the last item to be special, put a comment on that line, saying that it should remain last. (Or better yet, find a better representation of your data that does not require this at all.)

aleph_minus_one 5 days ago||

> But the second interpretation only makes sense if the last item somehow deserves special treatment (over, say, the second-to-last item).

There do exist reasons why this can make sense:

- In an Algebraic Data Type implementation of a non-empty list, the last symbol is a different type constructor than the one to append an item to the front of an existing non-empty list (similarly how for an Algebraic Data Type implementation of an arbitrary list, the type constructor for an initial empty list is "special").

- In a single-linked list implementation, sometimes (depending on the implementation) the terminal element of the list is handled differently.

---

By the way: at work, because adding parameters at the beginning of a (parameter) list of a function is "special" (because in the code for many functions the first parameters serve a very special purpose), but adding some additional parameter at the end is not, we commonly use parameter lists formatted like

    'foo'
  , 'bar1'
  , 'bar2'
  , 'blub'

hananova 5 days ago||||

Meanwhile, I know and understand the reasons for trailing commas, but I find them incredibly ugly so I always strip them out.

sarchertech 5 days ago||

Can’t strip them out if the compiler requires them.

huflungdung 5 days ago|||

That isn’t ocd.

psychoslave 5 days ago|||

I don't care that much about the specific retained options (though my own tastes of the day are obviously the best taste ever in the whole existence of the universe) but having a common linter setting to prevent the noise in every damn PR is a must have.

Yes both git and all these PL are actually damn stupid to take lines at face value instead of something more elegant like Ada does. In my 20+ year career I've been proposed only once a project that involved Ada.

It's hard to come up with something elegant and efficient. It's even harder to make it reach top-tier global presence, all the more when the ecological niche is already filled with good enough stuff.

jupp0r 5 days ago|||

I generally agree, but max line length being so high you have to horizontally scroll while reading code is very detrimental to productivity.

elevation 5 days ago|||

Formatters eliminating long lines is a pet peeve of mine.

About once every other project, some portion of the source benefits from source code being arranged in a tabular format. Long lines which are juxtaposed help make dissimilar values stand out. The following table is not unlike code I have written:

  setup_spi(&adc,    mode=SPI_01, rate=15, cs_control=CS_MUXED,  cs=0x01);
  setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED,  cs=0x02);
  setup_spi(&mram,   mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);

Even if we add 4-5 more operational parameters, I find this arrangement much more readable than the short-line equivalent:

  setup_spi(&adc,
      mode=SPI_01,
      rate=15,
      cs_control=CS_MUXED,
      cs=0x01);
  setup_spi(&eeprom,
      mode=SPI_10,
      rate=13,
      cs_control=CS_MUXED,
      cs=0x02);
  setup_spi(&mram,
      mode=SPI_10,
      rate=50,
      cs_control=CS_DIRECT,
      cs=0x08);

Or worse, the formatter may keep the long lines but normalize the spaces, ruining the tabular alignment:

  setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED, cs=0x01);
  setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED, cs=0x02);
  setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);

Sometimes a neat, human-maintained block of 200 character lines brings order to chaos, even if you have to scroll a little.

sn0wleppard 5 days ago|||

The worst is when you have lines in a similar pattern across your formatter's line length boundary and you end up with

  setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED, cs=0x01);
  setup_spi(&eeprom,
      mode=SPI_10,
      rate=13,
      cs_control=CS_MUXED,
      cs=0x02);
  setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);

crazygringo 5 days ago||

I think with the Black formatter you can force the multiline version by adding a trailing comma to the arguments.

The pain point you describe is real, which is why that was intentionally added as a feature.

Of course it requires a language that allows trailing commas, and a formatter that uses that convention.
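A small Python illustration of that Black behaviour, usually called the "magic trailing comma". The `setup_spi` function below is a hypothetical Python stand-in for the thread's example and simply records its arguments; the comments describe how Black is expected to treat each call with its default settings.

```python
def setup_spi(device, mode, rate, cs_control, cs):
    """Hypothetical stand-in for the thread's setup_spi: records its arguments."""
    return {
        "device": device,
        "mode": mode,
        "rate": rate,
        "cs_control": cs_control,
        "cs": cs,
    }


# No trailing comma after the last argument: Black may keep the call on
# one line as long as it fits within the configured line length.
adc = setup_spi("adc", mode="SPI_01", rate=15, cs_control="CS_MUXED", cs=0x01)

# Trailing comma after the last argument: Black keeps one argument per
# line instead of collapsing the call, even though it would fit on one line.
eeprom = setup_spi(
    "eeprom",
    mode="SPI_10",
    rate=13,
    cs_control="CS_MUXED",
    cs=0x02,
)
```

Adding the trailing comma to every call in a group is one way to keep them all in the same multi-line shape, rather than the mixed layout shown above.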

dvdkon 5 days ago||

A similar tip: As far as I can tell, clang-format doesn't reflow across comments, so to force a linebreak you can add a // end-of-line comment.

crazygringo 5 days ago||||

I get what you're saying, and used to think that way, but changed my mind because:

1) Horizontal scrolling sucks

2) Changing values easily requires manually realigning all the other rows, which is not productive developer time

3) When you make a change to one small value, git shows the whole line changing

And I ultimately concluded code files are not the place for aligned tabular data. If the data is small enough it belongs in a code file rather than a CSV you import then great, but bothering with alignment just isn't worth it. Just stick to the short-line equivalent. It's the easiest to edit and maintain, which is ultimately what matters most.

paddy_m 5 days ago||

This comes up in testing a lot. I want testing data included in test source files to look tabular. I want it to be indented such that I can spot order of magnitude differences.

a_e_k 5 days ago||||

Yes, so much this!

I've often wished that formatters had some threshold for similarity between adjacent lines. If some X% of the characters on the line match the character right above, then it might be tabular and it could do something to maintain the tabular layout.

Bonus points if it's able to do something like diff the adjacent lines to detect table-like layouts and figure out if something nudged a field or two out of alignment and then insert spaces to fix the table layout.
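A rough Python sketch of the similarity heuristic wished for above. No formatter is claimed to work this way; `column_match_ratio`, `looks_tabular`, and the 0.7 threshold are all made up for illustration.

```python
def column_match_ratio(upper: str, lower: str) -> float:
    """Fraction of columns where a line has the same character as the
    line directly above it, relative to the longer of the two lines."""
    longest = max(len(upper), len(lower))
    if longest == 0:
        return 1.0
    matches = sum(1 for a, b in zip(upper, lower) if a == b)
    return matches / longest


def looks_tabular(upper: str, lower: str, threshold: float = 0.7) -> bool:
    """Guess that two adjacent lines form a hand-aligned table when
    enough columns match. The threshold is an arbitrary assumption."""
    return column_match_ratio(upper, lower) >= threshold
```

On the aligned `setup_spi` lines from earlier in the thread most columns match, while the space-normalized versions drift apart after the first differing field, so a formatter using a check like this could choose to leave the aligned block alone.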

Cthulhu_ 5 days ago||

I believe some formatters have an option where you can specify a "do not reformat" block (or override formatting settings) via specific comments. As an exception, I'm okay with that. Most code (but I'm thinking business applications, not kernel drivers) benefits from default code formatting rules though.

And sometimes, if the code doesn't look good after automatic formatting, the code itself needs to be fixed. I'm specifically thinking about e.g. long or nested ternary statements; as soon as the auto formatter spreads it over multiple lines, you should probably refactor it.

a_e_k 5 days ago||

I'm used to things like `// clang-format off` and on pairs to bracket such blocks, and adding empty trailing `//` comments to prevent re-flowing, and I use them when I must.

This was more about lamenting the need for such things. Clang-format can already somewhat tabularize code by aligning equals signs in consecutive cases. I was just wishing it had an option to detect and align other kinds of code to make or keep it more table like. (Destroying table-like structuring being the main places I tend to disagree with its formatting.)

VBprogrammer 5 days ago||||

Those kinds of tables improve readability right until someone hits a length constraint and has to either touch every line in order to fix the alignment, causing weird conflicts in VCS, or ignore the alignment, and its slow decay into a mess begins.

Cthulhu_ 5 days ago|||

It's not an either/or though. Tables are readable and this looks very much like tabular data. Length constraints should not be fixed if you have code like this, and it won't be "a slow decay into a mess" if escaping the line length rules is limited to data tables like these.

VBprogrammer 5 days ago||

By length constraint I meant that one of the fields grows longer than originally planned rather than bypassing the linter.

DemocracyFTW2 4 days ago|||

so you're basically saying "look this is neat and I like it, but since we cannot prevent some future chap come along and make a mess of it, let's stop this nonsense, now, and throw our hands up in the air—thoughts and prayers is what I say!"?

VBprogrammer 4 days ago||

At best I'd say it's ok to use it sparingly, in places where it really does make an improvement in readability. I've seen people use it just to align the right hand side of a list of assignments, even when there is no tabular nature to what they are assigning.

lambdaba 5 days ago||||

I agree, I'm very much against any line length constraint, it's arbitrary and word wrapping exists.

jaimebuelta 5 days ago||||

The first line should be readable enough, but in case it's longer than that, I way prefer the style of

  setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED,  
            cs=0x01);
  setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED,  
            cs=0x02);
  setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT, 
            cs=0x08);

over the short-line alternative presented.

I like short lines in general, as having a bunch of short lines (which tend to be the norm in code) and suddenly a very long line is terrible for readability. But everything has exceptions. It's also very dependent on the programming language.

bryanrasmussen 5 days ago||||

People have already outlined all the reasons why the long line might be less than optimal, but I will note that really you are using formatting to do styling.

In a post-modern editor (by which I mean any modern editor that takes this kind of thing into consideration which I don't think any do yet) it should be possible for the editor to determine similarity between lines and achieve a tabular layout, perhaps also with styling for dissimilar values in cases where the table has a higher degree of similarity than the one above. Perhaps also with collapsing of tables with some indicator that what is collapsed is not just a sub-tree but a table.

vbezhenar 5 days ago||||

It is an obvious example where an automatic formatter fails.

But are there more examples? Maybe it's not a high price to pay. I'm using either the second or third approach for my code and I've never had many issues. Yes, the first example is pretty, but it's not a huge deal for me.

account42 5 days ago||||

Another issue with fixed line lengths is that it requires tab stops to have a defined width instead of everyone being able to choose their desired indentation level in their editor config.
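A quick sketch of why the two interact, using `str.expandtabs` from the standard library. The sample line and the 40-column limit are made up for illustration.

```python
def visual_width(line: str, tab_width: int) -> int:
    """Width of the line on screen once tabs are expanded to `tab_width` columns."""
    return len(line.expandtabs(tab_width))


line = "\t\tresult = compute(alpha, beta)"
limit = 40  # hypothetical project line-length limit

widths = {tab: visual_width(line, tab) for tab in (2, 4, 8)}
within_limit = {tab: width <= limit for tab, width in widths.items()}
print(widths, within_limit)
```

The same bytes pass the limit for someone rendering tabs at 2 or 4 columns and fail it at 8, so a limit only means something once a tab width is pinned down.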

rerdavies 5 days ago|||

I think you have that backward. Allowing everyone to choose their desired indentation in their editor config is the issue. That's insane!

DonHopkins 5 days ago|||

Another issue with everyone being able to choose their desired indentation level in their editor config is unbounded line length.

growse 5 days ago||||

//nolint

bloak 5 days ago||

/* clang-format off */

someothherguyy 5 days ago||||

  setup_spi(
    &adc,
    mode=SPI_01,
    rate=15,
    cs_control=CS_MUXED,
    cs=0x01
  );
  setup_spi(
    &eeprom,
    mode=SPI_10,
    rate=13,
    cs_control=CS_MUXED,
    cs=0x02
  );
  setup_spi(
    &mram,
    mode=SPI_10,
    rate=50,
    cs_control=CS_DIRECT,
    cs=0x08
  );

ftfy

DonHopkins 5 days ago|||

This is good, and objectively better than letting the random unbounded length of the function name define and inflate and randomize the indentation. It also makes it easier to use long descriptive function names without fucking up the indentation.

  setup_spi(&adc,
            mode=SPI_01,
            rate=15,
            cs_control=CS_MUXED,
            cs=0x01
  );
  setup_spoo(&adc,
             mode=SPI_01,
             rate=15,
             cs_control=CS_MUXED,
             cs=0x01
  );
  setup_s(&adc,
          mode=SPI_01,
          rate=15,
          cs_control=CS_MUXED,
          cs=0x01
  );
  validate_and_register_spi_spoo_s(&adc,
                                   mode=SPI_01,
                                   rate=15,
                                   cs_control=CS_MUXED,
                                   cs=0x01
  );

DemocracyFTW2 4 days ago||

Here, fixed it for you:

    setup_spi(
      &adc,
      mode        = SPI_01,
      rate        = 15,
      cs_control  = CS_MUXED,
      cs          = 0x01 );
    setup_spoo(
      &adc,
      mode        = SPI_01,
      rate        = 15,
      cs_control  = CS_MUXED,
      cs          = 0x01 );
    setup_s(
      &adc,
      mode        = SPI_01,
      rate        = 15,
      cs_control  = CS_MUXED,
      cs          = 0x01 );
    validate_and_register_spi_spoo_s(
      &adc,
      mode        = SPI_01,
      rate        = 15,
      cs_control  = CS_MUXED,
      cs          = 0x01 );

Marazan 5 days ago|||

That is harder to read than the long line version.

However, it is the formatting I adopt when forced to bow down to line length formatters.

lenkite 5 days ago|||

Err..I find the short-line version easier to read. Esp if you need to horizontally scroll.

This is why a Big Dictator should just make a standard. Everyone who doesn't like the standard approach just gets used to it.

someothherguyy 5 days ago|||

to you, to me, it reads nicely, and thus the issue -- editors should have built in formatters that don't actually edit source code, but offer a view

thaumasiotes 5 days ago||

To me, that reads fine, but it has lost the property elevation wanted, which was that it's easy to compare the values assigned to any particular parameter across multiple calls. In your version you can only read one call at a time.

IlikeKitties 5 days ago||||

I'm surprised. I find the short-line version to be much better.

komali2 5 days ago|||

Devs have different pixel count screens. Your table wrapped for me. The short line equivalent looks best on my screen.

Thus 80 or perhaps 120 char line lengths!

account42 5 days ago|||

So fix your setup? Why should others with wider screens leave space on their screen empty for your sake?

Especially 80 characters is a ridiculously low limit that encourages people to name their variables and functions some abbreviated shit like mbstowcs instead of something more descriptive.

genericspammer 5 days ago|||

Do you guys never read code as side by side diffs in the browser?

komali2 5 days ago||

Never mind in a browser, this is how I review a ton of code, either in magit or lazygit or in multiple terminals.

saagarjha 5 days ago||||

I softwrap so I don't care about line length myself but I read code on a phone a lot so people who hardwrap at larger columns are a little more annoying

maratc 5 days ago||||

> Why should others with wider screens leave space on their screen empty for your sake?

Because "I" might be older or sight-impaired, and have "my" font at size 32, and it actually fills "my" (wider than yours) screen completely?

Would you advise me to "fix my eyes" too? I'd love to!

"Why should I accommodate others" is a terrible take.

rerdavies 5 days ago||

I would advise you to buy one of these: https://www.dell.com/en-ca/shop/dell-ultrasharp-49-curved-us...

80-column line lengths is a pretty severe ask.

komali2 5 days ago||||

My main machine is an ultrawide, but I usually have multiple files open, and text reads best top-down so I stack files side-by-side. If someone has like, a 240 character long line, that is annoying. My editor will soft wrap and indicate this in the fringe of course but it's still a little obnoxious.

80 is probably too low these days but it's nice for git commit header length at least.

DonHopkins 5 days ago||||

So haul your wide monitor around with your laptop, you mean? No.

Just use descriptive variable names, and break your lines up logically and consistently. They are not mutually exclusive, and your code will be much easier for you and other people to read and edit and maintain, and git diffs will be much more succinct and precise.

delusional 5 days ago|||

> So fix your setup? Why should others with wider screens leave space on their screen empty for your sake?

What a terrible attitude to have when working with other people.

"Oh, I'm the only one who writes Python? Fix your setup. why should I, who know python, not write it for your sake?"

"Oh, I'm the only one who speaks German? Fix your setup. Why should I, who know German, not speak it for your sake?"

How about doing it because your colleagues, who you presumably like collaborating with to reach a goal, ask you to?

account42 5 days ago|||

Yes, I don't think we should discourage people from using Python or German just because you don't want to learn those particular languages either.

Working together with others should not mean having to limit everyone to the lowest common denominator, especially when there are better options for helping those with limitations that don't impact everyone else.

balamatom 5 days ago|||

What do you do about the "oh, I'm the only one who cares about [???]? should I just fucking kill myself then?" Many such cases.

>How about doing it because your colleagues, who you presumably like collaborating with to reach a goal, ask you to?

If someone wants me to do a certain thing in a certain way, they simply have to state it in terms of:

- some benefit they want to achieve

- some drawback they want to avoid

- as little as an acknowledged unexamined preference like "hey I personally feel more comfortable with approach X, how bout we try that instead"

I'm happy to learn from their perspective, and gladly go out of my way to accommodate them. Sometimes even against my better judgment, but hell, I still prefer to err on the side of being considerate. Just like you say, I like to work with people in terms of a shared goal, and just like you do, in every scenario I prefer to assume that's what's going on.

If, however, someone insists on certain approaches while never going deeper in their explanations than arbitrary non-falsifiable qualifiers such as "best practice", "modern", "clean", etc., then I know they haven't actually examined those choices that they now insist others should comply with. They're just parroting whatever version they imagine of industry-wide consensus describes their accidental comfort zone. And then boy do they hate my "make your setup assume less! it's the only way to be sure!". But no, I ain't reifying their meme instead of what I've seen work with my own two.

delusional 5 days ago||

> If, however, someone insists on certain approaches while never going deeper in their explanations than arbitrary non-falsifiable qualifiers such as "best practice", "modern", "clean"

You're moving the goalposts of this discussion. The guy I was responding to said "fix your setup" to another person saying "Your table wrapped for me. The short line equivalent looks best on my screen." That's a stated preference based on a benefit he'd like to achieve.

We are not discussing "best practice" type arguments here.

balamatom 5 days ago||

"Best practice" type arguments are the universal excuse for remaining inconsiderate of the fact that different people interact with code differently, but fair enough I guess

brettermeier 5 days ago|||

Living in the 80's XD

tsimionescu 5 days ago||||

I am at the opposite end. Having any line length constraints whatsoever seems like a massive waste of time every time I've seen it. Let the lines be as long as I need them, and accept that your colleagues will not be idiots. A guideline for newer colleagues is great, but auto-formatters messing with line lengths is a source of significant annoyance.

Cthulhu_ 5 days ago||

> auto-formatters messing with line lengths is a source of significant annoyance.

Unless they have been a thing since the start of a project; existing code should never be affected by formatters, that's unnecessary churn. If a formatter is introduced later on in a project (or a formatting rule changed), it should be applied to all code in one go and no new code accepted if it hasn't passed through the formatter.

I think nobody should have to think about code formatting, and no diff should contain "just" formatting changes unless there's also an updated formatting rule in there. But also, you should be able to escape the automatic formatting if there is a specific use case for it, like the data table mentioned earlier.

jitl 5 days ago||||

every editor can wrap text these days. good ones will even indent the wrapped text properly

giveita 5 days ago|||

That's a slippery slope towards storing semantics and displaying locally preferred syntax ;)

Cthulhu_ 5 days ago|||

And that's fine, as long as whatever ends up in version control is standardized. Locally you can tweak your settings to have / have not word wrapping, 2-8 space indentation, etc.

But that's the core of this article, too: it has since become normal to store plain-text source code in git and share it, but the article mentions a code- and formatting-agnostic storage format, where it's down to people's editors (and diff tools, etc.) to render the code. It's not actually unusual, since things like images are also unreadable if you look at their source code, but tools like GitHub will render them in a human-digestible format.

jitl 5 days ago||||

I prefer storing plain text and displaying locally preferred syntax, to a degree.

With some expressions, like lookup tables or bit strings, hand wrapping and careful white space use is the difference between “understandable and intuitive” and “completely meaningless”. In JS world, `// prettier-ignore` above such an expression preserves it but ideally there’s a more universal way to express this.

NL807 5 days ago||||

And the bikeshedding has begun...

thfuran 5 days ago|||

Who’s going to be bikeshedding (about formatting) when everyone can individually configure their own formatting rules without affecting anyone else?

giveita 5 days ago||||

What's the nuclear reactor in this analogy?

pferde 5 days ago||

That the values could have been extracted to an array of structs, and iterated over in a small cycle that calls the function for each set of values.
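Roughly like this, sketched in Python rather than the original's C-like syntax. The `SpiConfig` type and the `setup_spi` stand-in are hypothetical; the stand-in only formats a string so the sketch can run without hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpiConfig:
    device: str
    mode: str
    rate: int
    cs_control: str
    cs: int


# The "table" lives in data, where aligning it (or not) is a much smaller deal.
SPI_CONFIGS = [
    SpiConfig("adc", "SPI_01", 15, "CS_MUXED", 0x01),
    SpiConfig("eeprom", "SPI_10", 13, "CS_MUXED", 0x02),
    SpiConfig("mram", "SPI_10", 50, "CS_DIRECT", 0x08),
]


def setup_spi(config: SpiConfig) -> str:
    """Hypothetical stand-in for the real call; returns a description instead."""
    return (
        f"{config.device}: mode={config.mode} rate={config.rate} "
        f"cs_control={config.cs_control} cs={config.cs:#04x}"
    )


configured = [setup_spi(config) for config in SPI_CONFIGS]
for line in configured:
    print(line)
```

The call site shrinks to one loop, and the formatter argument moves to the data literal, which is easier to exempt.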

virtue3 5 days ago|||

was going to say the same thing.

Boy that was fast.

bogomog 4 days ago||||

That's why Python should have gone all-in on significant spaces: tabs for blocks, spaces after tabs for line continuation

DemocracyFTW2 4 days ago||

Mixing spaces and tabs is a surefire way to ruin everything.

rightbyte 4 days ago|||

Is this a subtle pro-tab pinch?

rTX5CMRXIfFG 5 days ago||||

You still have to minimize the wrapping that happens, because wrapped lines of code tend to be continuous instead of being properly spaced so as to make its parts individually readable.

hulitu 5 days ago|||

> every editor can wrap text these days.

could. Yesterday notepad (win 10) just plainly refused.

jitl 5 days ago||

Windows is so weird

appellations 5 days ago||||

I forget there are people who don’t configure softwrap in their text editor.

Some languages (java) really need the extra horizontal space if you can afford it and aren’t too hard to read when softwrapped.

jghn 5 days ago||||

I’d agree with you except for the trend over the last 10 years or so to set limits back to the Stone Age. For a while there we seemed to be settling on somewhere around 150 characters and yet these days we’re back to the 80-100 range.

forrestthewoods 5 days ago|||

Define high? I think 120 is pretty reasonable. Maybe even as high as 140.

Log statements however I think have an effectively unbounded length. Nothing I hate more than a stupid linter turning a sprinkling of logs into 7 line monsters. cargo fmt is especially bad about this. It’s so bad.

skinner927 5 days ago|||

I still prefer 80. I won’t (publicly) scoff at 100 though. IMO 120 is reasonable for HTML and Java, but that’s about it.

Sent from my 49” G9 Ultrawide.

guenthert 5 days ago|||

Obviously 100 is the right choice.

https://en.wikipedia.org/wiki/Line_length#cite_note-dykip-8

Joker_vD 5 days ago||||

Give a try to 132 mode, maybe? It was the standard paper width for printouts since, well, forever.

balamatom 5 days ago|||

That's actually just weirdly specific enough to be worth a shot.

psychoslave 5 days ago|||

The printing industry has not been around for anything close to forever; even writing is relatively novel compared to human spoken languages.

All that said, I'm interested in this 132 number; where does it come from?

Joker_vD 5 days ago|||

"Since forever" as in, "since the start of electronic computing"; we started printing the programs out on paper almost immediately. The 132 columns comes from IBM's ancient line printers (circa 1957); most other manufacturers followed suit, and even the glass ttys routinely had a 132-column mode (for VT100 you had to buy a RAM extension, for later models it was just there, I believe). My point is, most people did understand, even back in the sixties, that an 80-column-wide screen is tiny, especially for reading source code.

dcminter 5 days ago||||

Printers aside, the VT220 terminal from DEC had a 132 column mode. Probably it was aping a standard printer column count. Most of the time we used the 80 column mode as it was far more readable on what was quite a small screen.

guenthert 5 days ago||

Not only a small screen by modern standards, but the hardware lacked the needed resolution. The marketing brochure claims a 10x10 dot matrix. That will be for the 80 column mode. That works out to a respectable 800 pixels horizontally, and a barely sufficient 6x10 pixels per character in 132 column mode. There was even a double-high, double-width mode for easier reading ;-)

Interesting here perhaps is that even back then it was recognized, that for different situations, different display modes were of advantage.
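The pixel arithmetic above, spelled out. It takes the brochure's 10x10 cell as applying to 80 column mode, which is my assumption, and simply divides the same scanline among 132 columns.

```python
CELL_WIDTH_80 = 10  # dots per character cell in 80 column mode (brochure figure)

scanline_pixels = 80 * CELL_WIDTH_80
# The same scanline shared by 132 columns, rounded down to whole dots.
cell_width_132 = scanline_pixels // 132

print(scanline_pixels, cell_width_132)
```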

dcminter 5 days ago||

> There was even a double-high, double-width mode for easier reading

I'd forgotten that; now that was a fugly font. I don't think anyone ever used it (aside from the "Setup" banner on the settings screen)

I think the low pixel count was rather mitigated by the persistence of phosphor though - there's reproductions of the fonts that had to take this into account; see the stuff about font stretching here: https://vt100.net/dec/vt220/glyphs

bloak 5 days ago|||

The IBM 1403 line printer, apparently.

typpilol 5 days ago||||

That's literally my setup everywhere. 120 for html/java/JavaScript and 80 elsewhere.

Really suits each language imo. Although I could probably get away with 80, my habit of using tailwind classes can get messy at that width compared to 120.

Cthulhu_ 5 days ago||

Caveat, my personal experience is mainly limited to JS/TS, Java, and associated languages. 120 is fine for most use cases; I've only seen 80 work in Go, but that one also has unwritten rules that prefer reducing indentation as much as possible; "line-of-sight programming", no object-oriented programming (which gives almost everything a layer of indentation already), but also it has no ternary statements, no try/catch blocks, etc. It's a very left-aligned language, which is great for not unnecessarily using up that 80 column "budget".

forrestthewoods 5 days ago||||

Ugh. 80 is the worst. For C++ it’s entirely unreasonable. I definitely can not reconcile “linters make code easier to read” and “80 width is good”. Those are mutually exclusive imho.

What I actually want from a linter is “120, unless the trailing bits aren’t interesting in which case 140+ is fine”. The ideal rule isn’t hard and fast! It’s not pure science. There’s an art to it.

anilakar 5 days ago|||

But a 49" ultrawide is just two 27" monitors side by side. :-)

account42 5 days ago||

Better yet, it's three monitors with more reasonable aspect ratios side by side.

16:9 is rarely what you want for anything that is mainly text.

setopt 5 days ago||||

It’s tricky to find an objective optimum. Personally I’ve been happy with up to 100 chars per line (aim for 80 but some lines are just more readable without wrapping).

But someone will always have to either scroll horizontally or wrap the text. I’m speaking as someone who often views code on my phone, with a ~40 characters wide screen.

In typography, it’s well accepted that an average of ~66 chars per line increases readability of bulk text, with the theory being that short lines require you to mentally «jump» to the beginning of the next line frequently which interrupts flow, but long lines make it harder to mentally keep track of where you are in each line. There is however a difference between newspapers and books, since shorter ~40-char columns allow rapid skimming by moving your eyes down a column instead of zigzagging through the text.

But I don’t think these numbers translate directly to code, which is usually written with most lines indented (on the left) and most lines shorter than the maximum (few statements are so long). Depending on language, I could easily imagine a line length of 100 leading to an average of ~66 chars per line.
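This is easy to measure on your own codebase rather than guess at. A minimal sketch, assuming "chars per line" means the non-blank content with leading indentation stripped; the function names and the tiny sample are illustrative only.

```python
def average_content_length(source: str) -> float:
    """Mean length of non-blank lines, ignoring leading indentation."""
    lengths = [len(line.strip()) for line in source.splitlines() if line.strip()]
    return sum(lengths) / len(lengths) if lengths else 0.0


def longest_line(source: str) -> int:
    """Length of the longest raw line, indentation included."""
    return max((len(line) for line in source.splitlines()), default=0)


sample = "def f(x):\n    return x + 1\n\n\nprint(f(2))\n"
print(average_content_length(sample), longest_line(sample))
```

Running both numbers over a real project would show how far the typical line sits below whatever maximum the formatter enforces.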

fmbb 5 days ago|||

> the theory being that short lines require you to mentally «jump» to the beginning of the next line frequently which interrupts flow, but long lines make it harder to mentally keep track of where you are in each line.

In my experience, with programming you rarely have lines of 140 printable characters. A lot of it is indentation. So it’s probably rarely a problem to find your way back on the next line.

forrestthewoods 5 days ago|||

I don’t think code is comparable. Reading code is far more stochastic than reading a novel.

For C/C++ headers I absolutely despise verbose doxygen bullshit comments spreading relatively straightforward functions across 10 lines of comments and args.

I want to be able to quickly skim function names and then read arguments only if deemed relevant. I don’t want to read every single word.

layer8 5 days ago|||

100 is the sweet spot, IMO.

I like splitting long text as in log statements into appropriate source lines, just like you would a Markdown paragraph. As in:

    logger.info(
        "I like splitting long text as in log statements " +
        "into " + suitableAdjective + " source lines, " +
        "just like you would a Markdown paragraph. " +
        "As in: " + quine);

I agree that many formatters are bad about this, like introducing an indent for all but the first content line, or putting the concatenation operator in the front instead of the back, thereby also causing non-uniform alignment of the text content.

saagarjha 5 days ago|||

This makes it really annoying to grep for log messages. I can't control what you do in your codebase but I will always argue against this in the ones I work on.

layer8 5 days ago||

I haven’t found this to be a problem in practice. You generally can’t grep for the complete message anyway due to inserted arguments. Picking a distinctive formulation from the log message virtually always does the trick. I do take care to not place line breaks in the middle of a semantic unit if possible.

saagarjha 5 days ago||

Yes, I find the part of the message that doesn't have interpolated arguments in it. The problem is that the literal part of the string might be broken up across lines.
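A small sketch of the failure mode, with a plain substring search standing in for grep. The log message and the `lines_containing` helper are invented for illustration.

```python
def lines_containing(source: str, phrase: str) -> list[int]:
    """1-based line numbers whose text contains `phrase`, like a plain grep."""
    return [n for n, line in enumerate(source.splitlines(), start=1) if phrase in line]


split_source = '''logger.info(
    "connection to upstream " +
    "timed out after " + seconds + "s");'''

single_source = 'logger.info("connection to upstream timed out after " + seconds + "s");'

# A phrase that straddles the source line break is invisible to the search.
print(lines_containing(split_source, "upstream timed out"))
print(lines_containing(single_source, "upstream timed out"))
# A phrase that happens to sit inside one chunk is still found.
print(lines_containing(split_source, "timed out after"))
```

So whether the search works depends on guessing where the author put the breaks, which is the annoyance.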

bogomog 4 days ago||

And to add to this, you rarely need to read a log message when just visually scanning code, it's fine going off the screen.

maleldil 5 days ago||||

Nitpick: this looks like Python. You don't need + to concatenate string literals. This is the type of thing a linter can catch.

Sohcahtoa82 4 days ago|||

IMO, implicit string concatenation is a bug, not a feature.

I once made a stupid mistake of having a list of directories to delete:

    directories_to_delete = (
        "/some/dir"
        "/some/other/dir"
    )
    for dir in directories_to_delete:
        shutil.rmtree(dir)

Can you spot the error? I somehow forgot the comma in the list. That meant that rather than creating a tuple of directories, I created a single string. So when the `for` loop ran, it iterated on individual characters of the string. What was the first character? "/" of course.

I essentially did an `rm -rf /` because of the implicit concatenation.
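For anyone who wants to see it without risking their filesystem, here is a harmless sketch of the same mistake that only inspects the values and never deletes anything.

```python
missing_comma = (
    "/some/dir"
    "/some/other/dir"
)
with_commas = (
    "/some/dir",
    "/some/other/dir",
)

# Without the comma the parentheses just group one concatenated string.
print(type(missing_comma).__name__, repr(missing_comma))
print(type(with_commas).__name__, with_commas)

# What a `for` loop would hand to rmtree first in each case.
first_missing = next(iter(missing_comma))
first_with = next(iter(with_commas))
print(repr(first_missing), repr(first_with))
```

A trailing comma after the last element is a cheap habit here, since it also keeps a one-element tuple from collapsing into a bare string.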

layer8 5 days ago|||

It’s actually Java, where the “+” is necessary.

forrestthewoods 4 days ago|||

Splitting log messages across lines like that is pure evil. Your punishment is death by brazen Bull. Sorry I don’t make the rules, just how it is. :(

smokel 5 days ago|||

I've never understood why we still look at the plain text representation of code, and not a visualization of the code that makes more sense.

Note that, in my mind, this visualization is not automatically generated, but lovingly created by humans who wish their code to be understood by others. It is not separate from the code, as typical design documentation is, but an integral part of it, stored in metadata. Consider it an extension of variable and function naming.

There is of course "literate programming" [1], but somehow (improvements of) that never took off in larger systems.

[1] https://en.wikipedia.org/wiki/Literate_programming

AdieuToLogic 5 days ago|||

> I've never understood why we still look at the plain text representation of code, and not a visualization of the code that makes more sense.

My guess is it is the same reason why the most common form of creating source code is typing and not other readily available mechanisms:

  Semantic density

Graphical visualizations are approachable representations and very useful for introductory, infrequent, and/or summary needs. However, they become cumbersome when either a well-defined repetitive workflow is used or usage variations are not known a priori.

An example of both are the emacs and vi editors. The vast majority of supported commands are at most a few keystrokes and any programming language source code can be manipulated by them.

jraph 5 days ago||||

> I've never understood why we still look at the plain text representation of code, and not a visualization of the code that makes more sense.

I suppose this is because nobody has been able to create good tooling for it (the visualization itself, the efficient editing, etc). You'll have to deal with the text version of it at some point if not all tools that we rely on get a version for the new visualization.

Another hypothesis is that it might not matter this much that we work with text directly after all.

> Note that, in my mind, this visualization is not automatically generated, but lovingly created by humans who wish their code to be understood by others.

If you allow manual crafting there, I suspect you'll need some sort of linting too.

seer 5 days ago||

Um, isn't that what Lisp and its children / siblings have been all about? I've written a bit of Clojure; it has a very clear idea that code is data and data is code. Your code is trivially serializable in your mind and by various tools, and because it is lisp - it all kinda makes sense.

I really wish we lived in a universe where a lisp became the lingua franca of the world instead of javascript, as almost happened with Netscape, but alas ...

jraph 5 days ago||

The "code is data" aspect of lisp seems orthogonal to how code is still written as text, and btw lisp is still written using text. You still need to indent all these parentheses.

Virtually all programming languages are parsed into ASTs, and these ASTs can be serialized back. This is what formatters/"prettifiers" usually do.
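As a small illustration using Python's standard `ast` module (`ast.unparse` needs Python 3.9 or later): two layouts of the same call parse to the same tree and serialize back to the same text. The call itself is borrowed from the thread and adapted to valid Python.

```python
import ast

compact = "setup_spi(adc, mode=SPI_01, rate=15)"
spread = """setup_spi(
    adc,
    mode = SPI_01,
    rate = 15,
)"""

# ast.dump omits line/column attributes by default, so layout does not show up.
same_tree = ast.dump(ast.parse(compact)) == ast.dump(ast.parse(spread))

# Serializing the tree back gives one canonical layout for both inputs.
canonical = ast.unparse(ast.parse(spread))
print(same_tree, canonical)
```

One caveat: this particular round trip drops comments, which is one reason real formatters work on a richer syntax tree than the plain AST.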

Did I miss something?

rerdavies 5 days ago|||

    We must include the standard I/O definitions, since we want to
    send formatted output to stdout and stderr.
    <<Header files to include>>=
    #include <stdio.h>
    @

Not hard to see why nobody really embraced it. And not helped by the fact that it was published right around the time that best practice was switching toward "don't comment unless absolutely necessary".

kolme 5 days ago|||

I did that when I was young and naive. I'll tell you why I did it.

I thought I was very smart. Like, really really smart, maybe the smartest programmer in the team.

And as such my opinion was very important. Maybe the most important opinion in the team. Everyone had to listen to it!

That is all. Also, I was wrong.

robertlagrant 5 days ago||

> Also, I was wrong.

This is probably the only useful takeaway, but can you explain why you were wrong?

kolme 5 days ago||

Yes, I was wrong on several levels.

First and foremost I was wrong thinking that I was smarter than others — that's not even how intelligence works.

Second I was wrong being so stubbornly pro-tabs / anti-spaces (for example). It doesn't make that much of a difference, so there's no point in being so passionate about it.

And third I was wasting everyone's time (and my persuasion powers) by not choosing my battles more wisely.

My suggestion would be nowadays: let's choose a popular style guide, set up a linter and be done with it.

schneems 5 days ago|||

I learned to love rustfmt but there’s one thing that bothers me: There’s a few times where there are two ways to do something like a one line closure can omit the curly brackets, but multi line closures cannot. Rustfmt prefers to remove those brackets when it can, but I prefer to keep them, which makes editing the code faster since I don’t have a syntax error if I suddenly need a second line.

I can still live with it. And I like the clean, minimal version when I don’t have to edit. Just adding that “style” can have impact beyond how it looks involving ease of editing. And it stinks when your preferences clash with the community.

rs186 5 days ago|||

That is true if a set of good linting rules is set up: those that help discover errors or other code smells which are valid issues in 99% of cases, or pure formatting rules when there is no "correct" thing to do. Linting becomes a problem when it is opinionated and has questionable rationale to begin with, and stands in your way instead of helping you catch issues. Nobody should be fighting linting rules, but sadly that's what often happens.

See my other comment: https://news.ycombinator.com/item?id=45166670

torginus 5 days ago|||

The problem is that tools like ESlint often come with highly opinionated rules that might not even be applicable all of the time (leading to me having to manually turn them off via annotations)

And there's no centralized idea on best practices.

Cthulhu_ 5 days ago|||

ESLint is the centralized idea I suppose, but getting consensus is difficult.

When it comes to formatting, there are other languages (Go, Python?) that have clear, top-down guidelines applied by tooling, at least for code style. I think that's clever, and besides the odd mailing list post trying to change it because of a personal preference, it minimizes discussions about trivialities over the really important things.

Because 2 vs 4 spaces or line length discussions are ultimately futile; those aren't features, individual preferences don't matter. Codebases have millions of lines and thousands of developers; individual opinions do not matter at scale, consistency does.

HelloNurse 5 days ago||||

And best practices depend.

Recently, I discovered that the ruff linter for Python doesn't like the assert statement, because since it does nothing in "optimized" mode it isn't reliable. But such complaints about unit tests are not particularly useful.
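
For anyone who hasn't seen the behaviour behind that rule, here is a minimal sketch. It compiles the same source at two optimization levels using the built-in `compile()`; `optimize=1` is meant to mirror running with `python -O`. The variable names and the `run` helper are made up for illustration.

```python
# Sketch: the same source compiled at two optimization levels.
# optimize=0 keeps assert statements; optimize=1 drops them, like `python -O`.
SOURCE = "assert value > 0, 'value must be positive'\nresult = 'reached'"


def run(optimize_level):
    """Compile and run SOURCE with value=-1 and report what happened."""
    code = compile(SOURCE, "<sketch>", "exec", optimize=optimize_level)
    namespace = {"value": -1}
    try:
        exec(code, namespace)
    except AssertionError as exc:
        return f"AssertionError: {exc}"
    return namespace["result"]


print(run(0))  # the assert fires
print(run(1))  # the assert was compiled away, so execution continues
```

So the warning is about asserts used as runtime validation in production code; in a test suite that is never run in optimized mode, it is mostly noise.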

mr_mitm 5 days ago||

Rule S101 [1] is not in the default settings. If you choose to enable it, you have the possibility of disabling it for your tests like so:

    [tool.ruff.lint.per-file-ignores]
    "tests/*" = ["S101"]

(Besides, this was about formatting, not linting, but I realize it's related.)

[1] https://docs.astral.sh/ruff/rules/

__alexs 5 days ago|||

eslint is slow and has terrible UX. Use Biome instead.

torginus 5 days ago||

Hi, I'm you from the future. Biome is slow and has terrible UX. Use Tokamak instead (/s)

__alexs 5 days ago||

Biome has several advantages that make this future unlikely and evidence from similar attempts in other languages seems to support their direction. Such pessimism is unwarranted.

Cthulhu_ 5 days ago|||

But (at least for a long time), "run the linter automatically" wasn't available, not until Go's gofmt put the idea into people's heads that they could leave it to a tool. I think there were some formatting tools before then, but e.g. jslint/eslint had a lot of gaps which I unfortunately ended up pointing out in code reviews a lot. Which was nitpicking / bikeshedding, in hindsight.

memset 5 days ago||

Interestingly, for over 30 years, C has had “indent” https://www.gnu.org/software/indent/manual/indent.html

psychoslave 5 days ago||

What are the defaults, though, as not everyone seems to agree with GNU coding style?

>First off, I’d suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it’s a great symbolic gesture.

https://www.kernel.org/doc/html/v4.10/process/coding-style.h...

worldsayshi 5 days ago|||

I agree. Linters are one of the more frustrating aspects of modern dev. It's of such little relevance and yet it takes up a sizeable portion of my time when I'm going for a merge. Many editor/language combinations don't give automatic linting out of the box, and when they do I can bet that the rules they infer are different from what the CI pipeline infers.

yoyohello13 5 days ago|||

Same! I have no patience for these kinds of arguments about formatting. I don't care that you don't like what the formatter does, it isn't about you. I've written code in several different languages over the years and the main takeaway is that I can get used to reading anything. It's so important to pick a standard and follow it. As long as that standard is somewhat sane I couldn't care less what the actual standard is.

Another argument that is a pet peeve of mine is significant white-space vs curly braces. It literally doesn't matter. We often get new Python developers coming from a C# background and the amount of bitching about curly braces is so annoying. Just learn the language bro, it's not that hard.

ParetoOptimal 5 days ago|||

> I promise after a week you'll just get used to whatever format your team lands on.

Arthur Whitney formats like this:

    C vt[]="+{~<#,";
    A(*vd[])()={0,plus,from,find,0,rsh,cat},
     (*vm[])()={0,id,size,iota,box,sha,0};

If your code was formatted automatically like that, do you think you'd get used to it after a week?

My point is that there is meaning in how code is formatted, and it affects understanding for certain people.

I think that at a certain point of "reasonable" and for most "normal" people your statements hold true, but I don't want anyone to think that every person caught up on formatting is just doing it for bike-shedding or other trivial reasons.

I don't know what is actionable if what I say is true, but it feels important to say.

fallpeak 5 days ago||

The unreadability of that example has approximately nothing to do with code formatting, which is generally understood to refer to modifying the textual representation of the code while leaving the actual logic more or less unchanged. Can you propose some alternative whitespace or indentation scheme which would make that example significantly more readable?

aequitas 4 days ago|||

It’s very simple: format code to a standard. Preferably the language default formatting. But it must be a standard that can be auto-formatted to with a tool. Now when someone doesn’t like that standard, they can auto-format from that standard to one of their liking for local development and back again to the project standard for pushing to the project. This can even be done automatically with gitattributes during checkout and commit. But without strictly enforcing an auto-formattable standard this is not possible and you end up with bikeshedding.
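
A minimal sketch of that round trip, assuming the only difference between the two styles is leading indentation (four spaces in the project, tabs locally). The function names borrow Git's "smudge" (on checkout) and "clean" (on commit) filter terminology, but nothing here is wired into Git; a real setup would call an actual formatter instead of this toy re-indenter.

```python
PROJECT_INDENT = "    "  # assumed project standard: four spaces
LOCAL_INDENT = "\t"      # assumed personal preference: tabs


def _reindent(text, old, new):
    """Swap leading indentation units line by line, leaving the rest untouched."""
    out = []
    for line in text.splitlines(keepends=True):
        depth = 0
        while line.startswith(old, depth * len(old)):
            depth += 1
        out.append(new * depth + line[depth * len(old):])
    return "".join(out)


def smudge(text):
    """Project style -> local style (what a checkout-time filter would do)."""
    return _reindent(text, PROJECT_INDENT, LOCAL_INDENT)


def clean(text):
    """Local style -> project style (what a commit-time filter would do)."""
    return _reindent(text, LOCAL_INDENT, PROJECT_INDENT)


committed = "def f(x):\n    if x:\n        return 1\n    return 0\n"
local = smudge(committed)

# The whole scheme depends on the round trip being lossless.
assert clean(local) == committed
print(repr(local))
```

The property that matters is the assertion: converting to the local style and back must reproduce the committed text exactly, otherwise every commit carries formatting noise. This toy version would also rewrite leading spaces inside multi-line strings, which is one reason to leave the job to a real formatter.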

duxup 5 days ago|||

I'm in the same boat. I have not run into any situations where someone's choice on formatting was bad enough that I couldn't read the code so ... just pick a format / standard and let's go. I'll get used to it if I'm not already.

anbotero 5 days ago|||

Those that complain:

I've worked with several Development Leads to actually define these. After the initial adjustment period, with everybody's local environment set up properly, no one ever spent time reviewing style and formatting on Pull Requests.

Just decide as a team, auto-apply if possible (less than 5 seconds for big changes), enforce, and be done with it. Stop wasting everybody's time because after weeks you still cannot make up your mind about it and also don't tell your team/Lead about it.

garbagepatch 5 days ago|||

> just make a choice

Now you are bikeshedding. Just go with the defaults.

wartijn_ 5 days ago||

Which defaults? The programming languages I’ve worked with don’t have defaults for everything related to formatting. Editor defaults don’t work, since not everybody uses the same editor. So you have to make a choice somewhere.

swiftcoder 5 days ago|||

A lot of (relatively) recent languages do have defaults. Go and Rust both come with an auto formatter out of the box, and defaults that are sane enough to just run with.

wartijn_ 5 days ago||

Ah yeah, in those cases it is possible to just use the defaults. Come to think of it, I have worked with Deno, which comes with a formatter (and linter and testing library) and I’m a fan. Saves a couple of dependencies, some config files and a bit of mental overhead when creating a new project.

stavros 5 days ago||||

I guess the GP means "use an opinionated formatter", I agree with both of you.

genericspammer 5 days ago|||

Please inform me what the defaults are for Java, C#, C++, C, Bash and Python?

sfn42 5 days ago||

As far as C# goes there's `dotnet format`. You can use it as is or provide an `.editorconfig` file to customize it.

mhh__ 5 days ago|||

Some styles can actively make some people less productive though, e.g. I really try to avoid Allman braces because I can work a lot better with denser code (for a certain definition of dense).

This, however, usually doesn't affect me if the official format for a project is one way or the other because [drumroll] I just format my tree differently and then format to the official style when I push.

Bender 5 days ago|||

I can see why people prefer particular styles so it's easier to read but on that note with Perl it was just perltidy flags. I can run perltidy on any code anyone here writes and it's easy for me to read, then I can pass it back to whomever and they can run perltidy with their favorite flags and it's easy for them to read. It probably doesn't quite work this way with all languages. I would imagine python being less flexible in this regard.

VonGallifrey 4 days ago|||

> just make a choice, run the linter automatically and be done with it.

Most people probably do this. These types of discussions (probably) come up when someone else made the choice and other people also need to adhere to this choice. This is important for teams, but sometimes big egos don't want these choices made for them.

scott_w 5 days ago|||

> It's so obviously bikeshedding

I think you just answered your own question ;-)

patwolf 5 days ago|||

I went through this on a few projects, and what surprised me the most was that some devs have very strong opinions about import ordering. I mostly rely on the IDE to manage imports, and most of the time they're not even visible. We had to add a lot of prettier rules to get import orders just right.

sotix 5 days ago|||

A strong reason I enjoy Rust for collaboration is that it's so opinionated, it forces people to focus on solving real problems. I agree that bikeshedding over ESLint and Prettier configs is not a good use of time.

forrestthewoods 5 days ago|||

I’ll go a step further.

I’ve never understood why people care so much about the linter. Just let people write code and don’t worry about the linter. I don’t need to fight a linter which makes my code worse when I could just write it in a way that doesn’t suck. I promise it’ll be fine. I’m too busy doing actual software engineering to care if code is not perfectly formatted to some arbitrary style specification.

I feel like style linters are horseshoe theory. Use them enough and eventually you wrap back around to just living without them.

ngruhn 5 days ago|||

Why is this flagged? I completely agree.

99% of the time the linter is not enforcing correctness in my experience. It's just enforcing a bunch of subjective aesthetic constraints. Which import order, max number of empty lines between statements, what type of string literal to use, no trailing white space, etc. A non-trivial part of my day is spent dealing with this giant catalog of dinner etiquette. Not all of it is auto-fixable. Also, there are plenty of situations where everyone would agree that violating the rule is necessary (e.g. "no use before define" but you need mutual recursion). Also sometimes rules are circularly in conflict (e.g. you have to change a line but there is no way to do it without violating the max-line-length rule).
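
To make the mutual recursion case concrete, here is a small Python sketch (the function names are just an illustration). Whichever of the two functions is written first has to mention the other before its definition appears in the file, so a strict "no use before define" rule cannot be satisfied by reordering.

```python
def is_even(n):
    # Refers to is_odd, which is defined further down the file.
    return True if n == 0 else is_odd(n - 1)


def is_odd(n):
    # Refers back to is_even, so swapping the two definitions only moves the problem.
    return False if n == 0 else is_even(n - 1)


print(is_even(10), is_odd(7))
```

This runs fine in Python because names inside a function body are looked up when the function is called, not when it is defined.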

MrJohz 5 days ago|||

If your linter is enforcing subjective aesthetic constraints, then I'd argue it's not really a linter but a formatter at that point, and it should be automatically fixing all that stuff for you rather than have you do that manually. Things like import order, empty lines, white space etc can all be fixed automatically in most languages I've worked with.

Linters enforcing rules that need to be broken is a pet peeve of mine, and I agree with you there. Most linters allow for using comments to explicitly exclude certain lines from being linted. This should ~never be necessary. If it is regularly necessary, then either you're programming badly (always a possibility!) or the rule has too many false positives and you should remove it.

komali2 5 days ago|||

Does your linter not have a fix command for things like import order? Does your editor not auto trim white space?

To be frank, everyone I've worked with that complained about the linter didn't know much about their tooling. They didn't know about the fix command (even though I put it in the readme and told them about it), they didn't know how to turn on lintfix and prettier on save, wouldn't switch on git hooks and didn't know their lint failed until GitHub said so, and none of the people like this were so productive that it made up for this trait.

skinner927 5 days ago||||

The point of linters is so the code looks the same regardless of who wrote it. This way it’s easier to read. Some people have horrible style and linters really help.

I find linters make me faster. Sometimes I’m feeling lazy and I just want to pump out a bunch of lines of ugly code with mappings poorly formatted, bad indents, and just have it all synched up when I save.

carlosjobim 5 days ago||

I see no reason to accommodate to worthless programmers who aren't able to read or format the code that's sent to them. They can lint it themselves if they want.

mindwok 5 days ago||

It's not about accommodating people, it's about consistency in codebases when many people with different preferences or styles are working on it. You just eliminate the cognitive overhead so people can develop intuition about how the code works and flows.

carlosjobim 5 days ago||

Good, let huge companies do that. Now this lint madness is pushed onto everybody who wants to program, including individual hobbyists. And many of them are trying to learn coding and get their compiling sabotaged by the linter, not knowing it's something they can opt out of.

pletnes 5 days ago||||

Some linters find issues you care about. Forgotten print statements or confusing indentations come to mind. I’ve worked with people who easily forget, and I’m one of them myself.

tacitusarc 5 days ago||||

That may be true for you, but I have worked with plenty of devs who cannot even be consistent with naming conventions in a function, let alone throughout the application.

Don’t get me wrong: modern linters often annoy me and devs who spend a lot of time fiddling with those settings tend not to be very good programmers. But sometimes having guardrails is necessary.

thfuran 5 days ago||||

Everyone has their own opinion of what format doesn’t suck, so without a consistent code format, you’ll have to review diffs where fights over white space are mixed in with the meaningful change.

maratc 5 days ago||||

Agree.

There's a python linter named `black` and it converts my code:

    important_numbers = {
        "x": 3,
        "y": 42, # Answer to the Ultimate Question!
        "z": 2
    }

into this:

    important_numbers = {"x": 3, "y": 42, "z": 2}  # Answer to the Ultimate Question!

This `black` is non-configurable (because it's "opinionated") and yet, out of some strange cargo cult, people swear by it and try to impose it on everybody.

iainmerrick 5 days ago|||

This is the flip side of "I’ve never understood why people care so much about the linter"!

Why are you caring about formatting? Just write your code, get it working, let Black tidy it up in the standard way. Don't worry about the formatting.

In cases where you're annoyed about some choice the formatter makes, somebody else would be equally annoyed by the choice you would rather make. There is no perfect solution. The whole point is to have a reasonable, plausible default, and to automate it so that nobody has to spend any time thinking about it whatsoever.

Running a standard formatter when code is checked in minimizes the source control churn due to re-formatting. That churn is a pointless waste of time. If you don't run a standard formatter, I guarantee that badly-formatted code will make it into source control, and that's annoying.

maratc 5 days ago|||

I may be unusual in a way I treat my profession and care about my professional output (the code I write), and I take both very seriously.

There's a quote from Steve Jobs (or maybe his carpenter father):

    “When you’re a carpenter making a beautiful chest of drawers, you’re not going to use a piece of plywood on the back, even though it faces the wall and nobody will ever see it. You’ll know it’s there, so you’re going to use a beautiful piece of wood on the back. For you to sleep well at night, the aesthetic, the quality, has to be carried all the way through.”

When you say "Don't worry about the formatting", what you're saying is "use a piece of plywood on the back," and I'm just not going to do that.

iainmerrick 5 days ago||

I don't think we'll ever fully agree, but I'd just like to clarify that I value that kind of craftsmanship too!

I just honestly believe that if you fully automate the formatting, the results are better than if you do it painstakingly by hand; better by virtue of being more consistent. It's using the right tool for the job.

tacitusarc 5 days ago||

Did you read the example pietnas gave? The changed formatting ruined the communicative intent of his code. Formatters do that a lot, and it makes the code unambiguously worse.

I don’t really care about whether the back is plywood or whatever. I don’t know how to write plywood code. I do care about creating clear, readable code that communicates my intent. Sometimes formatters help with that. Often they hinder, as they reflect the arbitrary aesthetic preferences of their creators.

iainmerrick 4 days ago||

I don't see "pietnas" anywhere; do you mean the "important_numbers" example from maratc?

If so, I think a trailing comma is the correct fix, as described here: https://news.ycombinator.com/item?id=45168308

In this case I think the trailing comma is an improvement, so the formatter is steering you towards a better overall solution. However, even if you dislike the trailing comma, it's more important for the formatting to be consistent and robust, so I still think it's better to work within the limitations of the formatter.

thfuran 5 days ago|||

That’s an obviously terrible formatting change. A format that prevents scoping comments narrowly is absurd. Why not just tuck all the inline comments at the end of the file so the code is denser while we’re at it?

iainmerrick 5 days ago||

It works the way you want if you add a trailing comma:

    important_numbers = {
        "x": 3,
        "y": 42,  # Answer to the Ultimate Question!
        "z": 2,
    }

You might complain that that seems a bit obscure, but it only took me 10 or 20 seconds to discover it after pasting the original code snippet into an editor.

The trailing comma is an improvement as it makes the diff clearer on future edits.

Edit to add: occurs to me that I oversimplified my position earlier and it probably looks like I'm trying to have it both ways. I do advocate aiming for clean and clear formatting; I'm just against doing this manually. You should instead use automation, and steer it lightly only when you have to.

For example, I explicitly don't want people to manually "tab-align" columns in their code. It looks nice, sure, but it'll inevitably get messed up in future edits. Better to do something simpler and more robust.

maratc 5 days ago||

The trailing comma communicates an intent of possibly adding more things in the future. I actually use it quite a lot -- when I have that intent.

In the above example, if I think I have listed all of the `important_numbers`, there is a certain point of not having the trailing comma there.

Here's another terrible example from `black`:

From this:

    my_print(f"This string has two parameters, `a` which is equal to {a} and `b` which is equal to {b}", 
        a=1, b=2)

To this:

    my_print(
        f"This string has two parameters, `a` which is equal to {a} and `b` which is equal to {b}",
        a=1,
        b=2,
    )

The trailing comma it added makes no sense whatsoever because I can not have an intent of adding more things -- I've already exhausted the parameters in the string!

On the top of it, I don't quite get why I need to change the way I write in order to please the machine. Who should be serving whom?

Edit: changed "print" to "my_print" to not have to argue about named parameters of print ("sep", "file" etc.).

Edit 2: here's a variant that `black` has no issues with whatsoever. It does not suggest a trailing comma or any other change:

    my_print(f"This string has two params, `a` which is {a} and `b` which is {b}", a=1, b=2)

So an existence of a trailing comma is a product of string length?

lenzm 5 days ago||

Yes, it gets a trailing comma if it's on its own line. That way when you add/remove arguments in a multi-line call it's only a one-line diff. This doesn't apply when the diff is only one line anyway.

Who's to say you don't add a new argument to the function in the future, like

    my_print(
        "This string has two parameters, `a` which is equal to {a} and `b` which is equal to {b}",
        a=1,
        b=2,
        color_negative_red=True,
    )

maratc 5 days ago||

> it gets a trailing comma if it's on its own line.

Sorry but it doesn't make any sense to me. If your argument is "a trailing comma is a good thing," it should go into any and all function calls/list declarations/etc. Who's to say I won't add this in the future:

    my_print("a={a}, b={b}", a=1, b=2, color_negative_red=True)

So do I need to have this now?

    my_print("a={a}, b={b}", a=1, b=2,)

There's a very responsive playground at https://black.vercel.app/ and whatever it does looks strange to me, because the underlying assumptions look inconsistent one with the other (to my eye at least.) Specifically, "the length of the string should decide whether there is a trailing comma or there isn't" makes zero sense.

thfuran 5 days ago||

>Sorry but it doesn't make any sense to me. If your argument is "a trailing comma is a good thing," it should go into any and all function calls/list declarations/etc.

No, the argument is quite specifically that a one line diff to add a new argument/element to the end of a list is preferable to a two line diff to do the same thing. The presence of the trailing comma is necessary to achieve that only when elements are on their own line.
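
A rough way to check this claim with the standard library's `difflib`. The `changed_lines` helper and the sample call are made up for illustration; the counts are the `+`/`-` lines a unified diff reports when one argument is appended to a multi-line call.

```python
import difflib


def changed_lines(before, after):
    """Count added and removed lines in a unified diff, skipping the file headers."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


# Multi-line call WITH a trailing comma: appending an argument adds one line.
with_comma = "my_print(\n    text,\n    a=1,\n    b=2,\n)"
with_comma_new = "my_print(\n    text,\n    a=1,\n    b=2,\n    color_negative_red=True,\n)"

# Same call WITHOUT a trailing comma: the old last line must be edited too.
no_comma = "my_print(\n    text,\n    a=1,\n    b=2\n)"
no_comma_new = "my_print(\n    text,\n    a=1,\n    b=2,\n    color_negative_red=True\n)"

print(changed_lines(with_comma, with_comma_new))  # one added line
print(changed_lines(no_comma, no_comma_new))      # one removed line plus two added lines
```

When the whole call sits on one line, any new argument rewrites that single line either way, which is why the comma buys nothing there.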

maratc 5 days ago||

Ok, we're then back to `print` example:

    print(
        'Hello there from a very long line abcdefghijklmnopqrstuvwxyz',
        sep=' ', 
        end='\n', 
        file=None, 
        flush=False,
    )

All of the existing named parameters to `print()` function are already provided, and that standard function is highly unlikely to change. Should I add another string to `print`, I will have to do it before the named parameters anyway. There is no sense in the trailing comma here however you look at it.

Edit: sorry for using single quotes, in my 20 years of writing Python it was never an issue, but now with `black` it apparently is.

iainmerrick 4 days ago||

I think this boils it down to the essence. Whether you use a trailing comma here, and whether you use single or double quotes, is just bike-shedding. If there's an automated tool that can make a consistent choice everywhere, that's worthwhile.

Revisional_Sin 5 days ago|||

Putting a trailing comma stops that.

maratc 5 days ago||

https://news.ycombinator.com/item?id=45169331

raincole 5 days ago||||

People like you are the exact reason why linter is a thing.

forrestthewoods 5 days ago||

Don’t be rude.

I write perfectly legible code. More legible than a linter in fact. Because the rules for what is ideal are not so simple as to be encoded in simple lint rules. Sure it gets like 95%. But the last 5% is so bad it ruins the positives.

If your goal is “code that is easy to read and understand” then a linter is only maybe the first 20%. Lots of well linted code is thoroughly inscrutable.

raincole 5 days ago|||

> I write perfectly legible code

I 100% believe you. And for god's sake please use linter.

British and American spelling are both 100% legible English. But when multiple people coauthor a book, they should stick to one instead of letting each author use their favorite spelling.

stavros 5 days ago||||

I disagree. It gets you 95%, and do you know how many people are better than that? One in twenty.

I'll gladly pay the price of making the one person's code worse if it improves the other nineteen's.

genericspammer 5 days ago|||

I'm sure you write very readable code, but in most companies, there are a bunch of devs who completely wreck the codebase with unintelligible bullshit. The linter is the first line of defense against these bozos; unfortunately it must be enforced company-wide.

jwilber 5 days ago|||

But what’s the issue? Setting lint rules is one and done - running pre-commit can be made automatic?

deadbabe 5 days ago|||

It’s more of a political thing. Controlling the linter is the first step of kingdom building.

einpoklum 5 days ago|||

If you ride a bike every day, bike sheds are rather important.

If you write and edit and read and search code every day, code formatting is rather important.

genericspammer 5 days ago|||

The point is that in the larger picture there are many more important topics with higher impact to focus on. The company won't make much more money by having consistently formatted code, compared to putting that energy towards new features.

onion2k 5 days ago||||

Consistency is important because it helps you pattern match.

What the pattern is doesn't really matter.

DonHopkins 5 days ago|||

You're missing what the bike shedding metaphor is about. It's not about having bike sheds or not, it's about coloring bike sheds, which every day bike riders in their right mind really don't give a shit about, because it doesn't affect their life in any tangible way.

Moomoomoo309 5 days ago||

No, the original metaphor is they were planning to build a nuclear reactor and they spent significantly more time than expected on the details of the bike shed because it was simple to understand and change, unlike the details of the reactor which were complex and required expertise and had lots of constraints. Who cares what color the bike shed is, we're building a nuclear reactor here!

anonymars 5 days ago||

"Sigh, this guy is pedantically missing the...oh"

Took me a sec, but well played

xpe 5 days ago||

I suggest rephrasing as a series of questions:

1. Assuming at least one person who cares about linter settings isn't utterly confused or moronic, what are their self-described reasons why they care? People's work styles, brains, and even sensory perception differ in some important ways!

2. As freedom-loving developers [1] who want to make our own choices to help our own styles of work, why should we even have to care about "enforcing" one standard for something that isn't really necessary? This one-standard-per-project thing is a downstream result of a design decision upstream (storing source code as plain text).

3. How should we design languages going forward? This brings the conversation back to top-level post (which is why we're here -- to think about what languages could be, not to rehash tired old debates, after all): how can we take what we've learned and build better languages -- perhaps ones where the primary source of truth for source code is not plain text?

[1] Slightly tongue-in-cheek. It is one thing to want to have freedom to do our jobs well, it is another thing to turn this into advocacy for an overarching system such as a political philosophy or various decentralized financial mechanisms and so on. Here, I'm merely referring to the "let me do my job in the way that actually works for my brain" sense.

kelseyfrog 5 days ago||

The tradeoff here is not being able to use a universal set of tooling to interact with source files. Anything but text makes grep, diff, sed, and version control less effective. You end up locked into specialized tools, formats, or IDE extensions, while the Unix philosophy thrives on composability with plain text.

There's a scissor that cuts through the formatting debate: If initial space width was configurable in their editor of choice, would those who prefer tabs have any other arguments?

gr__or 5 days ago||

Text surely is a hill, but I believe it's a local one that we got stuck on due to our short-sighted inability to go into a valley for a few miles until we find the (projectional) mountain.

All of your examples work better for code with structural knowledge:

- grep: symbol search (I use it about 100x as often as a text grep) or https://github.com/ast-grep/ast-grep

- diff: https://semanticdiff.com (and others), i.e.: hide noisy syntax-only changes, attempt to capture moved code. I say attempt, because with projectional programming we could have a more expressive notion of code being moved

- sed: https://npmjs.com/package/@codemod/cli

- version control: I'd look towards languages like Unison to see what funky things we could do here, especially for libraries. A general example: no conflicts due to non-semantic changes (re-orderings, irrelevant whitespaces, etc.)
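
As a small illustration of the grep point, here is a sketch using only Python's standard `ast` module (not any of the tools linked above). The sample source and both helper functions are made up; the idea is that a text search hits comments and string literals, while a structural search only reports real call sites.

```python
import ast

SOURCE = '''
def plus(a, b):
    return a + b

# plus is mentioned in this comment
label = "plus"
total = plus(1, 2)
'''


def text_grep(source, word):
    """Line numbers of every line containing the word, like a plain grep."""
    return [n for n, line in enumerate(source.splitlines(), 1) if word in line]


def call_sites(source, name):
    """Line numbers where `name` is actually called, found by walking the AST."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    ]


print(text_grep(SOURCE, "plus"))   # definition, comment, string and call
print(call_sites(SOURCE, "plus"))  # only the call
```

Note that this still starts from a text file: the structure is recovered by parsing, which is the point the replies below argue about.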

zokier 5 days ago|||

But as the tools you link demonstrate, having "text" as the on-disk format does not preclude AST-based (or even smarter) tools. So there is little benefit in having a non-text format. Ultimately it's all just bytes on disk.
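
A minimal sketch of that idea for Python source, assuming "same code" means "parses to the same AST". It uses the standard `ast` module; `same_structure` is a made-up helper, and the dictionary is borrowed from the Black example elsewhere in the thread.

```python
import ast


def same_structure(left, right):
    """True when two Python sources parse to the same AST (positions are ignored)."""
    return ast.dump(ast.parse(left)) == ast.dump(ast.parse(right))


compact = "important_numbers = {'x': 3, 'y': 42, 'z': 2}"
expanded = '''
important_numbers = {
    "x": 3,
    "y": 42,  # Answer to the Ultimate Question!
    "z": 2,
}
'''
changed = "important_numbers = {'x': 3, 'y': 41, 'z': 2}"

print(same_structure(compact, expanded))  # layout, quotes and trailing comma differ
print(same_structure(compact, changed))   # a value differs
```

One caveat: the parser drops comments, so a tool built this way would treat a deleted comment as "no change" unless it tracks comments separately.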

gr__or 5 days ago|||

Even that is not without its cost. Most of these tools are written in different languages, which all have to maintain their own parsers, which have to keep up with language changes.

And there are abilities we lose completely by making text the source of truth, like a reliable version control for "this function moved to a new file".

theamk 5 days ago||

At least the parsers are optional now - you can still grep, diff, etc.. even if your tools have no idea about language's semantics.

But if you store ASTs, you _have_ to have support for each language in each of the tools (because each language has its own AST). This basically means a major chicken-and-egg problem - a new language won't be compatible with any of the tools, so the adoption will be very low until the editor, diff, sed etc. are all updated... and those tools won't be updated until the language is popular.

And you still don't get any advantages over text! For example, if you really cared about "this function moved to new file" functionality, you could have unique id after each function ("def myfunc{f8fa2bdd}..."), and insert/hide them in your editor. This way the IDE can show nice definition, but grep/git etc.. still work but with extra noise.
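
A toy sketch of how that could work, purely hypothetical: it assumes the `def name{8 hex digits}` tag format from the example above (which is not valid Python, so a real tool would need the language or editor to cooperate), and all three helpers are made-up names.

```python
import re

# Hypothetical on-disk syntax: "def myfunc{f8fa2bdd}(...)".
TAGGED_DEF = re.compile(r"def\s+(\w+)\{([0-9a-f]{8})\}")


def function_ids(text):
    """Map each embedded id to the function name it is attached to."""
    return {fid: name for name, fid in TAGGED_DEF.findall(text)}


def for_display(text):
    """What an editor might show: the same text with the id tags hidden."""
    return TAGGED_DEF.sub(lambda m: f"def {m.group(1)}", text)


def moved_functions(old_files, new_files):
    """Ids whose file changed between two snapshots ({filename: text} dicts)."""
    def locate(files):
        return {fid: fname for fname, text in files.items() for fid in function_ids(text)}

    before, after = locate(old_files), locate(new_files)
    return {
        fid: (before[fid], after[fid])
        for fid in before
        if fid in after and before[fid] != after[fid]
    }


old = {"a.py": "def myfunc{f8fa2bdd}(x):\n    return x\n", "b.py": ""}
new = {"a.py": "", "b.py": "def renamed{f8fa2bdd}(x):\n    return x\n"}

print(for_display(old["a.py"]))
print(moved_functions(old, new))
```

The id survives both the move and the rename, while grep and git still see ordinary text with a little extra noise.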

In fact, I bet that any technology that people claim requires non-readable AST files, can be implemented as text for many extra upsides and no major downsides (with the obvious exception of truly graphical things - naive diffs on auto-generated images, graphs or schematics files are not going to be very useful, no matter what kind of text format is used)

Want to have each person see their own formatting style? Reformat to the person's style on load and format back to project style on save. Modern formatters are so fast, people won't even notice this.

Want fast semantic search? Maintain the binary cache files, but use text as source-of-truth.

Want better diff output? Same deal, parse and cache.

Want to have no files, but instead have function list and edit each one directly, a la Smalltalk? Maintain files transparently with text code - maybe one file per function, or one file per class, or one per project...

The reason people keep source code as text is that it's really a global maximum. The non-text format gives you a modest speedup, but at the expense of imposing incredible version compatibility pain.

gr__or 5 days ago||

The complexity of a parser is orders of magnitude higher than that of an AST schema.

I'm also not saying we can't have all these good things, but they are not free, and the costs are more spread out and thus less obviously noticeable than the ones projectional code imposes.

theamk 5 days ago||

Are you talking about runtime complexity or programming-time complexity?

If the runtime, then I bet almost no one will notice, especially if the appropriate caching is used.

If the programming-time - sure, but it's not like you can avoid parsers altogether. If the parsers are not in the tools, they must be in the IDE. Factor out that parsing logic, and make it a library all the tools can use (or a one-shot LSP server if you are in a language that has hard-to-use bindings).

Note even with AST-in-file approach, you _still_ need the library to read and write that AST, it's not like you can have a shared AST schema for multiple languages. So either way, tools like diff will need to have a wide variety of libraries linked in, one for each language they support. And at that point, there is not much difference between AST reader and code parser.

gr__or 4 days ago||

I meant programming-time, but runtime is also a good point.

Cross-language libraries don't seem to be super common for this. The recovering-sense-from-text tools I named all use different parsers in their respective languages.

Again, reading (and yes, technically that's also parsing) an AST from a file in a data-exchange format is magnitudes simpler. And for parsing these schemes there are battle-tested cross-language solutions, e.g. protobuf.
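To make the "reading an AST from a data-exchange format" point concrete, here is a minimal sketch. It uses JSON from the standard library as a stand-in for protobuf, and the two-node schema (`Num`, `BinOp`) and the `STORED` document are invented for illustration; a real language would need a far larger schema.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: float


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, BinOp]


def load_node(data: dict) -> Node:
    """Rebuild a typed tree from already-decoded JSON data."""
    if data["type"] == "num":
        return Num(data["value"])
    if data["type"] == "binop":
        return BinOp(data["op"], load_node(data["left"]), load_node(data["right"]))
    raise ValueError(f"unknown node type: {data['type']!r}")


def evaluate(node: Node) -> float:
    """Walk the tree; no tokenizer, precedence rules, or error recovery needed."""
    if isinstance(node, Num):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    if node.op == "+":
        return left + right
    if node.op == "*":
        return left * right
    raise ValueError(f"unknown operator: {node.op!r}")


# Hypothetical stored form of the expression 1 + 2 * 3.
STORED = (
    '{"type": "binop", "op": "+", "left": {"type": "num", "value": 1},'
    ' "right": {"type": "binop", "op": "*",'
    ' "left": {"type": "num", "value": 2}, "right": {"type": "num", "value": 3}}}'
)

tree = load_node(json.loads(STORED))
```

The loader is short because the generic decoder does the tokenizing. The per-language part that remains is the schema itself and the `load_node` function, which is the piece every tool would still have to share.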

rafaelmn 5 days ago|||

Why even have a database - let's just keep the data in CSVs, we can grep it easily, it's all bytes on a disk.

gorgoiler 5 days ago||||

I feel it’s important to stick up for the difference between text and code. The two overlap a lot, but not all text is code, even if most code is text.

It’s a really subtle difference but I can’t quite put my finger on why it is important. I think of all the little text files I’ve made over the decades that record information in various different ways where the only real syntax they share is that they use short lines (80 columns) and use line orientation for semantics (lah-dee-dah way of saying lots of lists!)

I have a lot of experience of being firmly ensconced in software engineering environments where the only resources being authored and edited were source code files.

But I’ve also had a lot of experience of the kind of admin / project / clerical work where you make up files as you go along. Teaching in a high school was a great place to practice that kind of thing.

kelseyfrog 4 days ago||||

Thank you for your response. Conveniently, we can use an existing example - Clang's pch files. Could you walk me through using grep, diff, sed, and git on pch? I'd really appreciate it.

jrochkind1 5 days ago||||

So there was an era, as the OP says, where your arguments were popular and believed and it was understood that things would move in this direction.

And yet it didn't, it reversed. I think the fact that "plain text for all source files" actually won in the actual ecosystem wasn't just because too many developers had the wrong idea/short-sightedness -- because in fact most influential people wanted and believed in what you say. It's because there are real factors that make the level of investment required for the other paths unsustainable, at least compared to the text source path.

It's definitely related to the "victory" of unix and unix-style OSs. Which is often understood as the victory of a philosophy of doing it cheaper, easier, simpler, faster, "good enough".

It's also got to do with how often languages and platforms change -- both change within a language/platform and languages/platforms rising and falling. Sometimes I wish this was less quick, I'm definitely a guy who wants to develop real expertise with a system by using it over a long time, and think you can work so much more effectively and productively when you have done such. But the actual speed of change of platforms and languages we see depends on reduced cost of tooling.

gr__or 5 days ago||

For me, that's what "short-sighted inability" means. The business ecosystem we have does not have the attention span for this kind of project. What we need is individuals grouping together against the gradient of incentives (which is hard indeed).

Tooster 5 days ago|||

I’d also add:

* [Difftastic](https://difftastic.wilfred.me.uk/): my go-to diff tool for years
* [Nu shell](https://www.nushell.sh/): a promising idea, but still lacking in design/implementation maturity

What I’d really like to see is a *viable projectional editor* and a broader shift from text-centric to data-centric tools.

The issue is that nearly everything we use today (editors, IDEs, coreutils) is built around text, and there’s no agreed-upon data interchange format. There have been attempts (Unison, JetBrains MCP, Nu shell), but none have gained real traction.

Rare “miracles” like the C++ --> Rust migration show paradigm shifts can happen. But a text → projectional transition would be even bigger. For that to succeed, someone influential would need to offer a *clear, opt-in migration path* where:

* some people stick with text-based tools,
* others move to semantic model editing,
* and both can interoperate in the same codebase.

What would be needed:

* Robust, data-native alternatives to [coreutils](https://wiki.archlinux.org/title/Core_utilities) operating directly on structured data (avoid serialize ↔ parse boundaries). Learn from Nushell’s mistakes, and aim for future-compatible, stable, battle-tested tools.
* A more declarative-first mindset.
* Strong theoretical foundations for the new paradigm.
* Seamless conversion between text-based and semantic models.
* New tools that work with mainstream languages (not niche reinventions), and enforce correctness at construction time (no invalid programs).
* Integration of the semantic model with existing version control systems.
* Shared standards for semantic models across languages/tools (something on the scale of MCP or LSP; JetBrains’ are better, but LSP won thanks to Microsoft’s push).
* Dual compatibility in existing editors/IDEs (e.g. VSCode supporting both text files and semantic models).
* Integrate knowledge across many different projects to distill the best way forward -> for example learn from Roslyn's semantic vs syntax model, look into tree-sitter, check how difftastic does tree diffing, find tree regex engines, learn from S-expressions and LISP-like languages, check Unison, adopt the Helix/vim editing model, see how it can be integrated with LSP and MCP, etc.

This isn’t something you can brute-force — it needs careful planning and design before implementation. The train started on text rails and won’t stop, so the only way forward is to *build an alternative track* and make switching both gradual and worthwhile. Unfortunately it is pretty impossible to do for an entity without enough influence.

zokier 5 days ago||

But almost every editor worth its salt these days has structural editing.

https://docs.helix-editor.com/syntax-aware-motions.html

https://www.masteringemacs.org/article/combobulate-structure...

https://zed.dev/blog/syntax-aware-editing

Etc etc.

Tooster 5 days ago||

And that's a great thing! I look forward to them being more mature and more widely adopted; I have tried both zed and helix, and for day-to-day work they are not there yet, which they will need to be for this stuff to gain traction. Neither of them, however, intends to be a projectional editor as far as I am aware. As for the vims or emacs out there, I don't think they are mainstream tools which can tip the scale. Even now vim is considered a niche, quirky editor with a very high barrier to entry. And still, they operate primarily on text.

Without tools in mainstream editors I don't see how it can push us forward instead of staying a niche barely anyone knows about.

jsharpe 5 days ago|||

Exactly. This idea comes up time and time again, but the cost/benefit just doesn't make sense at all. You're adding an unbelievable amount of complex tooling just to avoid running a simple formatter.

The goal of having every developer viewing the code with their own preferences just isn't that important. On every team I've been on, we just use a standard style guide, enforced by formatter, and while not everyone agrees with every rule, it just doesn't matter. You get used to it.

Arguing and obsessing about code formatting is simply useless bikeshedding.

scubbo 5 days ago|||

I disagree with almost every choice made by the Go language designers, but `Gofmt's style is no one's favorite, yet gofmt is everyone's favorite` is solid. Pick a not-unreasonable standard, enforce it, and move on to more important things.

spyspy 5 days ago||

My only complaint about gofmt is that it’s not even stricter about some things.

duskwuff 5 days ago||

Good news: there are tools like https://github.com/mvdan/gofumpt which fork gofmt and enforce stricter rules (while remaining invariant under gofmt).

rbits 5 days ago||||

Yeah it would probably be a waste of time. It's a nice idea to dream about though. It would be nice to be able to look at some C# code and not have opening curly brackets on a separate line.

mdaniel 5 days ago||

I say this fully cognizant of the thread in which it's posted, but these people are sick

https://astyle.sourceforge.net/astyle.html#_style=whitesmith

And then someone said: oh yeah? Hold my beer https://astyle.sourceforge.net/astyle.html#_style=pico

masklinn 5 days ago||

These are beginner horrors. For me nothing beats the insanity of the gnu style: https://astyle.sourceforge.net/astyle.html#_style=gnu

Buttons840 5 days ago||||

> Arguing and obsessing about code formatting is simply useless bikeshedding.

Unless it's an accessibility issue, and it is an accessibility issue sometimes.

mmastrac 5 days ago||

Maybe if you use 16-wide tabs or a 40 character line length.

raspasov 5 days ago|||

>> The goal of having every developer viewing the code with their own preferences just isn't that important.

Bah! So, what is more important? Is the average convenience of the herd more important? Average of the convenience, even if there was ever such a thing.

What if you really liked reading books in paper format, but were forced to read them on displays for... reasons?

rendaw 5 days ago|||

Grep, diff, sed, and line-based non-semantic merge are all terrible tools for manipulating code... rather than dig ourselves in even further with those, maybe it would be good to come up with something better.

accelbred 5 days ago|||

What if the common intermediate encoding is text, not binary? Then grep/diff/sed all still work.

If we had a formatting tool that operated solely on the AST, checked-in code could be in a canonical form for a given AST. Editors could then parse the AST and display the source with a different formatting of the user's choice, and convert to canonical form when writing the file to disk.
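A rough sketch of "one canonical text form per AST", using Python's own `ast` module as the example language. The `canonical` helper is hypothetical, and `ast.unparse` (Python 3.9+) drops comments, so this shows the idea only and is not a usable formatter.

```python
import ast


def canonical(source: str) -> str:
    """Parse the source and re-emit it in the single layout ast.unparse produces."""
    return ast.unparse(ast.parse(source)) + "\n"


# Two developers, two personal styles, same program.
mine = "def add( a,b ):\n        return a+b\n"
yours = "def add(a, b):\n  return (a + b)\n"

# Both collapse to the same text on save, so grep/diff/sed see one form.
print(canonical(mine) == canonical(yours))
```

The display direction (canonical text to a personal style) would be a second formatter run with per-user settings; it is left out here.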

pmontra 5 days ago|||

It could be nice if all mainstream editors agreed to work on a standard AST for any given language. I'm not expecting that to happen at any time in the future.

About grep and diff working on a textual representation of the AST, it would be like grepping on JavaScript source code when the actual source code is TypeScript or some other more distant language that compiles to JavaScript (does anybody remember CoffeeScript?). We want to see only the source code we typed in.

By the way, add git diff to the list of tools that should work on the AST but show us the real source code.

sublinear 5 days ago|||

Nobody wants to have to run their own formatter rules in reverse in their head just to know what to grep for. That defeats the point of formatting at all.

pwdisswordfishz 5 days ago|||

That's why you grep for a syntactic structure, not undifferentiated text.
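As a small illustration of grepping for a syntactic structure instead of text, here is a sketch built on Python's standard `ast` module. The `find_calls` helper and the sample `SOURCE` are made up, and tree-sitter based tools generalize the same idea across languages.

```python
import ast


def find_calls(source: str, name: str) -> list[int]:
    """Return the line numbers of every call to the bare function `name`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


SOURCE = '''\
result = fetch(
    url,
    timeout=5,
)
note = "fetch(url) appears in this string but is not a call"
other = fetch (url)
'''

print(find_calls(SOURCE, "fetch"))
```

The search ignores how the call is wrapped or spaced and skips the string literal, which a plain `grep 'fetch('` would not.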

michaelmrose 5 days ago||

Which grep doesn't do and you need to either use a new different tool or more likely several for little real benefit

hnlmorg 5 days ago|||

grep is half a century old now.

If we can’t progress our ecosystem because we are reliant on one very specific 50+ year old line parser, then that says more about the inflexibility of the industry to move forward than it does about the “new” ideas being presented.

account42 5 days ago|||

We still use grep because it's useful. And it's useful precisely because it doesn't depend on syntax, so it will work on anything text-based.

hnlmorg 4 days ago||

grep is great. My point isn’t that we shouldn’t use it. My point is that we shouldn’t be held back by it.

komali2 5 days ago|||

The things all being described are way beyond non trivial to solve, and they'd need to be solved for every language.

Grep works great.

hnlmorg 4 days ago||

> The things all being described are way beyond non trivial to solve, and they'd need to be solved for every language.

Except it already is a solved problem.

If languages compile to a common byte code then you just need one tool. You already see examples of this with things like the IR assembly produced by LLVM, various Microsoft languages that compile to CLR, and the different languages that target JVM.

There are also already common ways to create reusable parsing rules like LSP for IDEs and treesitter.

In fact there are already grep-like utilities that are based on treesitter.

So it’s not only very possible to create language-agnostic, reusable tools; these tools already exist and are being used by a great many developers.

The problem raised in the article is that we just don’t push these concepts hard enough these days. Instead relying on outdated concepts of what source code should look like.

> Grep works great

For LF-separated lists it does. But if it worked great for structured content then we wouldn’t be having this conversation to begin with.

jitl 5 days ago|||

comby is fantastic, give it a shot. It’s saved me huge amounts of time.

theamk 5 days ago|||

You'd need all-new tools for the non-text world as well.

So the real choice is either:

- new tool: grep with caching reverse-formatter filter.

- new tool: ast-grep with understanding of AST serialization format for your specific language.

At least in the first case, you still have a fallback.

Avshalom 5 days ago|||

The entire OS was built around these source files.

the unix philosophy on the other hand only "thrives" if every other tool is designed around (and contains code to parse) "plain text"

lmm 5 days ago||

> The entire OS was built around these source files.

And how did that work out for them?

This seems like one of the many cases where unix won out by being a lowest common denominator. Every platform can handle plain text.

account42 5 days ago|||

Not all platforms come with powerful text handling tools out of the box - or at least they didn't used to until Unix-based systems forced them to catch up.

aleph_minus_one 5 days ago|||

> This seems like one of the many cases where unix won out by being a lowest common denominator.

The lowest common denominator rather is binary blobs. :-)

thfuran 5 days ago||

The conversion of which to text and back has historically proven rather fraught.

MyOutfitIsVague 5 days ago|||

The way I envision this working is with something like git filters. Checking out from version control converts it all into text in your preferred formatting, which you then work with as expected. Staging it converts it into the stored representation. In git, this would be done with smudge and clean filters, like how git LFS works. You'd also have viewers for forges and the like that are built to interpret all the stored representations as needed.

You still work with text, the text just isn't the canonical stored representation. You get diffs to resolve only when structure is changed.

You get most of the same benefit with a pre-commit linter hook, though.
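A minimal sketch of such a filter pair, under loud assumptions: the only "formatting preference" handled is tabs versus spaces for leading indentation, the repository stores 4-space indents, and the script name `indent_filter.py` and filter name `indent` are hypothetical. It would be wired up with git's real `filter.<name>.clean` / `filter.<name>.smudge` config keys (for example `git config filter.indent.clean "python indent_filter.py clean"`) plus a `.gitattributes` line such as `*.py filter=indent`.

```python
import sys

INDENT = 4  # canonical indent width stored in the repository (an assumption)


def clean(text: str) -> str:
    """Working tree -> repository: each leading tab becomes INDENT spaces."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.lstrip("\t")
        tabs = len(line) - len(body)
        out.append(" " * (INDENT * tabs) + body)
    return "".join(out)


def smudge(text: str) -> str:
    """Repository -> working tree: each full group of INDENT leading spaces becomes a tab."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.lstrip(" ")
        spaces = len(line) - len(body)
        out.append("\t" * (spaces // INDENT) + " " * (spaces % INDENT) + body)
    return "".join(out)


if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("clean", "smudge"):
    # git pipes the file content through stdin and reads the result from stdout.
    convert = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.write(convert(sys.stdin.read()))
```

A real version would call a full formatter in each direction; the important property is that `clean(smudge(x))` gives back the stored form, so diffs only show up when something other than layout changed.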

zokier 5 days ago|||

The problem is that there is little benefit in not having the canonical stored representation be text. The crucial thing is to have some canonical representation but it might as well be human readable.

bapak 5 days ago||||

This is it; unfortunately git is "too dumb" for this. In order to merge code, it would have to understand the AST.

What happens when you stage the line `} else return {`? git doesn't allow staging specific AST nodes. It would also mean that you can't stage partial code (that produces syntax errors).

zokier 5 days ago|||

Git can use arbitrary merge (and diff) tools. Something like https://mergiraf.org/introduction.html works with git and gets you AST-aware merging. Do not underestimate git's flexibility.

Hendrikto 5 days ago|||

Smudge and clean filters work on text, git would not need to change at all.

You would still store text, and still check out text, just transformed text. You could still check in anything you want, including partial code, syntax errors, or any other arbitrary text. Diffs would work the same way they do now.

account42 5 days ago|||

Please no, git trying to automatically "correct" \n vs \r\n line endings is already horrible enough. At least you can turn that off.

danielheath 5 days ago|||

If you’re going to store the source in a canonical format and unpack that to suit each developer… why shouldn't the canonical format just be regular source code?

All the same tools can exist with a text backend, and you get grep/sed support for free too!

psychoslave 5 days ago|||

That seems like a genius remark actually. If you store the abstract objects and have the mechanism to transform to whatever the desired output form is, it’s almost trivial to expose a version as files and text rendering for tools that are thus oriented, isn’t it?

danielheath 4 days ago||

Originally my father's idea from back in the 90s: to create a language with a whole suite of syntactic representations to suit your preferences.

Want it to look like C? Lisp? Pascal? Why not!

psychoslave 4 days ago||

Do you have more references about what your father did back then?

giveita 5 days ago|||

My grep may not work on your settings for the same code.

This becomes an issue with say CI where maybe I add a gate to check something with grep. But whose format do I assume? My local (that I used to test it locally) or the canonical (which means I need to switch local format to test it)?

brabel 5 days ago|||

You really rely on grep on CI? How fragile is that ?! This is a good argument for storing non-text. Grepping code is laughably unreliable. The only way to write things like that reliably is by actually parsing the code and working in its AST. Working in text is like writing code in a completely untyped language. It can be done, but it’s beyond stupid for anything where accuracy matters.

treadmill 5 days ago|||

You're misunderstanding the idea I think.

You would use the format on disk for the grep. "Your format" only exists displayed in your editor.

giveita 5 days ago||

Aha

eviks 5 days ago|||

> If initial space width was configurable in their editor of choice, would those who prefer tabs have any other arguments?

Yes, of course, because tab width is *dynamically* flexible, so initial space width isn't enough.

pasc1878 5 days ago||

Yes, because if you want to deindent with tabs you just delete one character, whilst spaces require you to delete x characters, where x is the number of spaces you indent by.

eviks 5 days ago||

For "clean-fixed-width" unambiguous indent (eg, at the beginning of lines) you can make delete also delete X=indent_width spaces.

But for "dirty-width" indents, eg, after some text that can vary in size (proportional fonts or some special chars even in fixed fonts) you can't align with spaces while a tab width can be auto-adjusted to match the other line
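The "make delete also delete X=indent_width spaces" behaviour for the clean, beginning-of-line case can be sketched like this. The `backspace` function and its signature are invented for illustration and do not model any particular editor.

```python
def backspace(line: str, cursor: int, indent_width: int = 4) -> tuple[str, int]:
    """Delete backwards from `cursor` and return the new line and cursor.

    Inside leading spaces the deletion snaps back to the previous indent stop;
    anywhere else it removes a single character.
    """
    if cursor == 0:
        return line, 0
    before = line[:cursor]
    if before.strip(" ") == "":
        # Only spaces to the left of the cursor: remove up to one indent step.
        remove = (cursor - 1) % indent_width + 1
    else:
        remove = 1
    return line[: cursor - remove] + line[cursor:], cursor - remove
```

This covers only the fixed-width leading indent. It does nothing for the "dirty-width" alignment case described above, where the column to hit depends on the rendered width of the preceding text.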

aleph_minus_one 5 days ago|||

> Anything but text makes grep, diff, sed, and version control less effective.

Perhaps this is rather a design mistake in how UNIX handles things and is so focused on text.

bee_rider 5 days ago|||

Is it possible to convert from the DIANA IR back to something that looks like source code? Then the result of the backward conversion could be grepped, etc…

teo_zero 5 days ago||

From TFA:

> Everyone had their own pretty-printing settings for viewing [DIANA] however they wanted.

bee_rider 5 days ago||

> Back when he was working on Ada, they didn't store text sources at all — they used an IR called DIANA. Everyone had their own pretty-printing settings for viewing it however they wanted.

I’m still confused because they specifically call the IR DIANA, and they talk about viewing the IR. It isn’t clear to me if the IR is more like a bytecode or something, or more like just the original source code with a little processing done to it. They also have a quote,

> Grady Booch summarizes it well: R1000 was effectively a DIANA machine. We didn't store source code: source code was simply a pretty-printing of the DIANA tree.

So maybe the other visualizations they could do by transforming the IR were so nice that nobody even cared to look at the original Ada that they’d written to generate it?

brabel 5 days ago||

I imagine it’s like storing JVM bytecode, i.e. class files instead of Java files. So when you open it up, the editor decompiles it, like IntelliJ does if you try to open a class file, but then it also applies your own style, like from .editorconfig, to the code it shows. It’s a really good idea and I can’t believe people here are complaining that it’s bad because they can’t use grep! But that’s a good thing!! Who the hell is grepping code as if code had no structure and that’s the best you can do? So you also grep JSON instead of using jq? Just don’t!

cowsandmilk 5 days ago|||

How is diff less effective? I see the diff in the formatting I prefer? With sed, I can project the source into a formatting most convenient for what I’m trying to do with sed. And I have no idea what you’re on about version control. It ruins sending patch files that require a line number around, but most places don’t do that any more.

What I would be curious on is tracing from errors back to the source code. Nearly every language I’ve used prints line number and offset on the line for the error. How that worked in the Diana world would be interesting to learn.

sublinear 5 days ago||

You'd have to run diff and sed before the formatter which is harder for everyone.

peanball 5 days ago||

I can only recommend difftastic[1], which is a language-aware diff. Independent of any linter, it shows the logical diff, not an assortment of characters or lines that changed.

[1]: https://github.com/Wilfred/difftastic
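To show what "logical diff" means at its simplest, here is a sketch that answers a single question: did anything other than layout change? It uses Python's standard `ast` module, the `same_structure` helper is made up, and it is not a description of how difftastic is implemented.

```python
import ast


def same_structure(old: str, new: str) -> bool:
    """True when both versions parse to the same tree, i.e. only layout differs."""
    # ast.dump omits line/column attributes by default, so layout is ignored.
    return ast.dump(ast.parse(old)) == ast.dump(ast.parse(new))


old = "x = compute(a, b)\n"
reformatted = "x = compute(\n    a,\n    b,\n)\n"
changed = "x = compute(b, a)\n"

print(same_structure(old, reformatted))  # layout-only change
print(same_structure(old, changed))      # arguments were swapped
```

A line-based diff reports four changed lines for the first pair and one for the second, which is the opposite of how much actually changed.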

charcircuit 5 days ago|||

In practice how many tools do you really need to handle the custom format? Probably single digits and they could all use a common library to handle the formatting aspect of things.

froh 5 days ago|||

yes, contemporary editors and tools like treesitter have decided this debate in favor of plain text file representation, exactly for the reasons you give: universal accessibility by general purpose tools.

xslt was a DIANA-like pre-parsed representation of dsssl. oh how I miss dsssl (a scheme based sgml transformation language) but no. dsssl was a lisp! with hygienic macros! "yikes" they went and invented XSLT.

the "logic" escapes me to this day.

no. plain text it is. human readable. and grep/sed/diff able.

Ygg2 5 days ago||

> would those who prefer tabs have any other arguments?

Yes. Because YAML exists. And mixing tabs and spaces is horrible in it. And the rules are very finicky.

Optimal tab usage is to emit 2-4 spaces.

davetron5000 5 days ago||

There’s also a typography element to formatting source code. The notion that all code formatting is mere personal preference isn’t true. Formatting code a certain way can help to communicate meaning and structure. This is lost when the minimal tokens are serialized and re-constituted using an automated tool.

https://naildrivin5.com/blog/2013/05/17/source-code-typograp...

Mikhail_Edoshin 5 days ago||

And I'd add that typographers go out of their skin to typeset tables and formulae so that everything is aligned and has proper spacing. For centuries this was done manually because it is important, even though an outsider cannot notice it.

(That said, it must be possible to make a more sophisticated formatter for the source code too.)

frizlab 5 days ago|||

Yes! I’m always appalled that people cannot see that.

anticodon 5 days ago|||

Yes. In Python, black formatter consistently breaks SQLAlchemy queries in an unreadable way (e.g. splitting conditions over multiple lines when it's not really necessary and makes reading harder).

3036e4 5 days ago||

For C++ clang-format does things like that all the time as well. Of course it has no idea what semantically belongs together on the same line or not. I wish the C++ world had settled on some other standard linter.

IshKebab 5 days ago||

clang-format is probably the worst of the autoformatters. They tried to get fancy with a sort of global optimisation algorithm but in practice it's buggier and uglier than the classic Prettier algorithm which is elegant and generally works very well. It's also way less diff friendly.

I wouldn't draw any conclusions about autoformatters from clang-format.

psychoslave 5 days ago|||

Caring for typography but blindly bending to dubious programming-language convention feels really like putting efforts on the wrong starting point though.

What’s the point of such a heavy obfuscation of the intent, really? Let’s take the first example.

    char *
    strcpy(to, from)
            register char *to;
            register const char *from;
    {
            char *save = to;

            for (; (*to = *from) != 0; ++from, ++to);
            return(save);
    }

If we are fine with the "lengthy" register, why not use character in full word? Or if we want something shorter sign would be actually semantically more on point in general.

What with the star to designate a pointer? Why not sign-pointer? Or pin for short if we dare to use a pretty straightforward metaphor, so sign-pin. Ah yes by the way, using "dot" (.) or "dash, greater than" (->) is such typographical nonsense.

And as a side note *char brings nothing in readability compared to sign-pin-pin. Remember that most people read words or even word sequences as a whole. And let’s compare **char to something like sign-pin-back-5.

What with strcpy? Do we want to play code-obfuscation to look smart being able to decode this pile of letter sequence? What’s wrong with string·copy* or even stringcopy (compare photocopy)? Or even simply copy? If we want to avoid some redundant identifier without relying on overriding through argument types, English is rich in synonyms. For example duplicate, replicate, reproduce.

Various parentheses could be just as well optional to ease code browsing if proper typography is already on place, and English already provide many adverb/preposition that could replace/complement them into a linguistically more usual counterparts.

Speaking about prepositions, using from and to as identifiers for things which would be far more aptly described with nouns is really such a confusing choice. What’s wrong with origin/source and destination/target? It’s also a bit counterproductive to put the identifier, which is the main point of interest, at the very end of its declaration statement.

Equal for assignment is just really an artifact of a more relevant symbol like ← or ≔ because most keyboard layouts stem from disastrous design. But using a more adequate symbol is really pushing for unnecessary obscured notation.

Mandatory semicolon to end a statement is obviously also a typographical nonsense.

If a parameter is to be left blank in for, we would obviously be better served with a separate control-flow construction rather than any way to highlight it’s not filled in that employ.

So packing it all:

     duplicate as function ⟨
          requiring (
               origin as sign-pin-register,
               destination as sign-pin-register
          )
          making {
               save as sign-pin
               save assigned origin
               destination-pin assigned origin-pin until ( zeroized,
                    whilst [
                        origin-increment,
                        destination-increment
                    wrought ]
               done )
               return save
          made }
     built ⟩

Given that in that case the parentheses and commas are purely ornamental, the compiler could just ignore them and would have enough information with something like

     duplicate as function
          requiring
               origin as sign-pin-register
               destination as sign-pin-register
          making
               save as sign-pin
               save assigned origin
               destination-pin assigned origin-pin until zeroized
                    whilst
                        origin-increment
                        destination-increment
                    wrought
               done
               return save
          made
     built

Or even

     duplicate as function requiring origin as sign-pin-register destination as sign-pin-register making save as sign-pin save assigned origin destination-pin assigned origin-pin until zeroized whilst origin-increment destination-increment wrought done return save made built

pwdisswordfishz 5 days ago|||

> A C argument declaration is made up of modifiers (register, const), a data type (char *), and a name (from).

Now explain a declaration like "char *argv[]"...

> We’ve also re-set the data type such that there is no space between char and * - the data type of both of these variables is “pointer to char”, so it makes more sense to put the space before the argument name, not in the middle the data type’s name (update: it should be pointed out that this only makes sense for a single declaration. A construct like char* a, b will create a pointer to char, a, and a regular char, b).

Ah, yes, the delusional C++ formatting style. At least it's nice that the update provides the explanation why it should be avoided.

yccs27 5 days ago||

My $0.02: Don't throw away a perfectly good mental model because of a compiler idiosyncrasy. Just treat it as a special case and use a linter against stuff like char* a, b.

You also don't think about dollars differently than other units, just because the sign goes before the number.
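A linter rule against `char* a, b` could look roughly like the sketch below. It is a line-based regex heuristic written for illustration, not a real C parser: the pattern name and the `check` function are hypothetical, and it will misfire on unusual code (for example an expression statement such as `a * b, c;`).

```python
import re

# Matches "<type>* name, other" where the second declarator has no star of its own.
MIXED_POINTER_DECL = re.compile(
    r"^\s*(?:const\s+)?\w+\s*\*\s*\w+\s*,\s*(?!\*)\w+"
)


def check(lines: list[str]) -> list[int]:
    """Return 1-based line numbers that mix pointer and non-pointer declarators."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if MIXED_POINTER_DECL.match(line)
    ]


sample = [
    "char* a, b;",   # a is a pointer, b is a plain char: flagged
    "char *a, *b;",  # both pointers: fine
    "char* c;",      # single declaration: fine
    "int x, y;",     # no pointers: fine
]
print(check(sample))
```

With a rule like this in place, the "pointer to char is the type" reading stays safe for single declarations, which is the special-casing suggested above.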

jauntywundrkind 5 days ago|||

I'm pretty unconvinced by the examples.

> Some of us even align other parts of our code, such repeated inline comments

> Now, the arguments block forms a table of three columns. The modifiers make up the first column, the data types are aligned in the second column, and the names are in the third column

These feel like pretty trivial routines that can be encompassed by code formatting.

We can contrive more extreme examples, like the for loop, but super custom formatting ("typesetting") like that has always made me feel awkward; it feels like it gives license for people to use all manner of arbitrary formatting. The author has some intent, but when you run into an inconsistent code base with lots of things going on, the variance doesn't feel informative or helpful: it sucks and it's a drain.

What's stored is perhaps more minimal, some kind of reference encoding, maybe prettier-ifies for js. The meat of this article to me is that it shouldn't matter: the IDE should let you view and edit as you like:

> Everyone had their own pretty-printing settings for viewing it however they wanted.

IshKebab 5 days ago||

Yeah in theory people can do a better job than auto-formatters. In practice they absolutely do not, so that argument is moot.

xpe 5 days ago||

> Yeah in theory people can do a better job than auto-formatters. In practice they absolutely do not, so that argument is moot.

Status quo fallacy alert. Arguments are not forever mired in a current state of affairs. People can learn and can build tools to help them do better.

This could change quickly; e.g. if Claude or GitHub or (Your Team) decide to prioritize how source code looks.

chowells 5 days ago||

I have to disagree with the premise. Formatting code is a critical communication channel. Well-formatted code should tell you:

1. The developer has enough experience to understand that formatting matters.

2. The developer has enough discipline to stick with their chosen formatting rules.

3. The developer has the taste necessary to choose good formatting rules.

4. The developer has the judgement necessary to identify when other concerns justify one-off violations of the rules.

These are really important attributes for a developer to have. They affect every aspect of the code, not just formatting. Formatting is just a very quick proxy to measure those by.

Unfortunately, things like autoformatting and linter rules are destroying the signal. Goodhart's law strikes again.

babel_ 5 days ago||

The blog entry is short and simple, perhaps consider reading it before knee-jerk reacting to the title, and then you might understand why "should" and "unnecessary" are operative in said title.

chowells 5 days ago||

You've jumped to a fascinatingly false conclusion here. Is this the so-called death of media literacy? I replied to the ideas underlying the post rather than the words in it, and you think that means I didn't read it?

To go through the details: The post explicitly complained about a linter enforcing style rules. It did not object to the presence of mechanically-enforced style rules. In fact, it glorified them implicitly by saying how great it would be if everything was formatted at presentation-time. This glorification is the exact thing I was criticizing.

I think machine-enforced rules are bad because they destroy a communication channel that importantly has point 4 that I listed - when well-formatted code breaks its conventions, there must be a reason for it. That is important information that enforced presentation rules force to be put into another channel.

And it's certainly true that other channels do convey this other information, but I find more value in having it conveyed in the presentation channel than I do in having that channel replaced by mechanistic formatting.

This is the premise underlying the article that I object to. It is present so heavily in the subtext that if you pretend it's not, the post becomes incoherent.

And FWIW, HN rules say not to accuse people of not having read the article. I think that rule is mostly there because someone can read the article and notice something you missed, and it's wiser to not post than it is to assume you absorbed 100% of the context of the post.

babel_ 4 days ago||

The blog post, in its opening section, directly points out:

> Everyone had their own pretty-printing settings for viewing it however they wanted

This is an example of how treating storage and presentation as two separate concerns obviates a large swathe of low-value yet high-friction concerns with current "draw it as you store it" plain text code.

>> It did not object to the presence of mechanically-enforced style rules

Quite the opposite, by my reckoning! I won't belabour the dissonance about "linter enforced" somehow not being "mechanically enforced", since I think that merely belies a different interpretation of those words to have a subtle difference I feel adds nothing to the conversation. Instead, note the prior quote, which is quite literally from the leading section, as pointing out how you don't have "mechanically enforced" rules in such a scheme as the blog suggests. In particular, by letting someone view code "however they wanted", we're not merely talking indentation or casing; we're talking about using code to present the code, potentially in a contextually relevant manner.

This is, in my mind, quite the opposite of mechanically enforcing a set of style rules, since that would result in a fixed, static presentation, akin to merely "what flavour of indentation do you like"... here, we see the idea of contextually presenting the code as per your current needs and wants, for example, to directly craft the "one-off" not as an exception to the norm, i.e. "please turn off for these lines so you don't disrupt the formatting or trip on a bunch of special cases", but rather as a "here's how you should present this specific thing" in a way that is at the heart of this entire endeavour in the first place: programming the logic to get the intended results, now simply reflected back upon the task of programming itself (and for arguably the most important part, reading the code). By establishing these "rules" and patterns, it focuses the task on how to make the code more readable as a direct consequence of considering how to present and format it, with the ability to handle the special cases in that "one-off" manner with simple hard-coded patterns (i.e. "when the code is like this, present it exactly like that"), but of course also accumulating and generalising to handle even more cases, only now able to perform the delicate, "hand-crafted" formatting on code you're only just looking at for the first time, finding that it is now already formatted exactly how you needed it, or can be switched to another contextual mode easily with a quick addition to a set of such places to activate it, or a direct command to present it in such a way regardless.

Likewise, nowhere did the article state that this could not be shared, as people are often wont to do. The blog doesn't even talk about what people are currently getting up to with similar ideas now, with a little more rendering capability than the 1980s could reasonably provide. So, this hardly seems like glorification, even when it discusses not having to waste time debating linter/autoformat settings with one another. Indeed, it holds back from mentioning what can be done with some of the ideas it so casually includes, such as live environments (think about it, the presentation reflecting the current meaning, semantics, values, or state of the code as it runs, or while testing/debugging! that's something we currently either lack in most editors/IDEs, or are relegated to perhaps some basic syntax highlighting changes) or some of the interesting ways some "refactors" are actually entirely superficial and can be reframed as presentational changes since they do not alter the underlying semantics (or literal IR), such as "what order should these variables be declared in?" and other similarly banal or indeed perhaps more serious and useful presentational shifts we could explore with better tools (such as exploring "order-of-operations" sequencing in the "business logic" for edge-cases or to improve clarity, finding equivalent but more intelligible database queries without impacting optimisation, etc) without the need for worrying how entirely superficial changes might need a meeting to decide how to handle merges because two people renamed the same function or its arguments or similar clashes that are completely brittle right now.

The current tooling, particularly the use of linters and similar static analysis for auto-formatting, is based on a compromise with the underlying conceit that the storage medium and the presentation must be mechanically connected with little room for alteration (I still see people claiming syntax highlighting is tantamount to sin, unironically, so the extreme positions here are alive and well, thankfully barring calls to magnetised needles) and that the form is given primacy over the function, syntax over semantics, which continues to bring in pointless discrepancies over what that form/syntax should be, precisely because we can and should disagree since our own needs and tastes are individual, yet are forced to come to some compromise purely for the sake of having some consistent, canonical form that will be presented identically on everyone's screen barring only editor/IDE level differences such as syntax highlighting, themes, fonts, or indentation. Those are perhaps the most superficial changes to presentation and formatting that could be made, yet they are the only ones most code and editors "allow" the user to have control over so they can customise them to their own needs, perhaps even going so far as to quickly switch them up with shortcuts or commands.

Now, with that in mind, we reflect on the blog, and on the use of some canonical storage of code (minified code, IR, or simply language-specific canonical formatting) with the explicitly non-canonical presentation, alleviating the concerns about people disagreeing over how to format a 2D array or something similarly innocuous, since they are all free to format it exactly however they please, either with a more manual "pushing syntax around" approach akin to moving characters/symbols in an editor, or programmatically extending from "pretty printing" into a rich, contextual and dynamic approach, which you as the programmer are free to configure to meet your exact needs.
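To make the storage/presentation split concrete, here is a minimal sketch in Python. It is not what the blog or the Ada tooling did: it assumes the "canonical storage" is simply whatever Python's `ast.unparse` emits, and the only viewer preference modelled is the indentation unit. The helper names `to_canonical` and `present` are made up for illustration.

```python
import ast

def to_canonical(source: str) -> str:
    """Normalize source to one stored form (here: whatever ast.unparse emits)."""
    return ast.unparse(ast.parse(source)) + "\n"

def present(canonical: str, indent: str) -> str:
    """Render the stored form with a viewer's preferred indentation unit."""
    lines = []
    for line in canonical.splitlines():
        body = line.lstrip(" ")
        depth = (len(line) - len(body)) // 4  # ast.unparse indents by 4 spaces
        lines.append(indent * depth + body)
    return "\n".join(lines) + "\n"

stored = to_canonical("def area(w,h):\n  if w<0: return 0\n  return w*h\n")
two_space_view = present(stored, "  ")
tab_view = present(stored, "\t")

# Editing in either view and saving normalizes back to the same stored text.
assert to_canonical(two_space_view) == stored
assert to_canonical(tab_view) == stored
```

Two people can look at `two_space_view` and `tab_view` respectively and the version control system only ever sees `stored`. A richer, contextual presentation layer would need far more than this, but the round trip is the part that removes the style argument.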

Does that sound like a glorification of "mechanically enforced" style rules? Like it's destroying the signal rather than trying to expose and even amplify it? Like there is no room for us humans to craft and refine how something is presented to make it more intelligible and understandable? I hope not. Because, by my reckoning, this blog and the ideas it's discussing are perhaps one of the few directions we could reasonably and understandably start down to resolve the issues you, I, and the blogger are all agreeing on here, and with clear historical precedent to show it's not only achievable, but that it was achievable with only a fraction of the hardware and understanding widely available today. The "subtext" here feels quite contrary to how you are presenting it, though assuming such a "subtext" would indeed make the blog less coherent due to the continued cognitive dissonance of assuming the blog is suggesting "take away autoformatting/linting and then add it back in by a different name", instead of the really quite significant change it's actually suggesting we can do... and, indeed, wouldn't even have to change much these days to achieve it, with canonical or even somewhat minified code being perfectly acceptable for a line-oriented VCS to handle, without needing to figure out a suitable textual representation for the IR or otherwise needing to handle a dedicated binary/wire format.

Oh, and FWIW, to your FWIW, I felt it was the correct way to approach the comment, given that the substance of the blog post was not reflected in a comment that focused entirely on "formatting code" as in the title, in such a way that could be composed wholesale by riffing solely on the title. No direct reference or allusions to specific points in the blog were made, nor anything about what the blog actually suggests that directly supports your comment. Because, FWIW to my own FWIW, I actually agreed with the bulk of your comment, but I also felt the underlying position of said comment was only being presented in this way because you had not read through the blog post, and instead jumped straight in off the title alone, since, again, I felt nothing in the comment connected to the post beyond its title. Formatting is critical, which is why we should not rely on a static, mechanically fixed view of the world/code, and certainly not one decreed by "senior leads of sprints past" (or whatever the authority or popularity we are deferring to is on a given project or Tuesday). "Formatting" as a direct, mechanical act enforced either by a human at a keyboard pushing characters around, or by a linter following a style guide, is something that indeed "should" be "unnecessary", to elevate "formatting" (presentation) and make it a clear and important part of how we prepare code and make it more amenable to reading and understanding for our wetware, rather than convenient for fragile and lazy software. 
Why would we compromise on this now, when it could already be done in the 80s, and rely on static, linter enforced style rules at a time when we have so many cycles to spare on rendering code that we often render it in a web browser for the sake of "portability" (a huge irony given the origins of linters), and need not waste our time arguing over presentation when we could be making the presentations more useful to ourselves without concern for making it less useful for others, and then getting on with the actual task at hand? To me, this blog is all about elevating and prioritising formatting, without stamping on anyone's toes.

Still, to each their own. Oh, but that was kinda the point of the blog...

rho4 5 days ago|||

Not caring about formatting also signals to me that:

- they have probably never worked on a codebase where files are edited by more than 1 person

- they have never done any significant amount of merging between branches

- they have never maintained a large codebase

- they have never had to refactor a large codebase

- they don't use diff/comparison tools to read the history of their codebase

- they have never written any tooling for their codebase

- they are not good team-players and/or only care about their own stuff

pure-orange 5 days ago||

Did you not read the article?

KronisLV 5 days ago|||

If you are in circumstances where the answers to those questions are a resounding "No" then you should just set up the tooling to format the code on save / commit and perhaps to make the CI complain if anyone skips that and leave it at that.

Furthermore, instead of nitpicking over small details, it can actually be a good idea to just leave everything on default, forgo whatever your individual style might be and stick to what's been deemed to be good enough as the default - so the code will look more familiar to anyone who picks it up (and has used the tools you use for linting and formatting). Yes, formatting is different from linting; though if you set up one, you might as well do the other.
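As a rough sketch of the "format on save/commit, let CI complain" setup: the two modes are really the same formatter run with different consequences. The code below is only an illustration in Python; `ast.unparse` stands in for a real formatter's default style, and `check`/`fix` are hypothetical names, not any particular tool's commands.

```python
import ast

def canonical(source: str) -> str:
    # Stand-in for a real formatter running with its default settings.
    return ast.unparse(ast.parse(source)) + "\n"

def check(files: dict[str, str]) -> list[str]:
    """CI mode: report files whose text differs from the formatter's output."""
    return sorted(name for name, text in files.items() if canonical(text) != text)

def fix(files: dict[str, str]) -> dict[str, str]:
    """Save/commit-hook mode: rewrite every file into the default style."""
    return {name: canonical(text) for name, text in files.items()}

repo = {"a.py": "x=1\n", "b.py": "y = 2\n"}
unformatted = check(repo)              # files that would make CI complain
exit_code = 1 if unformatted else 0    # non-zero exit fails the pipeline
assert check(fix(repo)) == []          # after the hook runs, nothing is flagged
```

Because both modes share one `canonical` function with no options, there is nothing left to debate in review: either the file matches the default or the hook was skipped.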

falcor84 5 days ago|||

The same personality attributes can be assessed even better based on penmanship, so going forward, I'll require all PRs to be submitted in cursive

chowells 5 days ago||

You know, my first job during college involved updating construction documents based on changes that were approved by both the contractors and the owners. Penmanship was critical when updating blueprints by hand - which was always a lot cheaper than getting the source documents, revising them, and reprinting them.

In my very limited experience, I learned the importance of penmanship in that profession.

In my much larger experience since, I've learned the irrelevance of penmanship to writing code. I don't practice my blueprint handwriting anymore. It would be wholly unfit-for-purpose without a bunch of practice. But I understand its value in that context.

If I understand the thrust of your comment correctly, you're pointing towards removing formatting as a channel being a net positive, despite the loss of all these indicators. I might almost agree with that, except for my point 4. Sometimes it's better, on the whole, to break conventions. Mechanical formatting systems cannot make these judgement calls.

I think the minor friction of explicit formatting is a net positive. I think the communication channel it adds carries more value than the friction it imposes hurts. (And I'm calling it explicit formatting because it doesn't have to be manual - it just has to be done with intention, judgement, and approval.)

I don't think the massive friction imposed by submitting code as ink on paper provides enough value to be worth its costs, by contrast.

linhns 5 days ago|||

I’d say you go re-read the article.

> The developer has the taste necessary to choose good formatting rules

Rely on this and you’re in trouble. More time will be lost just to argue which style is better. Go with the in-built formatter way of Go and Rust

PaulStatezny 5 days ago|||

You didn't read the blog.

It's talking about the Ada programming language and that its code was apparently stored not as plaintext but an intermediate representation (IR) that could then be transformed back into code.

So formatting was handled by tooling by the nature of the setup. Developers would each have their own custom settings for "pretty printing" the code.

The author isn't saying don't use code formatters. They're highlighting an unusual approach that the industry at large isn't aware of. Instead of getting rid of arguments about code style via formatters, you can get rid of them by saving code in an IR instead of plaintext.

shit_game 5 days ago|||

Would you say that someone's code formatting is a shibboleth? How do you feel about formatters and linters in regards to this?

teaearlgraycold 5 days ago||

There are times when you really want a specific formatting of the text, like visually turning a list into a table.

rho4 5 days ago||

The system should support this, e.g. via // @formatter:off/on tags

teaearlgraycold 5 days ago||

For the stored IR version that means it needs to store raw source code when those directives are used. And then you lose the benefits.

aleph_minus_one 5 days ago||

Some (sometimes) desirable source code formatting cannot be deduced from the abstract syntax tree alone:

Consider the following (pseudo-)code example:

  bar.glob = 1;
  bar.plu.a1 = 21;
  bar.plu.coza = fol;

Should this code be formatted this way? Or should it be formatted

  bar.glob     = 1;
  bar.plu.a1   = 21;
  bar.plu.coza = fol;

to emphasize that three assignments are done?

Or should this code be formatted

  bar.glob      = 1;
  bar.plu .a1   = 21;
  bar.plu .coza = fol;

to make the "depth" of the structure variables more tabular so that you can immediately see by the tabular shape which "depth" a member variable has?

We can go even further like

  bar.glob     =   1;
  bar.plu.a1   =  21;
  bar.plu.coza = fol;

which emphasizes that the author considers it to be very important that the reader can easily grasp the magnitudes of the numbers involved (which is why in Excel or LibreOffice Calc, numbers are right-aligned by default). Or combining this with making the depth "tabular":

  bar.glob      =   1;
  bar.plu .a1   =  21;
  bar.plu .coza = fol;

Each of these formattings emphasizes different aspects of the code that the author wants to emphasize. This information cannot be deduced from some abstract syntax tree alone. Rather, this needs additional information by the programmer in which sense the structure behind the code intended by the programmer is to be "interpreted".
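This is easy to check with a language that ships a parser and an unparser. The sketch below uses Python's `ast` module as a stand-in for "some abstract syntax tree" (the pseudo-code above is not Python, so the semicolons are dropped): the plain and the aligned layout parse to the same tree, and text regenerated from that tree has no way of knowing which layout the author chose.

```python
import ast

plain = "bar.glob = 1\nbar.plu.a1 = 21\nbar.plu.coza = fol\n"
aligned = (
    "bar.glob     =   1\n"
    "bar.plu.a1   =  21\n"
    "bar.plu.coza = fol\n"
)

# Both layouts parse to the same tree (ast.dump omits line/column info by default)...
assert ast.dump(ast.parse(plain)) == ast.dump(ast.parse(aligned))

# ...so regenerating text from the tree cannot bring the columns back.
regenerated = ast.unparse(ast.parse(aligned)) + "\n"
assert regenerated == plain
```

Keeping the alignment would require storing extra information next to the tree, which is exactly the "additional information by the programmer" mentioned above.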

kennywinker 5 days ago||

I see what you’re saying, but I also haven’t ever used anything but the first two formats, and my goal was always readability not emphasis.

Storing the AST instead of the text is a lossy encoding, but would we lose something more valuable than what we gain? If your example is the best thing we’d lose - i’d say it’s still net a massive win.

and there are ways to emphasize different parts, that would survive the roundtrip to AST. E.g. one way to emphasize depth:

    setValue([bar, glob], 1)

    setValue([bar, plu, a1], 21)

or to emphasize the data:

    configure(bar, 1, 21, fol)

Or heck you could allow style overrides if you really wanted to preserve this kind of styling:

    // $formatblk: tabular_keypaths, aligned_assignments

    bar   .glob       = 1

    bar   .plu    .a1 = 21

    // $formatblk-end

Cthulhu_ 5 days ago|||

But "desirable code formatting" is subjective; some people prefer 2, 4 or 8 spaces, some prefer columnar layout like you demonstrated, etc. You can't deduce formatting from an AST alone as an AST is not source code and does not have formatting information.

gentooflux 5 days ago||

The second two lines of your example smell like LoD violations. It's not a formatting problem, it's a structural problem.

aleph_minus_one 5 days ago||

Sometimes you have to use libraries that are badly designed.

gentooflux 5 days ago||

When that happens they're usually badly formatted too.

aleph_minus_one 5 days ago||

Indeed, but this bad formatting should not "spill over" to your own code if possible.

jillesvangurp 5 days ago||

There was a movement towards working with syntax trees directly and treating source code as a generated serialization of those syntax trees about 20-25 years ago. This probably started with refactoring as it was pioneered in the nineties. Things like Visual Age actually stored code in a database instead of on the file system. Later intentional programming (Charles Simonyi was pushing that) tried to also do things with this. And of course model driven development was a thing around the same time.

Refactorings (when done right) are syntax tree transformations that preserve things like referential integrity, etc. that ensure code does the same thing before and after applying a refactoring.

A rename becomes trivial if you are simply working on the symbol directly. For that to work with file based source trees, you need to parse the whole thing, keep track of where symbols are referred in files, rename the symbol and then update all the places in the source tree. That stuff becomes a lot easier when the code representation isn't a bunch of files but the syntax tree. The symbol just gets a different name. Anything that uses the symbol will still use the same symbol.
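A toy model of that last point, in Python. The `Symbol` and `Call` classes are invented for illustration and are not how Visual Age or any real IDE represents code. The assumption is that references in the tree point at the symbol object itself rather than at its spelling.

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str  # the spelling is just an attribute of the symbol

@dataclass
class Call:
    target: Symbol      # a reference to the symbol itself, not to its name
    args: list[str]

def render(call: Call) -> str:
    """Serialize a call node back to text for display."""
    return f"{call.target.name}({', '.join(call.args)})"

total = Symbol("total")
program = [Call(total, ["prices"]), Call(total, ["taxes", "fees"])]

total.name = "grand_total"  # the whole rename: one field on one node
rendered = [render(c) for c in program]
```

Every call site renders with the new name without being touched, and a string literal that merely contains the text "total" could never be hit by accident, because it is not a reference to the symbol.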

People like editing files of course and that has resulted in a lot of friction developing richer tools that don't store text but something that preserves more structure. The fact that we're still going on about formatting issues a quarter century later maybe shows that this is something to revisit. For many languages and editors, robust symbol renames are still somewhat science fiction. And that's just the most basic refactoring.

zokier 5 days ago|

Meh.

> That stuff becomes a lot easier when the code representation isn't a bunch of files but the syntax tree

You are just mixing abstraction layers here. That syntax tree still needs to be stored in file(s) somehow, and nothing prevents having syntax tree aware (or smarter) tooling operating on human readable files. Basically deserializing AST and parsing source code are the same thing. The storage format really isn't that significant a factor here.

So what is needed is better tools rather than fiddling with storage format. Microsoft's Roslyn is an obvious example, but plenty of modern compilers are moving in the direction of exposing APIs to interact with the codebase.

jillesvangurp 5 days ago||

> That syntax tree still needs to be stored in file(s) somehow

Sure, but there are less flaky ways than spreading a syntax tree across files. Visual Age actually used a database for this back in the day. Smalltalk did similar things by storing code in an image file that contained both byte code and method definitions. You could export source code if you wanted. But wouldn't do that while developing typically. That's not an approach that caught on. But it has some advantages.

What you are describing is what Eclipse did with Java. Eclipse was the successor to Visual Age. The Eclipse incremental compiler for Java updated an internal data structure for the IDE. It could do neat things such as partial compilation to enable running tests even in the presence of some compile errors. It also was really fast. By the time you stopped typing, it would have already compiled your code. Running the tests was similarly fast.

The problem of syncing a tree of source files with an AST is just a bit hard. Intellij never came close to this and has always had lots of trouble keeping its internal caches coherent. There's even a top level "invalidate caches" option in the File menu (still there, I checked. Right next to the Repair IDE option). They were off by 2-3 orders of magnitude. Seconds (at best) instead of milliseconds. I still miss Eclipse's speed every day I use Intellij.

Some compilers are taking some steps to supporting more advanced IDEs. But there aren't a lot of those beyond what Jetbrains provides. VS Code support varies between different languages. But mostly it's very limited on this front. The Rust compiler is one of those. Though I don't know the current state of that. Mostly it's not well known for its blazing performance (the compiler). I'm not sure if Jetbrains leverages many of those features in its Rust IDE (I'm not a Rust developer).

crq-yml 5 days ago||

I think the problem can be defined equally as: we can't invest in something more abstract than "plain text" at this time. When we try, it gets downgraded to a plain text projection of the syntax.

The plain text encoding itself exists in a process of incremental, path-dependent development from Morse Code signals to Unicode resulting in a "Gigantic Lookup Table" (GLUT, my coining) approach to symbolic comprehension. The assumption is useful - lots of features can "just work" by knowing that a particular bit pattern is always a particular symbol.

If we push up the abstraction level, we get a different set of symbols that are better suited to the app, but not equivalent GLUT tooling. Instead we usually get parsing of plain text as a transport. For example, CSV parsing. It is sloppy; it is also good enough.

Edit: XML is also a key example. It goes out of its way to respect the text transport approach. There are dedicated XML editors. But people want to edit it as plain text and they can't quite get there because funny-business with character encodings gets in the way, adding a bunch of ampersands and semicolons onto the symbols they want to edit. Thus we have ended up with "the CSV of hypertext documents", Markdown.

efortis 5 days ago||

Projectional Editing can be done with text sources.

Here’s an old video of JetBrains MPS rendering a table from code https://www.youtube.com/watch?v=XolJx4GfMmg&t=63s

I’m hoping for an IDE able to render dictionaries as tables -- my wishlist doesn’t stop there.

Currently, we have a glimpse of those features, such as code folding, inlay hints, or docstrings rendered as HTML:

https://x.com/efortis/status/1922427544470438381

banashark 5 days ago||

Interesting read. I’ve often wondered why the projection we see needs to be the same as the stored artifact. Even something like a git diff should be viewable via a projection of the source IR.
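A small sketch of what "diff via a projection" could look like, using Python's `ast` and `difflib` from the standard library. The projection here is just `ast.unparse` output, which is an assumption made for the example. A real tool would diff the IR or render into each reviewer's preferred style.

```python
import ast
import difflib

def projected_diff(old: str, new: str) -> list[str]:
    """Diff two versions after normalizing both to the same projection."""
    a = ast.unparse(ast.parse(old)).splitlines()
    b = ast.unparse(ast.parse(new)).splitlines()
    return list(difflib.unified_diff(a, b, "old", "new", lineterm=""))

before = "def price(net, tax):\n    return net+tax\n"
reformatted = "def price( net,\n           tax ):\n    return net + tax\n"
changed = "def price(net, tax):\n    return net * (1 + tax)\n"

layout_only = projected_diff(before, reformatted)  # empty: nothing semantic changed
semantic = projected_diff(before, changed)         # shows only the edited return line
```

A reformat-only commit produces no diff at all under this scheme, so the review only ever shows the lines whose meaning changed.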

With things like treesitter and the like, I sometimes daydream about what an efficient and effective HCI for an AST or IR would look like.

Things like F#'s ordered compilation often make code reviews simpler for me, but that’s because a piece of the intermediate form (dependency order) is exposed to me as a first class item. I find it much simpler to reason about compared to small changes in code with more lax ordering requirements, where I often find myself jumping up and down and back and forth in a diff and all the related interfaces and abstract classes and implementations to understand what effect the delta is having on the program as a whole.

rs186 5 days ago|

Ah, eslint-config-airbnb. My favorite airbnb config issues:

https://github.com/airbnb/javascript/issues/1271

https://github.com/airbnb/javascript/issues/1122

I literally spent over an hour when adapting an existing project to use the airbnb config, when code was perfectly correct, clear and maintainable. I ended up disabling those specific rules locally. I never used it in another project. (Looks like the whole project is no longer maintained. Good riddance.)

The airbnb config is, in my view, the perfect example of unnecessarily wasting people's productivity when linting is done badly.
