Self-Documenting Code

Posted by tie-in 10/23/2024

Self-Documenting Code(lackofimagination.org)

72 points | 126 comments

johnfn 10/23/2024|

My cut:

    const passwordRules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];

    async function createUser(user) {
        const isUserValid = validateUserInput(user);
        const isPasswordValid = user.password.length >= 8 && passwordRules.every((rule) => rule.test(user.password));

        if (!isUserValid) {
            throw new Error(ErrorCodes.USER_VALIDATION_FAILED);
        }


        if (!isPasswordValid) {
            throw new Error(ErrorCodes.INVALID_PASSWORD);
        }

        const userExists = await userService.getUserByEmail(user.email);

        if (userExists) {
            throw new Error(ErrorCodes.USER_EXISTS);
        }

        user.password = await hashPassword(user.password);
        return userService.create(user);
    }

1. Don't use a bunch of tiny functions. This makes it harder for future eng to read the code because they have to keep jumping around the file(s) in order to understand control flow. It's much better to introduce a variable with a clear name.

2. Don't use the `a || throw()` structure. That is not idiomatic JS.

2a. Don't introduce `throwError()`. Again, not idiomatic JS.

3. Use an enum-like object for error codes for clarity.

4. If we must use passwordRules, at least extract it into a global constant. (I don't really like it though; it's a bit too clever. What if you want to enforce a password length minimum? Yes, you could hack a regex for that, but it would be hard to read. Much better would be a list of arrow functions, for instance `(password) => password.length > 8`.

5. Use TypeScript!

OptionOfT 10/23/2024||

My issue with this is that you're using exceptions for control flow. A user not being valid is expected (duplicate username). A password not matching a regex is also expected.

Then, in general (not seen here as there are no types), I like to use a lot of types in my code. The incoming user would be of type UnvalidatedUser, whereas the return type of this function would be StoredUser or something like that to distinguish the incoming user type with the outgoing. I like to attach semantics to a type, not to conditions: https://existentialtype.wordpress.com/2011/03/15/boolean-bli...

aleksiy123 10/25/2024|||

I would add one suggestion/comment. Use a known set of standard error codes and not a unique error code/error type for each new situation.

Error codes are there to hint to clients/callers what action they should potentially take (retry?, backoff)

Don't make callers handle 100s of different potential error codes/types.

If the whole internet can work with 70 your app can work with less.

All of google uses less than 20

https://github.com/googleapis/googleapis/blob/master/google/...

Put more specific information in the error message or a secondary status code.

johnfn 10/24/2024||||

That's a great point, I didn't even think of that. I would use error types as well, yes.

jansommer 10/25/2024||||

It's a good point, especially because different return types is well supported in TS.

I made the same in plpgsql recently and opted for returning an implicit union of UserSession | Error, by returning UserSession in the signature and raising errors in the function. The alternative was to return json where you'd have to look at the body of the function to figure out what it returns (when successful), as opposed to the signature.

I'm not sure if I'm striking the right balance. Yes, the signature is "self-documenting" - until you hit an error!

rerdavies 10/24/2024|||

My issue with that is that absolutely NOTHING will ever convince me that returning error codes is a better idea than throwing exceptions. And that you seem to be using 'expected' in some weird cargo-culty sense of the word. An invalid user name is an error, not an expected case.

yellowapple 10/25/2024|||

> An invalid user name is an error, not an expected case.

If you ain't expecting users to input bogus data, then you're putting way too much trust in said users.

Put simply: is it a bug in your own code if a user tries to use an invalid username? If yes, then throw an exception. If no, then return an error code. Exceptions represent programmer error; error codes represent user error.

swatcoder 10/25/2024||

> Exceptions represent programmer error; error codes represent user error.

No, they represent whatever suits the system design and the house style.

They're just tools to pass data and manipulate control flow.

If you become dogmatic and start insisting on what they must absolutely represent, you're only going to find yourself compromising design coherence and futily railing at colleagues when different representations are found to be appropriate.

With exceptions, it often makes sense to use them when failures inevitably just propogate up through a stack of nested calls or may appear anywhere in a sequence of calls. In these cases, return codes can grossly interfere with legibility and can lead to confusion when interim handlers squelch errors that really should have propogates up directly.

These are the kind of choices that are completely reasonable to make within a specific module (like an auth and accounts module), and you can always return to some other default policy at the interface of that module if your project/house-style needs such.

Engineering shouldn't feel so dogmatic as you suggest. It's a strong smell when it does.

nip 10/25/2024|||

I’ll try:

Returning a “Result” with union discrimination on the “success” is far superior to throwing in Typescript in my experience

   { success: false, reason: “user_already_exists” } | { success: true, data: { user_id: string }

- By design you are forced to “deal” with the sad path: in order to get to the data, you must handle the error state

- Throwing is not type-safe: you cannot know at build time whether you are handling all thrown errors

jagged-chisel 10/23/2024|||

“ Don't use a bunch of tiny functions. This makes it harder for future eng to read the code …”

This is where the naming things bit comes in. You name the function correctly, then when the body is read to understand that it works as named, you can remove that cognitive complexity from your brain and continue on. Once you’ve built trust in the codebase that things do what they claim, you can start getting a top-down view of what the code does.

That is the power of proper abstraction.

johnfn 10/24/2024|||

I don't know. I really don't see any clarity improvements between, `user.password.length >= 8 && passwordRules.every((rule) => rule.test(user.password))` and `validatePassword(password)`. What if you want to add that the password must contain one special character? You don't actually know, by reading the name "validatePassword", if that work has already been done or not. You need to go read the function definition and check. And so even for such a small function, there is no name that you can choose to truly do what you claim and "remove the cognitive complexity from your brain".

Once a function gets to a certain threshold of size and reuse, I tend to agree with you. But I think that threshold is quite large - like at least 40-50 lines, or reused at least 3 times.

jagged-chisel 10/24/2024|||

`validatePassword` is definitely clearer. When I'm getting familiar with a new codebase, I don't need to know how each thing happens, I need to know what happens. In fact, in your example, you've already abstracted testing against the rules. (AAMOF, the password length requirement should be another rule...)

Also with that example, that is about the limit I would tolerate in a conditional that decides the next step in this app. I just need to know "if the password is valid, go to $SCREEN; if not, turn on these messages." I don't care about a string of expressions, I care about what it's doing and I don't want to parse the code every time to figure out "oh, this is just password validation."

This is different from verifying that features are implemented - now I need to know the minutia.

strken 10/24/2024||

It is definitely not clearer to me. I have no idea what happens in the `validatePassword` function. Does it make a synchronous network call that will stop the world for half a second? Does it throw an exception, or return an error object? I will also have to search the rest of the code to see who else calls it, and potentially refactor those callers as well.

Any smaller function broken out of a larger function is (slightly) confusing. I find that a genuinely large function is much more confusing than two functions half the size, but a petite little function less than 20 lines long is not confusing to begin with.

hinkley 10/24/2024||

The business rules for passwords and usernames are separate. It's okay for them to be separate methods.

You also know that the 'valid password' function is going to list the rules for a valid password. If you get a task to change the password creation rules, do you honestly expect people other than you to remember that code is in the createUser function and not the valid password function??

I don't think you're being honest with yourself about this scenario. Or if you are, this is going to cause you a lot of trouble over time.

strken 10/25/2024||

> If you get a task to change the password creation rules, do you honestly expect people other than you to remember that code is in the createUser function and not the valid password function??

I don't expect anyone to remember where the code is, regardless of which function it's in. I don't even expect them have been aware of every function to begin with.

How do you expect people will find the `isPasswordValid` function? Because I know from experience that the way I would find it would likely be by looking at the `createUser` function to see what it does. I might do a case-insensitive grep for the word password first, but even then I'd just have to go look at `createUser` anyway, otherwise I wouldn't know whether `isValidPassword` was dead code or referring to a different password.

lurking_swe 10/25/2024||

how do you unit test this function that’s handling many responsibilities?

strken 10/25/2024||

Look at it. It's under 20 lines long and handles one thing, which is creating a user. You unit test it by passing a user with an invalid password. You might possibly decide to put the validation of user input into one function and avoid mocking userService, but I don't see a good justification for splitting password and other user fields into separate functions.

I accept that if there was more logic then you might want to split it out, but seriously, it's tiny. The single responsibility principle doesn't justify making your production code much more difficult to read just so the unit tests around it are marginally easier to write.

LorenPechtel 10/24/2024|||

I used to feel that way, but over time I've come to realize that for the vast majority of code pulling stuff out so you have only one level of nesting and only one pass/fail decision makes it easier to understand years later. I'm sure the compiler inlines a lot of stuff I pulled out.

The exceptions to this are situations where you need to do a bunch of sequential steps and cases where you are looping on stuff in higher dimensions.

I have gotten to the point that I consider a comment about the code to be a sign of trouble. (Comments about what the code is interacting with are a different matter.)

dietr1ch 10/24/2024||||

Abstraction is way better, I don't really want to know how the password is validated unless I know I'm facing issues with validation (which proper logging tells you about before you even dive into the code).

I don't understand why some people prefer being swarmed with details. It's not that they want details, but that they just hate navigating files (layer 8 + tooling problem) or that they "need" to know the details because not knowing them haunts them at night somehow. Also, not having that as a free function makes me think it's not tested at all (although there might be some integration test that hopefully catch all errors at once, but I'm sure they don't either)

hinkley 10/24/2024||

It's not abstraction it's separation of concerns. And conflating the two is part of why GP is getting wrapped around the axle here.

hinkley 10/24/2024|||

He even knows what the missing function should be called:

> isPasswordValid

TrianguloY 10/23/2024|||

My only nitpick is that the const isPasswordValid = ... should be just before its use (between the first two ifs). Other than that, I prefer this approach (although I would inline the booleans in the ifs to avoid the one-use variables. But that's ok).

> Don't use a bunch of tiny functions

Exactly this. I only do that when the function is used in more than 10 places and it provides some extra clarity (like something as clamp(minVal,val,maxVal){return max(minVal,min(val,maxVal))} if your language doesn't already have it, of course).

I also apply that to variables though, everything that is only used once is inlined unless it really helps (when you create a variable, you need to remember it in case it is used afterwards, which for me is a hard task)

xigoi 10/23/2024||

> My only nitpick is that the const isPasswordValid = ... should be just before its use (between the first two ifs).

Wouldn’t that cause the regexes to be recompiled every time you call the function?

TrianguloY 10/23/2024||

I don't think so, if it does it will now too. In fact with my suggestion the regex checks will not run if the user is not valid (as it is now it will always run)

    const isUserValid = ... 
    if(!isUserValid) ...

    const isPasswordValid = ...
    if(!isPasswordValid) ...

Etc

rjbwork 10/24/2024|||

This is one of my pet peeves. I had an engineer recently wrap Dapper up in a bunch of functions. Like, the whole point of dapper to me is that it gets out of your way and lets you write very simple, declarative SQL database interactions and mapping. When you start wrapping it up in a bunch of function calls, it becomes opaque.

DRY has been taken as far too much gospel.

skydhash 10/25/2024||

I always prefer having loadBooks than “select * from books” everywhere. I prefer my code to be just the language version of how I would write an explanation if you ask me one specific question. Not for DRY, but for quick scanning and only diving into details when needed

rjbwork 10/26/2024||

Obviously there's a method called LoadBooks. The implementation of LoadBooks is what I'm talking about.

dclowd9901 10/24/2024|||

Yep, I’d hire you, and not OOP. There was some really questionable code structuring in there and I always wince at abstraction for abstraction’s sake. It always becomes a hindrance when something goes wrong, which is exactly when you don’t want to have a hard time deciphering code.

eru 10/25/2024|||

> 1. Don't use a bunch of tiny functions. This makes it harder for future eng to read the code because they have to keep jumping around the file(s) in order to understand control flow. It's much better to introduce a variable with a clear name.

If you have nested functions, that's not a problem.

Btw, why do you use regular expressions for some rules, but not for others? Regular expressions are perfectly capable of expressing the length requirement.

ilrwbwrkhv 10/25/2024||

Something about Javascript and Typescript is really ugly to my eyes. I think it is the large keywords at the beginning of every line. Makes it hard to parse and read. I find C++ style much better to read.

alilleybrinker 10/23/2024||

There is no such thing as universally self-documenting code, because self-documentation relies on an assumption of an audience — what that audience knows, what patterns are comfortable for them — that does not exist in general.

Self-documenting code can work in a single team, particularly a small team with strong norms and shared knowledge. Over time as that team drifts, the shared knowledge will weaken, and the "self-documenting" code will no longer be self-documenting to the new team members.

simonw 10/23/2024||

I don't find this easier to read:

    !(await userService.getUserByEmail(user.email)) || throwError(err.userExists);

I guess if I worked in a codebase that used that pattern consistently I'd get used to it pretty quickly, but if I dropped into a new codebase that I didn't work on often I'd take a little bit longer to figure out what was going on.

mega_dean 10/23/2024||

After that step, they say "The resulting code is shorter and has no nested logic." The resulting code has the same logic as before, it's just not visually represented as being nested. I've seen the same argument ("nesting is bad so indentation is a code smell") used to say that it's better to use early returns and omit the `else` block, eg:

    if (some_condition) {
      // do stuff here
      return;
    }
    // do other stuff here

is "better" than:

    if (some_condition) {
      // do stuff here
    } else {
      // do other stuff here
    }

If you have very-deeply nested code then it usually becomes easier to work with after splitting it up into smaller pieces. But IMO rewriting code like this to save a single level of indentation is bikeshedding.

jonathanlydall 10/23/2024|||

I would say (as all good technical people know) it depends.

I have come to appreciate the style of early returns rather than else statements as I have found over the years it generally makes the code easier for me to follow when I’m looking at it possibly years later.

It really depends on the particular condition, but sometimes it just reads better to me to not use the else, and this is because as a style I tend to try have “fail conditions” cause an early return with a success being at the end of the method. But again there are regularly exceptions where trying to do this “just because” would contort the code, so returning an early success result happens often enough.

I have however found that sometimes ReSharper’s “avoid nesting” suggestion (particularly in examples like yours) results in less clear code, but it’s almost always at least not worse and maybe slightly better for the sake of consistency.

EDIT: Having thought about this more, here is why I find early returns generally easier to read than else statements.

With an early return the code is generally more linear to read as when I get to the end of the if block I can instantly see there is nothing else of relevance in the method, I save myself having to needlessly scan for the end of the else block, or even worse, past more code blocks only to find that the rest of the method’s code is irrelevant.

Again, not a hard rule, but a consistent style in a code base also generally makes it easier to read.

fenomas 10/25/2024|||

It definitely depends, but personally I find early returns to be a bit of an antipattern IF they're based on business logic. If a function has lots of ifs, you can glance at the nesting to see which ones affect the line you want to edit, and ignore the others. But if the function has lots of returns, you have to check every one before a given line in order to know what constraints are true at that point.

OTOH early returns are great for anything the type checker knows about:

    function foo(msg: 'OK' | 'ERR') {
        // ...
        if (msg === 'ERR') return someValue
        // ...
    }

Doing that is hugely cleaner than branching, and there's no added complexity to the developer since tooling can easily tell you what values `msg` can have at any given point in the function.

kazinator 10/25/2024|||

> you can glance at the nesting to see which ones affect the line you want to edit, and ignore the others.

Only if all the cases return! Only then is it obvious that you have independent cases. E.g. suppose we have three Boolean inputs x, y, z and want to do something for each binary combination:

  if (x) {
    if (y) {
      if (z) {
         return 7;
      } else {
         return 6;
      }
    } else { // x && !y
      if (z) {
         return 5;
      } else {
         return 4;
      }
    }
  } else { // !x
    if (y) { // !x && y
      if (z) {
         return 3;
      } else {
         return 2;
      }
    } else { // !x && !y
      if (z) {
         return 1;
      } else {
         return 0;
      }
    }    
  }

How would this look with early returns? One obvious way:

  if (x && y && z)
    return 7;
  if (x && y && !z)
    return 6;
  if (x && !y && z)
    return 5;
  // ... etc

(Let's ignore that we can just calculate the output with some bit twiddling; that's not the point).

Early return can be very clear, if we can bear repeatedly testing some conditions.

fenomas 10/25/2024||

> Only if all the cases return!

My comment was about using if blocks as opposed to early returns. I.e. where the nested ifs run exhaustively and return afterwards.

Also, obviously deep nested ifs aren't good, so I wasn't advocating them - I just think it's better to fix them by splitting functions or simplifying control flow, than by adding early returns.

kazinator 10/25/2024||

So it is more about multiple returns, versus setting some local return variable and returning in one place.

However, setting that return variable can be recognized as a simulated return. If we know that after "ret = 42", there are no other state changes; that the whole if/else mess will drop out to the end where there is a "return ret", then we can just read it as "return 42".

fenomas 10/25/2024||

Sure, in the narrow case where the function only calculates a single return value and has no side effects.

kazinator 10/25/2024||

Or where it produces an effect (or effect group) in every case just before returning, without multiple effects interspersed among multiple condition tests.

fenomas 10/26/2024||

That's isomorphic to what I said, so... also yes :D

jonathanlydall 10/25/2024|||

Calling it an anti pattern (as opposed to a subjective preference) is in my opinion super dangerous and reeks of cargo-culting as it implies an active avoidance of it which can result in deep nesting or contorted and hard to read logic.

There is no hard rule about early returns being always good or always bad, it depends on the particular situation.

fenomas 10/25/2024|||

> > It definitely depends, but personally I find..

> ..(as opposed to a subjective preference) is in my opinion super dangerous and reeks of cargo-culting..

o_O

aidos 10/24/2024||||

I call them guard conditions and it’s my preferred pattern too. Deal with all the exceptional cases at the top and then move on to the happy path as a linear flow.

hollerith 10/24/2024||

And at the start of the happy path, I tend to put "Do it" as a comment (a convention I learned from reading Emacs Lisp code).

kazinator 10/25/2024|||

Early returns are easy to read because although return is a staunchly imperative construct (a form of "go to"), early returns are structured such that they simulate a multi-case conditional from a functional language.

You know that each early return completely handles its respective case; if that branch is taken, that's it; the function has ended. There is only way to reach the code past the if/return, which is that the condition has to fail.

The conditionals inside a function that has a single return are harder to read, because no conditional is necessarily final.

if/elses that all return can be readable:

  if (this) {
    if (that) {
      return x;
    } else {
      return y;
    }
  } else {
    return z;
  }

still, it can be flattened:

  if (!this)
    return z;
  if (that)
    return x;
  return y;

It's shorter and less nested, and so that's a readability improvement. It's not as easy to see that x is returned when both this and that hold. The intermediate version below helps with that:

  if (this) {
    if (that)
      return x;
    return y;
  }

  return z;

If the conditions are cheaply tested variables, or simple expressions easily optimized by the compiler, there is also:

  if (this && that)
    return x;
  if (this)
    return y;
  return z;

jonhohle 10/23/2024|||

I usually find the opposite (personally). Get rid of all the exceptional cases and error handling up front like you have in the first example and then spend the remaining body of the function doing the main work of that function.

It’s not so much indentation that’s an issue, but coupling control flow with errors and exceptions.

Swift does a nice job with `guard` statements that basically bake this in at the language level - a condition succeeds or you must return or throw.

If that control flow is part of business logic, I don’t think there’s any issue with your second example. That’s what it’s there for.

RaftPeople 10/23/2024|||

> I don't find this easier to read:

I agree. The previous iteration shown is simpler IMO.

I've really shifted how I code to making things just plain simple to look at and understand.

f1yght 10/23/2024||

That's the way it should be, easy to understand. This set up might be short but it's complex to read.

amonith 10/23/2024|||

It could be "dangerous" even sometimes if you're not paying attention. In JS/TS "||" operator evaluates the right side when the left side is "falsy". "Falsy" doesn't mean only null/undefined, but also "", 0, NaN, and... well... false. So if you make a method like "isUserActive" or "getAccountBalance" and do a throw like that, you'll get an error for valid use cases.

jay_kyburz 10/23/2024||

Whats more, the isUserValid function can't return a more detailed error about what was not valid about the User. It can only return falsy.

LorenPechtel 10/24/2024||

Agreed. I do not like that line at all. I might take that approach but if I did it would be a separate IsDataValid function that checked things, one condition per line. (Might be a string of ||s, but never run together like that.) As much as possible I want one line to do one thing only.

Chris_Newton 10/25/2024||

If I were reviewing the original code, the first thing I’d question is the line

    user.password = await hashPassword(user.password);

1. As a rule, mutations are harder to understand than giving new names to newly defined values.

2. The mutation here apparently modifies an object passed into the function, which is a side effect that callers might not expect after the function returns.

3. The mutation here apparently changes whether user.password holds a safe hashed password or a dangerous plain text password, which are bad values to risk mixing up later.

4. It’s not immediately obvious why hashing a password should be an asynchronous operation, but there’s nothing here to tell the reader why we need to await its result.

At least three of those problems could trivially be avoided by naming the result hashedPassword and, ideally, using TypeScript to ensure that mixing up plain text and hashed passwords generates a type error at build time.

I do agree with many of the other comments here as well. However, I think the above is more serious, because it actually risks the program behaving incorrectly in various ways. Questions like whether to use guard clauses or extract the password check into its own function are more subjective, as long as the code is written clearly and correctly whichever choices are made.

tetha 10/25/2024||

> At least three of those problems could trivially be avoided by naming the result hashedPassword and, ideally, using TypeScript to ensure that mixing up plain text and hashed passwords generates a type error at build time.

Going that path further ends up what a few code bases I've worked with do: Pull the two domains apart into a "UserBeingCreated" and an existing "User".

This felt a bit weird at first, but the more I think about it, the more sense it makes. One point leaning towards this: You are dealing with different trust levels. One is a registered and hopefully somewhat validated user, which can be trusted a bit. The other thing could just be a drive by registration attempt.

And you're dealing with different properties. Sure, there is some overlap - username, mail, firstname, lastname. But only a UserBeingCreated needs validation errors or a clear text password. Other things - like groups, roles and other domain properties only make sense after the user is properly registered.

Chris_Newton 10/25/2024||

I’ve had this debate a few times too. Personally I am in the camp that says you’re talking about two interfaces — your external UI or API, and your internal database schema — so even though you’ll often have a lot of overlap between types representing analogous entities in those two interfaces, they aren’t really the same concept and coding as if they will or should always have identical representations is a trap. I would almost always prefer to define distinct types and explicit conversion between them, even though it’s somewhat more verbose, and the password hashing here is a good example of why.

I wrote a more about this in a Reddit post a while back if anyone’s interested: https://www.reddit.com/r/Python/comments/16w97i6/flask_300_r...

jcparkyn 10/25/2024||

Agreed, and then there's the time of check/time of use issue with creating a user. Probably not a vulnerability if userService is designed well, but still a bit dubious.

Chris_Newton 10/25/2024||

You’re right, that’s potentially a correctness issue as well. Ideally we’d have a creation interface that would also perform the pre-existence check atomically, so there would be no need for the separate check in advance and the potential race condition would not exist. This does depend on the user service providing a convenient interface like that, though, and alas we aren’t always that lucky.

cjfd 10/23/2024||

Typescript looks much, much better than what he ends up with. The typescript is more or less the same thing but with comment tokens removed. How is just removing the comment tokens not an obvious improvement in readability?

Honestly, I think all of jsdoc, pydoc, javadoc, doxygen is stuff that most code should not use. The only code that should use these is code for libraries and for functions that are used by hundreds or thousands of other people. And then we also need to notice that these docs in comments are not sufficient for documentation either. When a function is not used by hundreds or thousands of people, just write a conventional comment or perhaps not write a comment at all if the function is quite straightforward. Documentation that explains the big picture is much more important but that is actually somewhat hard to write compared to sprinkling jsdoc, pydoc, javadoc or doxygen worthless shit all over the place.

joecarrot 10/23/2024||

If one of my developers used "||" that way I would definitely throw some side eye

JellyBeanThief 10/23/2024||

I was thinking exactly the same. You can write

    if (cond) { cons }

on one line and get more readable code admittedly a few chars longer.

65 10/23/2024|||

Don't even need the curly braces. I do

    if (cond) doSomething();

all the time.

gnarlouse 10/23/2024||

if (heathen) dontUseCurlies();

Groxx 10/23/2024||

    if (dontUseCurlies):
      goto fail
      goto fail

gnarlouse 10/23/2024||

go to jail

alilleybrinker 10/23/2024|||

Code patterns are social! What is strange to one is normal to another.

The kind of pattern used here with the `||` might seem weird to some JavaScript developers, but it's pretty normal in shell scripts, and it's pretty normal in Ruby with `unless`!

mmastrac 10/23/2024|||

`||` is conventional in bash, but definitely not in JS. `x || value` is closer to conventional JS, but using it for control flow is certainly not common and a stylistic choice, for sure.

shiroiushi 10/25/2024|||

>The kind of pattern used here with the `||` might seem weird to some JavaScript developers, but it's pretty normal in shell scripts

Shell scripts are NOT known for being easy to read. They're full of obscure and sometimes frankly bizarre, arcane syntax that newcomers would have no idea about. Quick, what does "$#" mean? An experienced bash programmer would know, but no one else would. Shell scripts were never meant to be easy to read; they're just an extension of the shell syntax, and of course vary a lot from shell to shell (e.g. bash vs zsh vs ksh etc.).

MetaWhirledPeas 10/23/2024|||

I must be their target audience because as soon as they used the example with || it all started making sense.

This would have been fine too but it would trigger some people not to use {}

    if (!validateUserInput(user)) throwError(err.userValidationFailed);

My preferred style might be closer to this.

    if (!userInputIsValid(user)) throwError(err.userValidationFailed);

gnarlouse 10/23/2024||

If one of my developers threw in a throw like that I would throw up in their mouth.

dvt 10/23/2024||

The writer here misunderstands how short-circuit evaluation is supposed to be used. The idea is that you should use SCE in a few, pretty standard, cases:

    cheapFunction(...) || expensiveFunction(...) // saves us a few cylces
    car = car || "bmw" // setting default values, common pattern
    funcA(...) && funcB_WhichMightBreakWithoutFuncA(...) // func A implies func B
    ...
    // probably a few other cases I don't remember

Using it to handle control flow (e.g. throwing exceptions, as a makeshift if-then, etc.) is a recipe for disaster.

trealira 10/23/2024||

Short-circuiting evaluation is also useful for things like this:

  function insertion_sort(a) {
      for (let i = 1; i < a.length; i++) {
          let key = a[i];
          let j = i;
          while (j > 0 && key < a[j-1]) {
              a[j] = a[j-1];
              j--;
          }
          a[j] = key;
      }
  }

If short circuit evaluation didn't exist, then "key < a[j - 1]" would be evaluated even in the case where j = 0, leading to the array being indexed out of bounds.

38 10/23/2024||

I would go further to say that syntax should never be used. for example with Go:

> cheapFunction(...) || expensiveFunction(...)

is not valid unless both functions return bool

> car = car || "bmw"

is not valid at all, because both types would need to be bool

> funcA(...) && funcB_WhichMightBreakWithoutFuncA(...)

not valid unless functions return bool. I think Go smartly realized this syntax is just sugar that causes more problems than it solves.

lolinder 10/23/2024||

This has nothing to do with syntax and short circuiting and everything to do with Go's type system. Go, like most compiled languages, has no concept of "truthiness". JavaScript is not Go and has truthiness.

We can debate the merits of truthiness and using it this way, but let's have that debate on the merits, not by invoking other languages with completely different design constraints.

Your argument here is similar to what got us "no split infinitives" in English (grammarians wanted English to be more like Latin).

38 10/23/2024||

just because a footgun exists, doesn't mean you should use it

lolinder 10/23/2024||

Then let's talk—with specifics!—about why it's a footgun and we shouldn't use it.

"Because Go doesn't support it" would also be a reason to avoid:

* Generics (at least until recently)

* Classes

* Prototypes

* Exceptions

* Async/Await

* Package managers (until recently)

* Algebraic data types

* Effect types

* Type inference

* Type classes

* Etc.

You could argue that one or more of these are footguns, but I seriously doubt you'd consider them all to be, so let's talk about what separates the footgun from the feature that just didn't fit in Go's design.

38 10/23/2024||

This isn't about Go. This is about a language construct (&& and ||) that I would argue is terrible and should never be used. Its mainly just sugar, and in my experience code heavy on these is harder to read and write. Couple that with truthiness and it's even worse.

trealira 10/25/2024||

> This is about a language construct (&& and ||) that I would argue is terrible and should never be used.

Something tells me you'd hate my five-line implementation of the standard library function fgets in C.

  char *fgets(char *buf, size_t n, FILE *fp) {
      char *p = buf;
      int c = 0;
      while (n > 1 && (c = getc(fp)) != EOF && (*p++ = c) != '\n')
          n--;
      return n > 0 && (c != EOF || feof(fp) && p != buf) ? *p = '\0', buf : 0;
  }

I'm joking; I'd never write code this way in earnest.

variadix 10/23/2024||

Types are the best form of documentation because they can be used to automatically check for user error, are integral to the code itself, and can provide inline documentation. The more I program in dynamically typed (or even weakly statically typed) languages the more I come to this conclusion.

einpoklum 10/24/2024|

You're not wrong, but if your code is full of bespoke types that are only relevant to a couple of places where they're used, you hurt interoperability; you lock yourself in to how things are done now; and you may just be shifting the burden of making sense to someplace else in the code.

If you are able to formulate types which can be grasped without reading and re-reading their code - that's a win; and if you are able to express more of your code in terms of more generic non-trivial types and data structures, which are language-idiomatic - then that's a double win, because you're likely to be able to apply library functions directly, write less code, and have your code readers thinking "Oh, variadix is just doing FOO on the BAR structure, I know what that means".

callc 10/24/2024|||

> if your code is full of bespoke types that are only relevant to a couple of places where they're used, you hurt interoperability; you lock yourself in to how things are done now; and you may just be shifting the burden of making sense to someplace else in the code

What is an example of bespoke types? Is is all compound types (structs, classes)? If you need interop or extensibility, make an API. Feel free to use whatever types in your library you want, just make sure to agree to the API interface.

I’m honestly not sure what style of programming you are advocating for einpoklum. Can you provide an example?

All non-primitive types are essentially n-ary trees of types of sub-members (ignoring unions) with primitive types as leaves. Passing your non-primitive type so some external library function is accomplished by the library declaring that the type it receives must adhere to an interface.

einpoklum 10/25/2024||

> Is is all compound types (structs, classes)?

Well, plain-old-data structs are not something I would call bespoke types. Classes for which you write code, I suppose.

> All non-primitive types are essentially

If you only use plain structs, then maybe.

hinkley 10/24/2024|||

> types that are only relevant to a couple of places

do not create architectural lock-in. Friction in type declaration comes from too many cooks in the kitchen. If only two chunks of code see a type, just change it.

0xbadcafebee 10/23/2024||

"Self-documenting code" is already a thing called Code-as-Docs. It's the inverse of Docs-as-Code, where you're "writing documentation like you write code". Code-as-Docs is where you write Code that is self-documenting. (And this has absolutely nothing to do with Literate Programming.)

You do not have to adhere to any specific principles or methods or anything specific in order to do Code-as-Docs. Just write your code in a way that explains what it is doing, so that you don't need comments to understand it.

This often means refactoring your code to make it clearer what it does. It may not be what your ideal engineer brain wants the code to do, but it will make much more sense to anyone maintaining it. Plus very simple things like "variables-that-actually-describe-what-they-do" (in a loop over node names, don't make a variable called x; make a variable called node_name)

edit It seems like I'm the only one who says "Code-as-docs"... by searching for "Code-as-documentation" instead of "Code-as-docs", I found this: https://martinfowler.com/bliki/CodeAsDocumentation.html

I guess "self-documenting code" more hits: https://www.google.com/search?q=self-documenting+code https://en.wikipedia.org/wiki/Self-documenting_code https://wiki.c2.com/?SelfDocumentingCode

Izkata 10/25/2024|

I think "self-documenting code" is older than those other two short terms. I, at least, don't think I've ever heard of them, but I was aware of self-documenting code around 20 years ago in school.

amonith 10/23/2024|

After 10 years as a commercial dev I've noticed I don't really care about things like this. Not sure if it ever made a difference. The "local code" - as in anything within a function or often a single class (1-2k LoC is not really a problem) - is trivial to read in most languages. The most difficult thing to understand always was the domain or the infrastructure/library quirks - stuff that's never properly documented. (Hot take: might not be worth to document anyway as it takes longer to write and update such docs than to struggle with the code for a little bit).

Naming or visual code structure was never a problem in my career so far.

hinkley 10/24/2024|

> Naming or visual code structure was never a problem in my career so far.

Either you're the high verbal skill person on the project and you haven't noticed yet that everyone keeps coming to you to name things, you're in a pocket of devs with high verbal skills so you don't see problems, or you're in an echo chamber where everyone is equally bad and you don't know what 'good' looks like.

There's literally a programmer joke about how hard naming things is. If you don't understand a programmer joke you should probably pull on that thread real hard to figure out why.

amonith 10/25/2024|||

> everyone keeps coming to you to name things

Nah, not really. Years ago when we worked in office and I was still a junior we did discuss naming things here and there but I didn't like those conversations. It's all just opinions and habits.

> you're in a pocket of devs with high verbal skills

I mean, we don't hire juniors so it's possible but doubt it.

> you're in an echo chamber where everyone is equally bad

Most probably but if no one is having problems with reading code then I'm not sure what bad even means. If it's in English and it describes what it does and does not use any tricks - it absolutely doesn't matter how exactly is it worded and structured. The examples from the article are especially bad. I've seen some code I'd consider bad in general but it was all about those library hacks (usually ORMs) I mentioned previously or about trying to fit a lot in a single LINQ statement (i'm a c# dev). The only time when I'd consider naming to be bad was were all variable and function names were in German.

> There's literally a programmer joke about how hard naming things is.

It's a joke because it's not that serious. It might be hard sometimes to come up with any name for a thing (especially for non-english speakers like me) and you plop down a "var x" or something even less mature and run which is funny. So it is hard but it's not a "real" problem.

reportgunner 10/25/2024|||

> There's literally a programmer joke about how hard naming things is. If you don't understand a programmer joke you should probably pull on that thread real hard to figure out why.

Those are for people who are new to programming.

More comments...