const passwordRules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];
async function createUser(user) {
const isUserValid = validateUserInput(user);
const isPasswordValid = user.password.length >= 8 && passwordRules.every((rule) => rule.test(user.password));
if (!isUserValid) {
throw new Error(ErrorCodes.USER_VALIDATION_FAILED);
}
if (!isPasswordValid) {
throw new Error(ErrorCodes.INVALID_PASSWORD);
}
const userExists = await userService.getUserByEmail(user.email);
if (userExists) {
throw new Error(ErrorCodes.USER_EXISTS);
}
user.password = await hashPassword(user.password);
return userService.create(user);
}
1. Don't use a bunch of tiny functions. This makes it harder for future eng to read the code because they have to keep jumping around the file(s) in order to understand control flow. It's much better to introduce a variable with a clear name.2. Don't use the `a || throw()` structure. That is not idiomatic JS.
2a. Don't introduce `throwError()`. Again, not idiomatic JS.
3. Use an enum-like object for error codes for clarity.
4. If we must use passwordRules, at least extract it into a global constant. (I don't really like it though; it's a bit too clever. What if you want to enforce a password length minimum? Yes, you could hack a regex for that, but it would be hard to read. Much better would be a list of arrow functions, for instance `(password) => password.length > 8`.
5. Use TypeScript!
Then, in general (not seen here as there are no types), I like to use a lot of types in my code. The incoming user would be of type UnvalidatedUser, whereas the return type of this function would be StoredUser or something like that to distinguish the incoming user type with the outgoing. I like to attach semantics to a type, not to conditions: https://existentialtype.wordpress.com/2011/03/15/boolean-bli...
Error codes are there to hint to clients/callers what action they should potentially take (retry?, backoff)
Don't make callers handle 100s of different potential error codes/types.
If the whole internet can work with 70 your app can work with less.
All of google uses less than 20
https://github.com/googleapis/googleapis/blob/master/google/...
Put more specific information in the error message or a secondary status code.
I made the same in plpgsql recently and opted for returning an implicit union of UserSession | Error, by returning UserSession in the signature and raising errors in the function. The alternative was to return json where you'd have to look at the body of the function to figure out what it returns (when successful), as opposed to the signature.
I'm not sure if I'm striking the right balance. Yes, the signature is "self-documenting" - until you hit an error!
If you ain't expecting users to input bogus data, then you're putting way too much trust in said users.
Put simply: is it a bug in your own code if a user tries to use an invalid username? If yes, then throw an exception. If no, then return an error code. Exceptions represent programmer error; error codes represent user error.
No, they represent whatever suits the system design and the house style.
They're just tools to pass data and manipulate control flow.
If you become dogmatic and start insisting on what they must absolutely represent, you're only going to find yourself compromising design coherence and futily railing at colleagues when different representations are found to be appropriate.
With exceptions, it often makes sense to use them when failures inevitably just propogate up through a stack of nested calls or may appear anywhere in a sequence of calls. In these cases, return codes can grossly interfere with legibility and can lead to confusion when interim handlers squelch errors that really should have propogates up directly.
These are the kind of choices that are completely reasonable to make within a specific module (like an auth and accounts module), and you can always return to some other default policy at the interface of that module if your project/house-style needs such.
Engineering shouldn't feel so dogmatic as you suggest. It's a strong smell when it does.
Returning a “Result” with union discrimination on the “success” is far superior to throwing in Typescript in my experience
{ success: false, reason: “user_already_exists” } | { success: true, data: { user_id: string }
- By design you are forced to “deal” with the sad path: in order to get to the data, you must handle the error state- Throwing is not type-safe: you cannot know at build time whether you are handling all thrown errors
This is where the naming things bit comes in. You name the function correctly, then when the body is read to understand that it works as named, you can remove that cognitive complexity from your brain and continue on. Once you’ve built trust in the codebase that things do what they claim, you can start getting a top-down view of what the code does.
That is the power of proper abstraction.
Once a function gets to a certain threshold of size and reuse, I tend to agree with you. But I think that threshold is quite large - like at least 40-50 lines, or reused at least 3 times.
Also with that example, that is about the limit I would tolerate in a conditional that decides the next step in this app. I just need to know "if the password is valid, go to $SCREEN; if not, turn on these messages." I don't care about a string of expressions, I care about what it's doing and I don't want to parse the code every time to figure out "oh, this is just password validation."
This is different from verifying that features are implemented - now I need to know the minutia.
Any smaller function broken out of a larger function is (slightly) confusing. I find that a genuinely large function is much more confusing than two functions half the size, but a petite little function less than 20 lines long is not confusing to begin with.
You also know that the 'valid password' function is going to list the rules for a valid password. If you get a task to change the password creation rules, do you honestly expect people other than you to remember that code is in the createUser function and not the valid password function??
I don't think you're being honest with yourself about this scenario. Or if you are, this is going to cause you a lot of trouble over time.
I don't expect anyone to remember where the code is, regardless of which function it's in. I don't even expect them have been aware of every function to begin with.
How do you expect people will find the `isPasswordValid` function? Because I know from experience that the way I would find it would likely be by looking at the `createUser` function to see what it does. I might do a case-insensitive grep for the word password first, but even then I'd just have to go look at `createUser` anyway, otherwise I wouldn't know whether `isValidPassword` was dead code or referring to a different password.
I accept that if there was more logic then you might want to split it out, but seriously, it's tiny. The single responsibility principle doesn't justify making your production code much more difficult to read just so the unit tests around it are marginally easier to write.
The exceptions to this are situations where you need to do a bunch of sequential steps and cases where you are looping on stuff in higher dimensions.
I have gotten to the point that I consider a comment about the code to be a sign of trouble. (Comments about what the code is interacting with are a different matter.)
I don't understand why some people prefer being swarmed with details. It's not that they want details, but that they just hate navigating files (layer 8 + tooling problem) or that they "need" to know the details because not knowing them haunts them at night somehow. Also, not having that as a free function makes me think it's not tested at all (although there might be some integration test that hopefully catch all errors at once, but I'm sure they don't either)
> isPasswordValid
> Don't use a bunch of tiny functions
Exactly this. I only do that when the function is used in more than 10 places and it provides some extra clarity (like something as clamp(minVal,val,maxVal){return max(minVal,min(val,maxVal))} if your language doesn't already have it, of course).
I also apply that to variables though, everything that is only used once is inlined unless it really helps (when you create a variable, you need to remember it in case it is used afterwards, which for me is a hard task)
Wouldn’t that cause the regexes to be recompiled every time you call the function?
const isUserValid = ...
if(!isUserValid) ...
const isPasswordValid = ...
if(!isPasswordValid) ...
EtcThis is one of my pet peeves. I had an engineer recently wrap Dapper up in a bunch of functions. Like, the whole point of dapper to me is that it gets out of your way and lets you write very simple, declarative SQL database interactions and mapping. When you start wrapping it up in a bunch of function calls, it becomes opaque.
DRY has been taken as far too much gospel.
If you have nested functions, that's not a problem.
Btw, why do you use regular expressions for some rules, but not for others? Regular expressions are perfectly capable of expressing the length requirement.
Self-documenting code can work in a single team, particularly a small team with strong norms and shared knowledge. Over time as that team drifts, the shared knowledge will weaken, and the "self-documenting" code will no longer be self-documenting to the new team members.
!(await userService.getUserByEmail(user.email)) || throwError(err.userExists);
I guess if I worked in a codebase that used that pattern consistently I'd get used to it pretty quickly, but if I dropped into a new codebase that I didn't work on often I'd take a little bit longer to figure out what was going on. if (some_condition) {
// do stuff here
return;
}
// do other stuff here
is "better" than: if (some_condition) {
// do stuff here
} else {
// do other stuff here
}
If you have very-deeply nested code then it usually becomes easier to work with after splitting it up into smaller pieces. But IMO rewriting code like this to save a single level of indentation is bikeshedding.I have come to appreciate the style of early returns rather than else statements as I have found over the years it generally makes the code easier for me to follow when I’m looking at it possibly years later.
It really depends on the particular condition, but sometimes it just reads better to me to not use the else, and this is because as a style I tend to try have “fail conditions” cause an early return with a success being at the end of the method. But again there are regularly exceptions where trying to do this “just because” would contort the code, so returning an early success result happens often enough.
I have however found that sometimes ReSharper’s “avoid nesting” suggestion (particularly in examples like yours) results in less clear code, but it’s almost always at least not worse and maybe slightly better for the sake of consistency.
EDIT: Having thought about this more, here is why I find early returns generally easier to read than else statements.
With an early return the code is generally more linear to read as when I get to the end of the if block I can instantly see there is nothing else of relevance in the method, I save myself having to needlessly scan for the end of the else block, or even worse, past more code blocks only to find that the rest of the method’s code is irrelevant.
Again, not a hard rule, but a consistent style in a code base also generally makes it easier to read.
OTOH early returns are great for anything the type checker knows about:
function foo(msg: 'OK' | 'ERR') {
// ...
if (msg === 'ERR') return someValue
// ...
}
Doing that is hugely cleaner than branching, and there's no added complexity to the developer since tooling can easily tell you what values `msg` can have at any given point in the function.Only if all the cases return! Only then is it obvious that you have independent cases. E.g. suppose we have three Boolean inputs x, y, z and want to do something for each binary combination:
if (x) {
if (y) {
if (z) {
return 7;
} else {
return 6;
}
} else { // x && !y
if (z) {
return 5;
} else {
return 4;
}
}
} else { // !x
if (y) { // !x && y
if (z) {
return 3;
} else {
return 2;
}
} else { // !x && !y
if (z) {
return 1;
} else {
return 0;
}
}
}
How would this look with early returns? One obvious way: if (x && y && z)
return 7;
if (x && y && !z)
return 6;
if (x && !y && z)
return 5;
// ... etc
(Let's ignore that we can just calculate the output with some bit twiddling; that's not the point).Early return can be very clear, if we can bear repeatedly testing some conditions.
My comment was about using if blocks as opposed to early returns. I.e. where the nested ifs run exhaustively and return afterwards.
Also, obviously deep nested ifs aren't good, so I wasn't advocating them - I just think it's better to fix them by splitting functions or simplifying control flow, than by adding early returns.
However, setting that return variable can be recognized as a simulated return. If we know that after "ret = 42", there are no other state changes; that the whole if/else mess will drop out to the end where there is a "return ret", then we can just read it as "return 42".
There is no hard rule about early returns being always good or always bad, it depends on the particular situation.
> ..(as opposed to a subjective preference) is in my opinion super dangerous and reeks of cargo-culting..
o_O
You know that each early return completely handles its respective case; if that branch is taken, that's it; the function has ended. There is only way to reach the code past the if/return, which is that the condition has to fail.
The conditionals inside a function that has a single return are harder to read, because no conditional is necessarily final.
if/elses that all return can be readable:
if (this) {
if (that) {
return x;
} else {
return y;
}
} else {
return z;
}
still, it can be flattened: if (!this)
return z;
if (that)
return x;
return y;
It's shorter and less nested, and so that's a readability improvement. It's not as easy to see that x is returned when both this and that hold. The intermediate version below helps with that: if (this) {
if (that)
return x;
return y;
}
return z;
If the conditions are cheaply tested variables, or simple expressions easily optimized by the compiler, there is also: if (this && that)
return x;
if (this)
return y;
return z;
It’s not so much indentation that’s an issue, but coupling control flow with errors and exceptions.
Swift does a nice job with `guard` statements that basically bake this in at the language level - a condition succeeds or you must return or throw.
If that control flow is part of business logic, I don’t think there’s any issue with your second example. That’s what it’s there for.
I agree. The previous iteration shown is simpler IMO.
I've really shifted how I code to making things just plain simple to look at and understand.
user.password = await hashPassword(user.password);
1. As a rule, mutations are harder to understand than giving new names to newly defined values.2. The mutation here apparently modifies an object passed into the function, which is a side effect that callers might not expect after the function returns.
3. The mutation here apparently changes whether user.password holds a safe hashed password or a dangerous plain text password, which are bad values to risk mixing up later.
4. It’s not immediately obvious why hashing a password should be an asynchronous operation, but there’s nothing here to tell the reader why we need to await its result.
At least three of those problems could trivially be avoided by naming the result hashedPassword and, ideally, using TypeScript to ensure that mixing up plain text and hashed passwords generates a type error at build time.
I do agree with many of the other comments here as well. However, I think the above is more serious, because it actually risks the program behaving incorrectly in various ways. Questions like whether to use guard clauses or extract the password check into its own function are more subjective, as long as the code is written clearly and correctly whichever choices are made.
Going that path further ends up what a few code bases I've worked with do: Pull the two domains apart into a "UserBeingCreated" and an existing "User".
This felt a bit weird at first, but the more I think about it, the more sense it makes. One point leaning towards this: You are dealing with different trust levels. One is a registered and hopefully somewhat validated user, which can be trusted a bit. The other thing could just be a drive by registration attempt.
And you're dealing with different properties. Sure, there is some overlap - username, mail, firstname, lastname. But only a UserBeingCreated needs validation errors or a clear text password. Other things - like groups, roles and other domain properties only make sense after the user is properly registered.
I wrote a more about this in a Reddit post a while back if anyone’s interested: https://www.reddit.com/r/Python/comments/16w97i6/flask_300_r...
Honestly, I think all of jsdoc, pydoc, javadoc, doxygen is stuff that most code should not use. The only code that should use these is code for libraries and for functions that are used by hundreds or thousands of other people. And then we also need to notice that these docs in comments are not sufficient for documentation either. When a function is not used by hundreds or thousands of people, just write a conventional comment or perhaps not write a comment at all if the function is quite straightforward. Documentation that explains the big picture is much more important but that is actually somewhat hard to write compared to sprinkling jsdoc, pydoc, javadoc or doxygen worthless shit all over the place.
if (cond) { cons }
on one line and get more readable code admittedly a few chars longer. if (cond) doSomething();
all the time. if (dontUseCurlies):
goto fail
goto fail
The kind of pattern used here with the `||` might seem weird to some JavaScript developers, but it's pretty normal in shell scripts, and it's pretty normal in Ruby with `unless`!
Shell scripts are NOT known for being easy to read. They're full of obscure and sometimes frankly bizarre, arcane syntax that newcomers would have no idea about. Quick, what does "$#" mean? An experienced bash programmer would know, but no one else would. Shell scripts were never meant to be easy to read; they're just an extension of the shell syntax, and of course vary a lot from shell to shell (e.g. bash vs zsh vs ksh etc.).
This would have been fine too but it would trigger some people not to use {}
if (!validateUserInput(user)) throwError(err.userValidationFailed);
My preferred style might be closer to this. if (!userInputIsValid(user)) throwError(err.userValidationFailed);
cheapFunction(...) || expensiveFunction(...) // saves us a few cylces
car = car || "bmw" // setting default values, common pattern
funcA(...) && funcB_WhichMightBreakWithoutFuncA(...) // func A implies func B
...
// probably a few other cases I don't remember
Using it to handle control flow (e.g. throwing exceptions, as a makeshift if-then, etc.) is a recipe for disaster. function insertion_sort(a) {
for (let i = 1; i < a.length; i++) {
let key = a[i];
let j = i;
while (j > 0 && key < a[j-1]) {
a[j] = a[j-1];
j--;
}
a[j] = key;
}
}
If short circuit evaluation didn't exist, then "key < a[j - 1]" would be evaluated even in the case where j = 0, leading to the array being indexed out of bounds.> cheapFunction(...) || expensiveFunction(...)
is not valid unless both functions return bool
> car = car || "bmw"
is not valid at all, because both types would need to be bool
> funcA(...) && funcB_WhichMightBreakWithoutFuncA(...)
not valid unless functions return bool. I think Go smartly realized this syntax is just sugar that causes more problems than it solves.
We can debate the merits of truthiness and using it this way, but let's have that debate on the merits, not by invoking other languages with completely different design constraints.
Your argument here is similar to what got us "no split infinitives" in English (grammarians wanted English to be more like Latin).
"Because Go doesn't support it" would also be a reason to avoid:
* Generics (at least until recently)
* Classes
* Prototypes
* Exceptions
* Async/Await
* Package managers (until recently)
* Algebraic data types
* Effect types
* Type inference
* Type classes
* Etc.
You could argue that one or more of these are footguns, but I seriously doubt you'd consider them all to be, so let's talk about what separates the footgun from the feature that just didn't fit in Go's design.
Something tells me you'd hate my five-line implementation of the standard library function fgets in C.
char *fgets(char *buf, size_t n, FILE *fp) {
char *p = buf;
int c = 0;
while (n > 1 && (c = getc(fp)) != EOF && (*p++ = c) != '\n')
n--;
return n > 0 && (c != EOF || feof(fp) && p != buf) ? *p = '\0', buf : 0;
}
I'm joking; I'd never write code this way in earnest.If you are able to formulate types which can be grasped without reading and re-reading their code - that's a win; and if you are able to express more of your code in terms of more generic non-trivial types and data structures, which are language-idiomatic - then that's a double win, because you're likely to be able to apply library functions directly, write less code, and have your code readers thinking "Oh, variadix is just doing FOO on the BAR structure, I know what that means".
What is an example of bespoke types? Is is all compound types (structs, classes)? If you need interop or extensibility, make an API. Feel free to use whatever types in your library you want, just make sure to agree to the API interface.
I’m honestly not sure what style of programming you are advocating for einpoklum. Can you provide an example?
All non-primitive types are essentially n-ary trees of types of sub-members (ignoring unions) with primitive types as leaves. Passing your non-primitive type so some external library function is accomplished by the library declaring that the type it receives must adhere to an interface.
Well, plain-old-data structs are not something I would call bespoke types. Classes for which you write code, I suppose.
> All non-primitive types are essentially
If you only use plain structs, then maybe.
do not create architectural lock-in. Friction in type declaration comes from too many cooks in the kitchen. If only two chunks of code see a type, just change it.
You do not have to adhere to any specific principles or methods or anything specific in order to do Code-as-Docs. Just write your code in a way that explains what it is doing, so that you don't need comments to understand it.
This often means refactoring your code to make it clearer what it does. It may not be what your ideal engineer brain wants the code to do, but it will make much more sense to anyone maintaining it. Plus very simple things like "variables-that-actually-describe-what-they-do" (in a loop over node names, don't make a variable called x; make a variable called node_name)
edit It seems like I'm the only one who says "Code-as-docs"... by searching for "Code-as-documentation" instead of "Code-as-docs", I found this: https://martinfowler.com/bliki/CodeAsDocumentation.html
I guess "self-documenting code" more hits: https://www.google.com/search?q=self-documenting+code https://en.wikipedia.org/wiki/Self-documenting_code https://wiki.c2.com/?SelfDocumentingCode
Naming or visual code structure was never a problem in my career so far.
Either you're the high verbal skill person on the project and you haven't noticed yet that everyone keeps coming to you to name things, you're in a pocket of devs with high verbal skills so you don't see problems, or you're in an echo chamber where everyone is equally bad and you don't know what 'good' looks like.
There's literally a programmer joke about how hard naming things is. If you don't understand a programmer joke you should probably pull on that thread real hard to figure out why.
Nah, not really. Years ago when we worked in office and I was still a junior we did discuss naming things here and there but I didn't like those conversations. It's all just opinions and habits.
> you're in a pocket of devs with high verbal skills
I mean, we don't hire juniors so it's possible but doubt it.
> you're in an echo chamber where everyone is equally bad
Most probably but if no one is having problems with reading code then I'm not sure what bad even means. If it's in English and it describes what it does and does not use any tricks - it absolutely doesn't matter how exactly is it worded and structured. The examples from the article are especially bad. I've seen some code I'd consider bad in general but it was all about those library hacks (usually ORMs) I mentioned previously or about trying to fit a lot in a single LINQ statement (i'm a c# dev). The only time when I'd consider naming to be bad was were all variable and function names were in German.
> There's literally a programmer joke about how hard naming things is.
It's a joke because it's not that serious. It might be hard sometimes to come up with any name for a thing (especially for non-english speakers like me) and you plop down a "var x" or something even less mature and run which is funny. So it is hard but it's not a "real" problem.
Those are for people who are new to programming.