Posted by lukastyrychtr 5 days ago
It makes no more sense to me for "return <expr>" to have a type than it does to make "if <expr>" or "break" or "{" or any other keyword to have a type. These are syntactic elements.
Rust's type system is clearly inspired by Hindley-Milner, and most languages using such a type system don't even have a return keyword.
Even if you disagree with this argument, this design decision has resulted in all these weird/confusing but absolutely useless code examples and there is no upside that I can see to this decision in terms of language ergonomics. What practical value is it to users to allow "return <expr>" to itself be an expression? That you can use such an "expression" as arguments to function calls with hilarious wtf consequences? It's a piece of syntactic sugar.
> don't even have a return keyword.
This is because they are not procedural languages, it has nothing to do with the type system.
> there is no upside that I can see to this decision in terms of language ergonomics.
There's tremendous upside! That's why lots of languages choose this. For example, there is no need for the ternary in Rust: if can just do that.
> What practical value is it to users to allow "return <expr>" to itself be an expression?
Code like this just works:
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => return,
};
That is, if return wasn't an expression, we'd have a type error: the two arms would have incompatible types.
Assigning values of expressions that are purely `never`, or having values that are purely `never` or `()` as the condition in a conditional operator, should be marked as an error, like unreachable code.
So the syntactic element "return" is not just an expression - unlike other sub-expressions, it involves action at a distance - i.e. it must not just agree with its context as part of an expression but it must agree with the enclosing fn signature.
let y = match option { Some(x) => x, None => return Err("whoops!"), };
Without a type, the None branch loses the ability to unify with the Some branch. Now you could say that Rust should just only require branches’ types to unify when all of them have a type, but the ! never type accomplishes that goal just fine.
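To make the unification point concrete, here is a minimal sketch. The function name `double_or_fail` is made up for illustration and is not from the thread; the match itself is the one quoted above.

```rust
// Hypothetical helper, named for illustration only.
fn double_or_fail(option: Option<i32>) -> Result<i32, &'static str> {
    // The `None` arm has type `!`, which coerces to `i32`,
    // so `y` is inferred as `i32` from the `Some` arm alone.
    let y = match option {
        Some(x) => x,
        None => return Err("whoops!"),
    };
    Ok(y * 2)
}

fn main() {
    println!("{:?}", double_or_fail(Some(21)));
    println!("{:?}", double_or_fail(None));
}
```

Note that the type of `return Err(...)` as an expression (`!`) is separate from the type of the value being returned, which is checked against the function signature.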
In your particular example, let's put your example into a context. Is
fn foo(option: Option<i32>) -> i32 {
let y = match option { Some(x) => x, None => return Err("whoops!"), };
return 1;
}
well typed? It should be if we are to believe that "return <expr>" is an expression of type () - but, naturally, it causes a compilation error because the compiler specifically treats "return <expr>" unlike other expressions. So there is no improvement in regularity, while it admits all sorts of incomprehensible "puzzlers".
I don't see why you'd lose this ability if you removed the claim that "return <expr>" is itself an expression. Most/many languages have mechanisms to allow expressions to affect flow control - e.g. with exceptions, yield, etc. - which do not require these constructs (for example "throw x") to have a type.
Rust could just as easily have supported the syntax you use above without making "return <expr>" a typeable expression.
It's not, but not due to the return, it's because you're trying to return a Result from a function that returns an i32. This works:
fn foo(option: Option<i32>) -> Result<i32, &'static str> {
let y = match option { Some(x) => x, None => return Err("whoops!"), };
return Ok(1);
}
> It should be if we are to believe that "return <expr>" is an expression of type ()
It is not, it is an expression of type !. This type unifies with every other type, so the overall type of y is i32. return is not treated in a special way.
> if you removed the claim that "return <expr>" is itself an expression
This code would no longer work, because blocks that end in a statement evaluate to (), and so you would get a mismatched-types error, because one arm is i32 and the other is ().
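As a small sketch of the block rule being relied on here: a block whose last item is an expression evaluates to that expression, while a block whose last item is a statement evaluates to (). The function names below are hypothetical.

```rust
// Hypothetical helpers, named for illustration only.
fn with_tail() -> i32 {
    let x = 40;
    x + 2 // no semicolon: the block evaluates to this expression
}

fn without_tail() {
    let x = 40;
    let _ = x + 2; // ends in a statement: the block evaluates to ()
}

fn main() {
    let a: i32 = with_tail();
    let b: () = without_tail();
    println!("{a} {b:?}");
}
```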
"It's not, but not due to the return, it's because you're trying to return a Result from a function that returns an i32."
That's exactly my point. "return <expr>" is not just an expression which can be typed. If you tell me the types of all the identifiers used, I can look at any expression in Rust which does not include a return, and tell you if it's well typed or not. If the expression includes a return, then I cannot tell you whether the expression is well-formed.
Yes, it is, and it can. It has the type !, no matter the type of <expr>.
For any expression NOT involving "return", I can write, for example:
const Z = <expr>
but I cannot if <expr> contains a return embedded somewhere. The existence of a "return" somewhere in an expression changes the character of the entire expression.
I.e. there are two classes of "expressions". Those NOT containing returns (which are equivalent to the notion of "expression" in the languages that Rust was inspired by) and those containing a return somewhere in them which are subject to further rules about wellformedness.
My point is that none of this is necessary at all - you don't need to provide type rules for every lexical feature of your language to have a language with a powerful expressive type system (like Rust's).
> const Z = <expr>
> but I cannot if <expr> contains a return embedded somewhere.
Sure, but that's not special about this case at all. I also can't write 'break' or 'continue' when I'm not inside a loop. When declaring a 'const', I am lexically not inside a function body, so I can't use 'return', which makes sense (the compiler will even tell you, "return statement outside of function body").
Particular statements being allowed in some contexts but not in others is entirely normal.
> My point is that none of this is necessary at all
Maybe it's not necessary, but I like the consistency this provides ("everything has a type"), and I imagine the implementation of the type checker/inferer is more straightforward this way.
Sure, you could define the language such that "a 'return' in a position that expects a typed expression will not affect other types that need to match with it" (or something else, in better, formal language). Or you can just define those statements to have the 'never' type, and not worry about it.
But ok, let's agree that it's not necessary. Then we're just talking about personal preferences, so there's no right or wrong here, and there's no point in arguing.
And this isn't really any different from variable references, if you think about it. If you have an expression (x + 1), you can only use it somewhere where there's an `x` in scope. Similarly, you can only use `return` somewhere where there's a function to return from in scope. Indeed, you could even make this explicit when designing the language! A function definition already introduces implicit let-definitions for all arguments in the body. Imagine if we redefined it such that it also introduces "return" as a local, i.e. given:
fn foo(x: i32, y: i32) -> i32 {
...
}
the body of the function is written as if it had these lines prepended:
let x = ...;
let y = ...;
let return = ...;
...
where "return" is a function that does the same thing as the statement. And similarly for break/continue and loops.
The thing that actually makes these different from real variables is that they cannot be passed around as first-class values (e.g. having the function pass its "return" to another function that it calls). Although this could in fact be done, and with Rust lifetime annotations it would even be statically verifiable.
You can't, actually: 'const' is special in that it's not considered by the compiler to be inside a function definition, even if it is (and the compiler will tell you, "return statement outside of function body").
But that doesn't invalidate your point; in a way it supports it: 'return' can only be used in function contexts, just like 'continue' or 'break' can only be used in loop contexts.
const ONE: i32 = { const fn foolish() -> i32 { return 1 } foolish() };
But yes, your larger point is exactly correct, the constant, even if it happens to be defined inside a function body, is not itself inside a function body and so we obviously can't return from it. It is also not inside an expression we can break out of (Rust allows you to break out of any expression, not just loops). It's a constant, like 5 is a constant, or 'Z' is a constant - this is not C or C++ where "const" means "actually a variable".
Also, the type of return is a separate matter from the type of the thing being returned. You obviously can't return Result from a function returning i32. The point of type coercion is that you can yield `return Err(...)` in one branch of a match and have it type check with the other branch(es).
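On the remark that you can break out of expressions other than loops: as far as I know this is done with labeled blocks, which need a reasonably recent toolchain (Rust 1.65 or later, I believe). A minimal sketch, with the function name `classify` and the label `'check` invented for illustration:

```rust
// Hypothetical example of breaking out of a labeled block with a value.
fn classify(n: i32) -> &'static str {
    let label = 'check: {
        if n < 0 {
            // `break` with a label and a value exits the block early;
            // the `break` expression itself has type `!`.
            break 'check "negative";
        }
        if n == 0 {
            break 'check "zero";
        }
        "positive" // value of the block when no `break` fires
    };
    label
}

fn main() {
    for n in [-3, 0, 7] {
        println!("{n}: {}", classify(n));
    }
}
```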
e: T is well typed _if_ the end result of e would be of type T
(end result being hand-wave-y)
It's not a guarantee that e is a value of a certain type, but a guarantee that if e is a value in the first place, then it will be a certain type. You sidestep having to prove the halting nature of e.
This leaves a nice spot for computation that doesn't complete!
let y = return 1
f(y)
y could be any type, and it's well typed, because you're never in a scenario where f(y) will be provided a value of the wrong type.
Well-typed-ness, by my understanding in more complex type systems, is not a guarantee of control flow, but a guarantee that _if_ we evaluate some expression, then it will be fine.
And so... you can put `!` as a type in your system, treat return as an expression, and have a simpler semantic model, without really losing anything. Less moving parts, etc.... that's my read of it anyways.
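Here is a sketch of the `let y = return 1` / `f(y)` idea in compilable form. The signature of `f` and the `&str` annotation are assumptions picked for illustration; the lint attribute only silences the warnings the compiler would otherwise emit about code after the `return`.

```rust
// Hypothetical `f`; any parameter type would do.
fn f(s: &str) -> usize {
    s.len()
}

#[allow(unreachable_code, unused_variables)]
fn early() -> i32 {
    // `return 1` has type `!`, which coerces to the annotated `&str`.
    let y: &str = return 1;
    // Never executed, but still type-checked: `f` expects a `&str`.
    f(y) as i32
}

fn main() {
    println!("{}", early());
}
```

The function always returns 1; the call to `f` is checked by the compiler but never run.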
let day_number = match name {
"Sunday" => 0,
"Monday" => 1,
"Tuesday" => 2,
"Wednesday" => 3,
"Thursday" => 4,
"Friday" => 5,
"Saturday" => 6,
_ => return Err("invalid day")
};
It makes generic code work without need to add exceptions for "syntactic elements". You can have methods like `map(callback)` that take a generic `fn() -> T` and pass through `T`. This can work uniformly for functions that do return values as well as for functions that just have `return;`. Having nothingness as a real type makes it just work using one set of rules for types, rather than having rules for real types plus exceptions for "syntactic elements".
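A rough sketch of what "one set of rules" buys you in generic code. `run_twice` is a made-up helper, not a standard library function; the point is only that `T = ()` needs no special case.

```rust
// Hypothetical generic helper: calls the callback twice and passes `T` through.
fn run_twice<T>(callback: impl Fn() -> T) -> (T, T) {
    (callback(), callback())
}

fn main() {
    // A callback that returns a value: T = i32.
    let numbers = run_twice(|| 21 * 2);
    // A callback that returns nothing useful: T = (), same code path.
    let units = run_twice(|| {
        println!("side effect only");
    });
    println!("{numbers:?} {units:?}");
}
```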
let name = match color_code {
0 => "red",
1 => "blue",
2 => "green",
_ => "unknown",
};
The RHS of the `=>` has to be an expression, since we're assigning it to a variable. Here, you should already see one "useful" side-effect of what you're calling "syntactic elements" (I'd perhaps call them "block statements", which I think is closer to the spirit of what you're saying.) The whole `match … {}` in the example above here is an expression (we assign the evaluation of it to a variable).
> What practical value is it to users to allow "return <expr>" to itself be an expression?
Now, what if I need to return an error?
let name = match color_code {
0 => "red",
1 => "blue",
2 => "green",
_ => return Err("unknown color"),
};
The expression arms need to be the same type (or what is the type of `name`?). So now the type of the last branch is !. (Which as you hopefully learned from TFA, coerces to any type, here, to &str.)
There's more ways this "block statements are actually expressions" is useful. There need not be a ternary operator / keyword (as in C, C++, Python, JS, etc.):
let x = if cond { a } else { b };
In fact, if you're familiar with JavaScript, there I want this pattern, but it is not to be had:
const x; // but x's value will depend on a computation:
// This is illegal.
if(foo) {
x = 3;
} else {
x = 4;
}
// It's doable, but ugly:
const x = (function() { if(foo) { return 3; } else { return 4; }})();
// (Yes, you can do this example with a ternary.
// Imagine the if branches are a bit more complicated than a ternary,
// e.g., like 2 statements.)
Similarly, loops can return a value, and that's a useful pattern sometimes:
let x = loop {
// e.g., find a value in a datastructure. Compute something. Etc.
if all_done {
break result;
}
};
And blocks:
let x = {
// compute x; intermediate variables are properly scoped
// & cleaned up at block close.
//
// There's also a slight visual benefit of "here we compute x" is
// pretty clearly denoted.
};
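The loop and block snippets above are outlines, so here is a filled-in sketch that should compile as-is. Both function names are invented for illustration.

```rust
// Hypothetical example: `loop` evaluates to the value passed to `break`.
fn first_square_over(limit: u32) -> u32 {
    let mut n = 0u32;
    loop {
        let square = n * n;
        if square > limit {
            break square;
        }
        n += 1;
    }
}

// Hypothetical example: a block used as an expression, with scoped temporaries.
fn hypotenuse(a: f64, b: f64) -> f64 {
    let sum_of_squares = {
        let a2 = a * a;
        let b2 = b * b;
        a2 + b2 // `a2` and `b2` go out of scope at the closing brace
    };
    sum_of_squares.sqrt()
}

fn main() {
    println!("{}", first_square_over(50));
    println!("{}", hypotenuse(3.0, 4.0));
}
```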
> Even if you disagree with this argument, this design decision has resulted in all these weird/confusing but absolutely useless code examples
I think one can cook up weird code examples in any language.
I am incredibly amused that I got downvoted to -1 for mentioning perl though. People here are Weird.
I mean you don't see any of the nonsense in the blog post in any realistic PR (so they don't matter),
but you would run into subtle edge case issues if some expressions were more special than other expressions (so that does matter),
especially in context of macros/proc macros or partial "in-progress" code changes (which is also why `use` allows some "strange" {-brace usage or why a lot of things allow optional trailing `,` all of that makes auto code gen simpler).
There are very few statements: https://doc.rust-lang.org/stable/reference/statements.html
and a lot of expressions: https://doc.rust-lang.org/stable/reference/expressions.html
This is what I find interesting in this generation of languages though. Any C programmer understands the notion of an infinite loop, and the value of conditional expressions like ternary ops. But now languages are realizing that when you start treating more and more things as expressions, you really want to start giving names to things that you wouldn't name in the past.
In Scala no one uses "return" (mostly because we don't care about performance in the same way), but if you do, the way it is internally implemented is by throwing exceptions, so in a sense it suffers from the same problems as Rust.
It's actually very important to have that type in a language that uses immutable collections. Imagine this pseudocode:
// List() creates an immutable list
let emptyList = List()
let listWithAnInteger = emptyList.add(42)
let listWithAString = emptyList.add("foo")
This works in Scala. But how can the compiler know that `emptyList.add(42)` is allowed? After all, you can only add things to a list where the added element matches the type of the other elements, right?
The reason this works is because the type of emptyList will be List<Nothing>, and since Nothing is a subtype of every other type, the type of listWithAnInteger will become List<Integer>. You can annotate these types explicitly if you want.
Every language without such a bottom type has a failed type-system in my opinion. (looking at you Golang and many others)
There's issues around doing this with mutable collections (i.e. the value restriction) but that's not what you're referring to...
That alone would not work. Think about it: `List a` means "A list that contains values of type `a` and `a` can be any type whatsoever". Now imagine you combine that list with a list of integers. That obviously cannot work, since `(++) :: [a] -> [a] -> [a]` as you see, the types must align.
The way Haskell fixes that is (apparently) by doing something called `Let-generalisation` (https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/let_...)
To me that feels like hacky way to exactly resolve the problem that I described, and if you turn it off then that code would stop working and fail to compile as expected.
This is obviously only possible if the list itself has no elements, and indeed a simple proof is that the statement above is valid for the empty type: all elements of a list of type List a have type empty (among other types). Thus there are no elements in this list.
And both Haskell and OCaml can prove it:
type empty = | (* this is defining a never type *)
type polymorphic_list = { l: 'a. 'a list }
(* OCaml requires polymorphic types to be constructed explicitly *)
let polymorphic_lists_are_empty ({l} : polymorphic_list ) =
match (l:empty list) with
| [] -> () (* this is the empty list *)
| _ -> .
(* this clause requires the OCaml typechecker to prove that the remaining cases are unreachable *)
Indeed, I stand corrected. Interesting! I'll look into that a bit more, thank you.
https://kotlinlang.org/api/core/kotlin-stdlib/kotlin/-nothin...
As you can never get a value of type Nothing, it can coerce into anything, just like Rust's ! or TypeScript's never.
[1]: https://leanprover-community.github.io/mathlib4_docs/Init/Pr...
Now, yes, ideally you'd have effects in the type system so that you can express this kind of stuff with more precision. But if you restrict this to stuff like return/break/continue where the destination is statically known and can be validated, you can treat those effect types as being there, just inferred for all expressions and forbidden to cross the function boundary.
For exceptions specifically this trick no longer works because the whole point is for them to cross that boundary. But the amount of complexity this stuff adds to typing even trivial generic code is arguably too much for practical use (see also: checked exceptions in Java). In any case, in Rust you use Result types instead so those exceptions produce regular values. And although panics can be handled, they are certainly not meant to be used as a generic mechanism for transfer of control, so adding effect types for them alone is just not worth it.
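To tie the Result point back to `return` being an expression: the `?` operator behaves roughly like a match whose error arm is an early `return`. The sketch below is an approximation under that assumption (the real desugaring goes through traits and an error conversion), and both function names are hypothetical.

```rust
use std::num::ParseIntError;

// Hypothetical function using `?` to propagate the error as a regular value.
fn parse_with_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let n = s.trim().parse::<i32>()?;
    Ok(n + 1)
}

// Roughly the same logic written by hand: the `Err` arm has type `!`.
fn parse_with_match(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.trim().parse::<i32>() {
        Ok(n) => n,
        Err(e) => return Err(e),
    };
    Ok(n + 1)
}

fn main() {
    println!("{:?}", parse_with_question_mark(" 41 "));
    println!("{:?}", parse_with_match("not a number"));
}
```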
enum Never {
}
Languages like OCaml, Haskell as well as Rust have types with no values (called uninhabited types).
Return is a statement in the minds of most programmers, but an expression in the language. That was a very pragmatic decision that required an unintuitive implementation. As a result, we've got this post full of code that is valid to the compiler but doesn't make a lick of sense to most programmers reading it.
I would take issue with this. Sure, for a lot of people, they may be bringing assumptions over from languages where assignment is a statement. That doesn't make them correct.
> required an unintuitive implementation
To some people, sure. To others, it is not unintuitive. It's very regular, and people who get used to "everything is an expression" languages tend to prefer it, I've found.
This feels awkward as my mental model is that in "everything is an expression" languages you simply DO NOT offer "return" (and, if you do, it must be mapped to bottom and do something insane like throw an exception... but, like, if you are really used to using such a language, you'd never let yourself type a "return", as the entire concept feels icky and wrong in such a language).
It is of course not statically typed.
I.e., if we bias our sample to the data points proving our point then our point is proven. It's like that quip about how every car insurance company can simultaneously claim "people who switched saved hundreds of dollars on average."
I also like "everything is an expression" languages, but I don't think that's a fantastic argument.
A better question at this point, arguably, is why there should even be an expression/statement distinction in the first place. All imperative statements can be reasonably and sensibly represented as expressions that produce either () or "never". Semicolon then is just a sequencing operator, like comma in C++.
In fact, I recently ran into the finding that you can't use it with logical operators either: `return myBool && throw...` doesn't work. I assume that's because && can be used with many types even if the first operand is a bool, but the compiler error message doesn't explain that, it just says throw is an invalid token here, and if you parenthesize it, it says a throw expression can't be used in this context. I was very surprised by this seemingly arbitrary limitation.
The main difference from other MLs is the lack of higher kinded types, so it’s difficult to express things like Functor, Monad, Arrow, etc
I wouldn't call using an uninhabited type for the type of a return expression theoretically inelegant. On the contrary, I find it quite pleasing.
Likely the same should apply to expressions of type `()`.
So Rust does have: String + ! = String
But Rust doesn't have: String + i32 = Either<String,i32>
Note that the never type ! isn't special here. Rust will also cheerfully accept String + Infallible = String, or the same with your own empty type, defined like so:
enum MyEmptyType {} // MyEmptyType has no possible values
Now under type arithmetic String + MyEmptyType = String and indeed that works in Rust.
Edited: Syntax fix
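One way to read "String + MyEmptyType = String" in code is that a sum of `String` and an empty type can be collapsed to a `String` without any panic or default value. This is only a sketch: the `Either` enum and both function names are made up for illustration, while `Infallible` is the standard library's empty type.

```rust
use std::convert::Infallible;

enum MyEmptyType {} // MyEmptyType has no possible values

// Hypothetical sum type; `Right` can never be constructed with an empty `B`.
#[allow(dead_code)]
enum Either<A, B> {
    Left(A),
    Right(B),
}

fn collapse(e: Either<String, MyEmptyType>) -> String {
    match e {
        Either::Left(s) => s,
        // No value of `MyEmptyType` exists, so an empty match is exhaustive.
        Either::Right(never) => match never {},
    }
}

fn unwrap_infallible(r: Result<String, Infallible>) -> String {
    match r {
        Ok(s) => s,
        Err(never) => match never {},
    }
}

fn main() {
    println!("{}", collapse(Either::Left("hello".to_string())));
    println!("{}", unwrap_infallible(Ok("world".to_string())));
}
```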
Syntax alone can't stop sufficiently determined fools. Lisp has famously simple syntax, but can easily be written in an incomprehensible way. Assembly languages have very restrictive syntax, but that doesn't make them easy to comprehend.
Rust already has a pretty strong type system and tons of lints that stop more bad programs than many other languages.
"Expressibility" and "expressive power" are vague and subjective, so it's not clear what you mean.
I suppose you object to orthogonality in the syntax? Golang and Java definitely lack it.
But you also mention C in the context of "maximum possible flexibility"? There's barely any in there. I can only agree it has mistakes for others to learn from.
There's hardly any commonality between the languages you list. C# keeps adding clever syntax sugar, while Go officially gave up on removing its noisiest boilerplate.
D has fun stuff like UFCS, template metaprogramming, string mixins, lambdas — enough to create "incomprehensible" code if you wanted to.
You're talking about modern languages vs relics of the past, but all the languages you mention are older than Rust.
If you want your code to be secure, you need it to be correct. And in order for it to be correct, it needs to be comprehensible first. And that requires syntax and semantics devoid of weird surprises.
The point of a language that is "safe" along some axes is that it makes those unsafe things impossible to represent, either by omitting an unsafe feature entirely, or making it a compile-time error to do unsafe/unsound things.
I will admit that this is something of a grey area, since we're talking about logic errors here and not (for example) memory-safety bugs. It's a bit muddier.
In general, though, I do agree that people should write code that is reasonable to read, and if a reviewer thinks some code in a PR is incomprehensible, they should reject it.
The difficulty in reviewing pointer dereferences is in reasoning about a program's potential states and necessary preconditions, which C won't do for you. You can have neatly written C using very simple syntax, and still have no idea if it's safe or not. Solving that lack of clarity requires much more than syntax-level changes.
OTOH the Weird Rust examples are not a problem you get in your own code. It's a local syntax problem, and it doesn't require complex whole-program reasoning. The stakes are also lower, because you still have the same safety checks, type checks, automatic memory management, immutability. The compiler aggressively warns about unreachable code and unused/unread variables, so it's not easy to write undetected Weird code.
Rust tried having an Underhanded Code Contest, but it has been very Underwhelming.
Rust does not claim to be particularly security-focused, only memory safe.
Also, this means that you'd consider any expression-based language to be inherently a security problem.
Rust is not written as a pure expression based language. And as we all know very well from the experience with C and JS, any unexpected and weird looking code has the potential to hide great harm. Allowing programmers to stray too much from expected idioms is dangerous.
Being security-focused requires you to care about a laundry list of things, including memory safety. But on its own, caring about memory safety just means... you care about memory safety.
It’s not purely expression based but it is very close to it, there’s only a few kinds of statements, the vast majority of things are expressions.
The submission shows weird program snippets. I don’t think it shows weird snippets that can also easily hide bugs?
But also, all examples in TFA are very artificial convoluted code. Meaning that you can write things like these just like you can write something like &&&...x - but why would you? Actual real-world uses of this feature are all quite readable.
type Foo struct{}
func (Foo) Bar() { println("weird...") }
func main() {
([...]func(){^^len(`
`): (&Foo{}).Bar})[cap(append([]any(nil),1,2,3))]()
}
type __ *[]*__