Rust's Block Pattern - Hacker News

Posted by zdw 5 days ago

Rust's Block Pattern (notgull.net)

217 points | 114 comments

koakuma-chan 4 days ago|

I have one better: the try block pattern.

https://doc.rust-lang.org/beta/unstable-book/language-featur...

Sytten 4 days ago||

I want that stabilized so bad but it's not been really moving forward.

mbrubeck 4 days ago|||

There's some active work recently on fixing blocking issues, e.g.:

https://github.com/rust-lang/rust/pull/148725

https://github.com/rust-lang/rust/pull/149489

mmastrac 4 days ago||||

I was not a fan when I first saw it but I'm becoming desperate to have it the more Rust I write.

koakuma-chan 4 days ago||||

#![feature(try_blocks)]

You only live once.

dwattttt 4 days ago||

I've tried it recently; from memory, error type inference wasn't that great through it.

JoshTriplett 4 days ago||

That's exactly what's currently being fixed before stabilizing it.

stouset 4 days ago|||

Out of curiosity why can’t a block just do this natively?

masklinn 4 days ago|||

Because it would massively alter language semantics? It converts returns from the nearest function into returns from the nearest (try) block.

lunar_mycroft 4 days ago|||

Because then you couldn't use ? to propagate errors if they occurred inside any loops or branches within the function, which would be a significant limitation.

loeg 4 days ago|||

Can this just be done as a lambda that is immediately evaluated? It's just much more verbose.

    let x = (|| -> Result<i32, std::num::ParseIntError> {
         Ok("1".parse::<i32>()?
          + "2".parse::<i32>()?
          + "3".parse::<i32>()?)
    })();

rendaw 4 days ago|||

That prevents other control flow mechanisms (return, break) from operating past the function boundary. In general, I avoid single-callsite functions as much as possible (including the iterator api) for this reason.

ahartmetz 4 days ago||

It sounds like you're fighting the language - Rust is sort of FP-light and you're encouraged to return a null/error value from the intermediate calculation instead of doing an early return from the outer scope. It's a nice and easy to follow way to structure the code IME. Yes, it's more verbose when an early return would have been just right - so be it.

dzaima 4 days ago||

For the case where `try` is useful over the functional form (i.e. parent's situation of having a desired Result, plus some unrelated early-returning), that ends up with nested `Result`s though, i.e. spamming an `Ok(Ok(x))` on all the non-erroring cases, which gets ugly fast.

skribanto 4 days ago||

Why couldn't you flatten it?

dzaima 2 days ago||

You have three different value cases (main value, main Err case for `?` to consume, and whatever early-return case). And the `?` operator fully taking up the Err result case means your main-result+early-return values strictly must both be wrapped in an Ok.
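
A rough sketch of that three-way split on stable Rust, using an immediately invoked closure. The function name `sum_lines` and the input format are made up for illustration, and the inner layer here is an `Option` rather than a second `Result`, but the shape is the same: `?` owns the `Err` case, so both the main value and the "skip" case have to travel inside `Ok`.

```rust
use std::num::ParseIntError;

// Hypothetical example: sums lines like "1+2", skipping lines that start with '#'.
fn sum_lines(lines: &[&str]) -> Result<i32, ParseIntError> {
    let mut total = 0;
    for line in lines {
        // The closure cannot `continue` the outer loop, so "skip" is encoded
        // as Ok(None), and every successful value is wrapped as Ok(Some(..)).
        let parsed = (|| -> Result<Option<i32>, ParseIntError> {
            if line.starts_with('#') {
                return Ok(None); // stands in for `continue`
            }
            let mut sum = 0;
            for part in line.split('+') {
                sum += part.trim().parse::<i32>()?; // Err is consumed by `?`
            }
            Ok(Some(sum))
        })()?;

        match parsed {
            Some(value) => total += value,
            None => continue,
        }
    }
    Ok(total)
}
```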

schneems 4 days ago||||

Wouldn't that also move any referenced variables too? Unlike the block example that would make this code not identical to what it's replacing.

pflanze 4 days ago||

No, unless you ask for it via the `move` keyword in front of the closure.

This works fine: https://play.rust-lang.org/?version=stable&mode=debug&editio...
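
For readers who can't open the playground link, here is a separate minimal sketch of the same point (it is not the linked code, and `sum_parsed` is a made-up name): without `move`, the closure only borrows what it reads, so the variable stays usable afterwards.

```rust
// Hypothetical example: parse and sum the inputs, then keep using the Vec.
fn sum_parsed(inputs: &[&str]) -> (Result<i32, std::num::ParseIntError>, usize) {
    let owned: Vec<String> = inputs.iter().map(|s| s.to_string()).collect();

    // No `move` keyword: the closure borrows `owned` instead of taking it.
    let total = (|| -> Result<i32, std::num::ParseIntError> {
        let mut sum = 0;
        for s in &owned {
            sum += s.parse::<i32>()?;
        }
        Ok(sum)
    })();

    // `owned` was not moved into the closure, so it can still be used here.
    (total, owned.len())
}
```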

saghm 4 days ago|||

My instinct is this would get hairy much faster if you want to actually close over variables compared to using a block.

ahartmetz 4 days ago|||

Not sure if that is relevant to your point, but: For better and for worse, closing over any outer scope variables is syntactically free in Rust lambdas. You just access them.

nicoburns 4 days ago||

It's syntactically free, but it can cause borrow-checker errors that cause your code to outright fail to compile.

saghm 4 days ago||

Yes, exactly. My concerns were semantic, not syntactic.

loeg 4 days ago|||

If the verbose return type syntax can't be elided, I think it's more or less dead as a pattern.

tayo42 4 days ago|||

Why does this need special syntax? Couldn't blocks do this if the expression returns a result in the end?

bobbylarrybobby 4 days ago|||

Try blocks let you encapsulate the early-return behavior of Try-returning operations so that they don't leak through to the surrounding function. This lets you use the ? operator 1. when the Try type doesn't match that of the function this is taking place in 2. when you want to use ? to short circuit, but don't want to return from the enclosing function. For instance, in a function returning Result<T,E>, you could have a try block where you do a bunch of operations with Option and make use of the ? operator, or have ? produce an Err without returning from the enclosing function. Without try blocks, you pretty much need to define a one-off closure or function so that you can isolate the use of ? within its body.
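
Since `try` blocks are still unstable, here is a sketch of the one-off closure workaround on stable Rust. The helper `port_from` and the "host:port" format are assumptions made up for this example. The `?` on `Option` short-circuits out of the closure only, while the enclosing function returns a `Result`.

```rust
use std::collections::HashMap;

// Hypothetical helper: reads a "host:port" string stored under `key`.
fn port_from(config: &HashMap<String, String>, key: &str) -> Result<u16, String> {
    // The closure plays the role a `try` block would play:
    // each `?` here exits the closure, not `port_from`.
    let port = (|| -> Option<u16> {
        let value = config.get(key)?;               // None if the key is missing
        let (_host, port) = value.split_once(':')?; // None if there is no ':'
        port.parse::<u16>().ok()                    // None if it is not a u16
    })();

    // Back in the Result-returning function, convert the Option to a Result.
    port.ok_or_else(|| format!("no usable port under {:?}", key))
}
```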

mwcz 4 days ago||||

The best part of try blocks is the ability to use the ? operator within them. Any block can return a result, but only function blocks (and try blocks) can propagate an Err with the ? operator.

jeroenhd 4 days ago||||

Not without being able to use the ? operator.

The closest thing I can think of that will let you return a result from within a separate scope using a set of foo()? calls would be a lambda function that's called immediately, but that has its own problems when it comes to moving and it probably doesn't compile to very fast code either. Something like https://play.rust-lang.org/?version=stable&mode=debug&editio...

koakuma-chan 4 days ago|||

One reason is that would be a breaking change.

valcron1000 4 days ago|||

One of the first things I tried in Rust a couple of years ago coming from Haskell. Unfortunately it's still not stabilized :(

oniony 4 days ago|||

Now that is pretty cool.

satvikpendem 4 days ago||

Ah yes, do-notation.

emtel 4 days ago||

There are some situations with tricky lifetime issues that are almost impossible to write without this pattern. Trying to break code out into functions would force you to name all the types (not even possible for closures) or use generics (which can lead to difficulties specifying all required trait bounds), and `drop()` on its own is of no use since it doesn't affect the lexical lifetimes.

nemo1618 4 days ago|

Conversely, I use this "block pattern" a lot, and sometimes it causes lifetime issues:

    let foo: &[SomeType] = {
        let mut foo = vec![];
        // ... initialize foo ...
        &foo
    };

This doesn't work: the memory is owned by the Vec, whose lifetime is tied to the block, so the slice is invalid outside of that block. To be fair, it's probably best to just make foo a Vec, and turn it into a slice where needed.
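
A minimal sketch of that fix, with a made-up function name and element type: the block yields the owned Vec, and the slice is borrowed afterwards, where the Vec is still alive.

```rust
// Hypothetical example: build a Vec inside a block, borrow it as a slice later.
fn squares_up_to(n: u32) -> u32 {
    // The block hands back the Vec itself, so no borrow escapes the block.
    let foo: Vec<u32> = {
        let mut foo = vec![];
        for i in 1..=n {
            foo.push(i * i);
        }
        foo
    };

    // Borrow as a slice only where a slice is needed.
    let view: &[u32] = &foo;
    view.iter().sum()
}
```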

saghm 4 days ago|||

Unless I'm misunderstanding, you'd have the same lifetime issue if you tried to move the block into a function, though. I think the parent comment's point is that it causes fewer issues than abstracting to a separate function, not necessarily compared to inlining everything.

adrianN 4 days ago||||

Avoiding that kind of use after free problem is exactly why people choose Rust, isn’t it?

janquo 4 days ago|||

There is some experimental work for that here I believe:

https://doc.rust-lang.org/beta/unstable-book/language-featur...

AFAIU it essentially creates a variable in inner scope but defers drop to the outer scope so that you can return the reference

bryanlarsen 4 days ago||

More significantly the new variables x and y in the block are Drop'd at the end of the block rather than at the end of the function. This can be significant if:

- Drop does something, like close a file or release a lock, or

- x and y don't have Send and/or Sync, and you have an await point in the function or are doing multi-threaded stuff

This is why you should almost always use std::sync::Mutex rather than tokio::sync::Mutex. std's Mutex isn't Sync/Send, so the compiler will complain if you hold it across an await. Usually you don't want mutexes held across an await.

bryanlarsen 4 days ago||

oops: Of course the Mutex is Sync/Send, that's the whole point of a Mutex. It's the std::sync::MutexGuard that's not.

defen 4 days ago|||

Can this also affect stack usage? Like if `x` gets dropped before `y` is introduced, can `y` reuse `x`'s stack space (let's assume they are same size/alignment). Or does the compiler already do that if it can see that one is not used after the other is introduced?

loeg 4 days ago||

Conceivably, yes.

tstenner 4 days ago||

I have been using this in a web application that acquires a lock, retrieves and returns a few variables to the outer scope, and then immediately unlocks the mutex again.
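
Roughly like this, as a sketch. The `Counters` struct and `snapshot` function are invented for illustration. The guard is a local of the block, so it is dropped (and the mutex unlocked) at the closing brace, and only the copied values escape.

```rust
use std::sync::Mutex;

// Hypothetical shared state.
struct Counters {
    hits: u64,
    misses: u64,
}

// Returns the two counters plus whether the lock was free right after the block.
fn snapshot(state: &Mutex<Counters>) -> (u64, u64, bool) {
    let (hits, misses) = {
        let guard = state.lock().unwrap();
        (guard.hits, guard.misses)
    }; // `guard` is dropped here, which releases the lock

    // The lock is no longer held, so try_lock succeeds on this thread.
    let lock_is_free = state.try_lock().is_ok();
    (hits, misses, lock_is_free)
}
```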

JDye 4 days ago||

Our codebase is full of this pattern and I love it. Every time I get to clean up temporaries and expose an immutable variable outside of the setup, it makes me way too happy.

A lot of the time it looks like this:

  let config = {
      let config = get_config_bytes();
      let mut config = Config::from(config);
      config.do_something_mut();
      config.do_another_mut();
      config
  };

bobbylarrybobby 4 days ago||

You can also de-mut-ify a variable by simply shadowing it with an immutable version of itself:

  let mut data = foo();
  data.mutate();
  let data = data;

May be preferable for short snippets where adding braces, the yielded expression, and indentation is more noise than it's worth.

bryanlarsen 4 days ago|

Variable shadowing felt wrong for a while because it's considered verboten in so many other environments. I use it fairly liberally in rust now.

kibwen 4 days ago||

It helps that the specific pattern of redeclaring a variable just to change its mutability for the remainder of its scope is about the least objectionable use of shadowing possible.

bryanlarsen 2 days ago||

That's not the only place I use shadowing though. I use it much more liberally.

For example I feel this is right:

    let x = x.parse()?;

ziml77 4 days ago||

Blocks being expressions is one of the features of the Rust language I really love (and yes I know it's not something Rust invented, but it's still not in many other popular languages).

That last example is probably my biggest use of it because I hate having variables being unnecessarily mutable.

ghosty141 4 days ago|

In my opinion it's the 'correct' design, I don't see any advantage from not doing this.

nadinengland 5 days ago||

I love that this is part of the syntax.

I typically use closures to do this in other languages, but the syntax is always so cumbersome. You get the "dog balls" that Douglas Crockford always called them:

``` const config = (() => { const raw_data = ...

  ...

  return compiled;

})();

const result = config.whatever;

// carry on

return result; ```

Really wish blocks were expressions in more languages.

dwattttt 4 days ago||

By the by, code blocks on here are denoted by two leading spaces on each line

  like
  this

nadinengland 1 day ago||

ah, much appreciated!

notpushkin 4 days ago|||

Interesting that you can use blocks in JS:

  {
    const x = 5;
    x + 5
  }
  // => 10
  x
  // => undefined

But I don’t see a way to get the result out of it. As soon as you try to use it in an expression, it will treat it as an object and fail to parse.

charleszw 4 days ago|||

Yes, I constantly use this pattern in C++/JavaScript, although I haven't tested how performant it is in the former (what does the compiler even do with such an expression?)

paavohtl 4 days ago||

At least in simple cases the compiler will just inline the closure, as if it never existed. There shouldn't be any measurable overhead.

pwdisswordfishy 4 days ago||

https://github.com/tc39/proposal-do-expressions

(Not to be confused with do notation)

saghm 4 days ago||

For those who might not have seen it, you can use this to make a `while` act like a `do-while` loop by putting the entire body in the boolean clause (and then putting an empty block for the actual body):

    // double the value of `x` until it's at least 10
    while { x = x * 2; x < 10 } {}

This isn't something that often will end up being more readable compared to another way to express it (e.g. an unconditional `loop` with a manual `break`, or refactoring the body into a separate function to be called once before entering the loop), but it's a fun trick to show people sometimes.
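
A self-contained version of the trick, wrapped in a made-up function so it can be run. Note the assumption that `x` starts above zero, otherwise doubling never reaches the limit.

```rust
// Hypothetical example: doubles `x` at least once, then keeps doubling
// until it reaches `limit`. Assumes x > 0 and that the result fits in u32.
fn double_until_at_least(mut x: u32, limit: u32) -> u32 {
    // The "body" lives in the condition block; the real body is empty.
    while { x *= 2; x < limit } {}
    x
}
```

Because the doubling happens before the check, `double_until_at_least(20, 10)` still doubles once, which is the do-while behavior.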

esafak 4 days ago||

Block expression https://doc.rust-lang.org/reference/expressions/block-expr.h...

Also in Kotlin, Scala, and nim.

IshKebab 4 days ago|

I think this comes from functional programming. I'd just call it "everything is an expression" (which isn't quite true in Rust but it's a lot more true than it is in traditional imperative languages like C++ and Python).

afdbcreid 3 days ago||

The only things in Rust that are real statements are `let` statements, and item statements (e.g. declaring an `fn` inside a function). All other statements are in fact expressions, although some always return `()` so they're not really useful as such.

IshKebab 3 days ago||

Surely declaring structs, traits, top-level functions, etc?

atq2119 4 days ago|

Not mentioned in the article but kinda neat: you can label such a block and break out of it, too! The break takes an argument that becomes the value of the block that is broken out of.
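
A small sketch of what that looks like (the function `classify` and its categories are made up for this example):

```rust
// Hypothetical example: the labeled block yields whichever value it is broken with.
fn classify(input: &str) -> &'static str {
    let kind = 'check: {
        if input.is_empty() {
            break 'check "empty"; // the block evaluates to "empty"
        }
        if input.chars().all(|c| c.is_ascii_digit()) {
            break 'check "number";
        }
        "text" // the normal tail expression is used if nothing breaks out
    };
    kind
}
```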

the__alchemist 4 days ago|

I just learned this one, and am gradually starting to use it! It applies for loops too. I saw it in ChatGPT code, and had to stop and look it up. Rust is a big language, for worse and for better.

kibwen 4 days ago|||

I wouldn't call Rust "a big language" because of labeled break. This is a pretty standard language feature, you can do the same in C (and therefore C++), Go, Javascript, Java, C#...

atq2119 4 days ago||

Those languages don't treat blocks as expressions, so you really can't do the same thing there. Something very similar, yes. But not the same.

kibwen 4 days ago||

Those languages aren't expression-oriented, so you would need to assign the result to a previously-initialized variable in a higher scope. But that just makes this pattern clunkier in those languages. This subthread is about jumping to labels, which is a relatively obscure yet widespread feature supported by many languages (though C and Go allow forward jumps, and the rest only allow backward jumps, since the latter ensures that control flow does not become irreducible).

tialaramex 4 days ago|||

  break 'label value;

... is something to be used very sparingly. I reckon I write a new one about once a year.

Very often if you think harder you realise you didn't want this, you should write say, a function (from which you can return) or actually you didn't want to break early at all. Not always, but often. If you write more "break 'label value" than just break then you are almost certainly Doing It Wrong™.

the__alchemist 4 days ago||

Not having put it into practice yet, there is a pattern I use regularly which I plan to replace with the labeled one: I set a flag at the top of the outer loop, which contains an inner loop. The inner loop can set this flag. Directly past the inner loop, I check for the flag, then break. I am pretty sure this is exactly what the labeled break is for.
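
A sketch of the labeled version, assuming a simple grid search (`find_cell` is a made-up name): the inner loop breaks the outer loop directly, so no flag is needed.

```rust
// Hypothetical example: returns the (row, column) of the first cell equal to `target`.
fn find_cell(grid: &[Vec<i32>], target: i32) -> Option<(usize, usize)> {
    let mut found = None;
    'outer: for (r, row) in grid.iter().enumerate() {
        for (c, &cell) in row.iter().enumerate() {
            if cell == target {
                found = Some((r, c));
                break 'outer; // leaves both loops at once
            }
        }
    }
    found
}
```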
