Toasty, an async ORM for Rust

Posted by steveklabnik 3 days ago

204 points | 131 comments

OtomotO 1 day ago|

ORM has never worked for me in any language.

Sooner or later we always hit the n+1 query problem which could only be resolved by a query builder or just plain old sql.

It always was a mess and these days I can't be bothered to try it even anymore because it has cost me a lot of hours and money.
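
For readers who have not hit it, the n+1 shape mentioned above can be sketched roughly as follows. This is only an illustration: `FakeDb`, its methods, and the `posts`/`comments` tables are hypothetical stand-ins that count round trips, not any real ORM or driver API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a database connection; it only counts round trips.
struct FakeDb {
    comments: Vec<(u32, &'static str)>, // (post_id, comment body)
    queries: usize,
}

impl FakeDb {
    /// One round trip per call, like `SELECT body FROM comments WHERE post_id = ?`.
    fn comments_for_post(&mut self, post_id: u32) -> Vec<&'static str> {
        self.queries += 1;
        self.comments
            .iter()
            .filter(|(pid, _)| *pid == post_id)
            .map(|(_, body)| *body)
            .collect()
    }

    /// One round trip in total, like `... WHERE post_id IN (...)`.
    fn comments_for_posts(&mut self, post_ids: &[u32]) -> HashMap<u32, Vec<&'static str>> {
        self.queries += 1;
        let mut grouped: HashMap<u32, Vec<&'static str>> = HashMap::new();
        for (pid, body) in &self.comments {
            if post_ids.contains(pid) {
                grouped.entry(*pid).or_default().push(*body);
            }
        }
        grouped
    }
}

/// N+1 shape: one query for the posts plus one query per post.
fn load_n_plus_one(db: &mut FakeDb, post_ids: &[u32]) -> usize {
    db.queries = 1; // stands for the initial `SELECT id FROM posts`
    for id in post_ids {
        let _ = db.comments_for_post(*id);
    }
    db.queries
}

/// Batched shape: one query for the posts plus one query for all comments.
fn load_batched(db: &mut FakeDb, post_ids: &[u32]) -> usize {
    db.queries = 1; // stands for the initial `SELECT id FROM posts`
    let _ = db.comments_for_posts(post_ids);
    db.queries
}
```

With three posts, the first function issues four round trips and the second issues two; the gap grows linearly with the number of parent rows.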

srik 1 day ago||

Yes, plain sql is indeed the bees knees but there are good ORMs like django/ecto etc. that let you consider N+1 query issues ahead of time. Most ORMs these days have escape hatches anyway. Patience might be needed to keep it all tidy but they don't necessarily have to be a mess.

s6af7ygt 1 day ago||

I don't get why to use an ORM in the first place. Just define a bunch of structs, run a query, map results to structs. It's a few lines of simple code. You're in control of everything (the SQL, the running, the mapping). It's transparent. With any ORM, you give away control and make everything more complex, only to make it slightly easier to run a query and map some results.
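
A minimal sketch of the "define structs, run a query, map results" approach might look like this. The `Row` type is an assumption (a simplified column-name-to-text map, not what any particular driver returns), and the `users` table and `User` struct are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical raw row: column name -> text value. Real drivers expose
/// their own row types; this alias only keeps the sketch self-contained.
type Row = HashMap<&'static str, String>;

#[derive(Debug, PartialEq)]
struct User {
    id: i64,
    name: String,
}

/// The SQL stays a plain string that the author fully controls.
const USERS_BY_NAME: &str = "SELECT id, name FROM users WHERE name = $1";

impl User {
    /// Hand-written mapping from one row to one struct.
    fn from_row(row: &Row) -> Result<User, String> {
        let id = row
            .get("id")
            .ok_or("missing column `id`")?
            .parse::<i64>()
            .map_err(|e| format!("bad `id`: {e}"))?;
        let name = row.get("name").ok_or("missing column `name`")?.clone();
        Ok(User { id, name })
    }
}
```

Nothing here tracks object identity or writes changes back, which is the part of an ORM this approach leaves out on purpose.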

JodieBenitez 1 day ago|||

> Just define a bunch of structs, run a query, map results to structs

Congrats, you now have your own little ORM.

RandomThoughts3 1 day ago|||

No, absolutely not.

Op is never implying they intend to maintain one to one correspondence between the DB and objects and do that through manipulating objects only. Mapping hand written queries results to structs and updating the DB yourself on the basis of what is in structs is not at all an ORM.

ndriscoll 1 day ago|||

Not in most modern web application servers. ORMs seem to solve the problem of synchronizing some persistent application state (like a desktop GUI app) to your database state, but web application servers are usually relatively stateless. It's better to think of the application's job as taking a request, parsing it, compiling it to SQL, handing that to a database, and serializing the results.

Through that lens, the parts where you load and save object state are redundant. You're going to throw those objects away after the request anyway. Just take your request and build an UPDATE, etc. Use record types merely as a way to define your schema.

tekkk 1 day ago|||

No type safety & writing manual SQL is slower. I get your point but the bottleneck is often development speed, not query efficiency. I know and hate how stupid the ORM is underneath but I have to admit it's a blessing that I don't have to think about SQL at all (until I do).

jamil7 1 day ago|||

This is pretty much where I landed as well, I also love being able to quickly copy and run SQL queries to test and modify them somewhere else.

JodieBenitez 1 day ago|||

It's not a black or white thing. Good ORMs let you use plain old SQL when needed.

OtomotO 1 day ago||

As said, they have cost me too much time and money already, more so as other devs on the team(s) leaned heavily on certain features and I had to rewrite a lot of code.

0x457 21 hours ago||

Why are you rewriting? 80%[1] of the queries most users run can be handled efficiently by an ORM. I might need a hand-written query a few times, either because that particular query is faster to write by hand or because the ORM builds a bad query. That is it, no need to throw away the entire ORM because of that.

When I was in RoR world, pretty much every N+1 query I saw was due to lack of RTFM.

[1]: I made this up

OtomotO 3 hours ago||

I need to rewrite the parts that are broken and, without going into too much detail: it's a lot of code where we had no problems with hundreds of rows but now with thousands (so nothing, lol, I've worked on projects with hundreds of millions of rows) we get severe performance problems.

Because it's half a dozen joins and hence no N+1 query but actually N*6+1 queries...

And yes, RTFM is nice, problem is: it's my fucking partners that should've done this before we shipped it to the customer which they abandoned and I did not.

p0w3n3d 21 hours ago||

What do you recommend? Do the mapping manually? Tbh I tried that while learning rust and it was awful.

On the other hand an async orm sounds like (n+1)(n+2)+...+(n+m) Problem

alilleybrinker 3 days ago||

Very interested in exploring how this will compare to Diesel [1] and SeaORM [2], the other two options in this space today. Joshua Mo at Shuttle did a comparison between Diesel and SeaORM in January of this year that was really interesting [3].

[1]: https://diesel.rs/

[2]: https://www.sea-ql.org/SeaORM/

[3]: https://www.shuttle.dev/blog/2024/01/16/best-orm-rust

tuetuopay 3 days ago||

My first reaction is this feels like a nice middleground between Diesel and SeaORM.

The codegen part makes all columns and tables and stuff checked at compile-time (name and type) like Diesel, with a query builder that's more natural like SeaORM. I hope the query builder does not end up too magical like SQLAlchemy with its load of footguns, and stay close in spirit to Diesel that's "write sql in rust syntax".

I think time will tell, and for now I'm keeping my Diesel in production :D

karunamurti 3 days ago|||

Sea ORM is too opinionated in my experience. Even making migration is not trivial with their own DSL. Diesel was ok, but I never use it anymore since rocket moved to async.

I mainly use sqlx. It's simple to use, and the query! and query_as! macros are good enough for most cases.

sampullman 1 day ago|||

I use SQLx, but I'm not totally convinced it's better than writing raw SQL with the underlying postgres/sqlite3/mysql driver. The macros and typing fall apart as soon as you need anything more complicated than a basic SELECT with one to one relationships, much less one/many to many.

I remember fighting with handling enums in relations for a while, and now just default to manually mapping everything.

echelon 1 day ago||

SQLx can handle complicated queries as long as they're completely static strings. We've got SELECT FOR UPDATE, upserts, and some crazy hundred-line queries that are fine with their macros.

SQLx sucks at dynamic queries. Dynamic predicates, WHERE IN clauses, etc.

For SQLx to be much more useful, their static type checker needs to figure out how to work against these. And it needs a better query builder DSL.
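
To make the WHERE IN case concrete: the query text depends on how many values arrive at runtime, so it cannot be a single static string. A rough sketch of building it by hand, assuming Postgres-style `$n` placeholders and a hypothetical `users` table (the helper name is made up for illustration):

```rust
/// Builds `SELECT id, name FROM users WHERE id IN ($1, $2, ...)` for `n` values.
/// Returns None for n == 0, because `IN ()` is not valid SQL.
fn select_users_in(n: usize) -> Option<String> {
    if n == 0 {
        return None;
    }
    // One numbered placeholder per value; the values are bound separately.
    let placeholders: Vec<String> = (1..=n).map(|i| format!("${i}")).collect();
    Some(format!(
        "SELECT id, name FROM users WHERE id IN ({})",
        placeholders.join(", ")
    ))
}
```

The string is assembled at runtime, so nothing in it can be checked at compile time, which is the trade-off described above.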

sampullman 1 day ago||

Right, it's not bad if you stick with what the type checker can handle, but I usually end up falling back on manual building with the majority of queries in any semi-complex app.

It doesn't end up being too bad though, except for the loss of compile time syntax checking. Manually handling joins can be kind of nice, it's easier to see optimizations when everything is explicit.

sverro2 2 days ago|||

I like sqlx, but have been eyeing diesel for some time. Any reasons you don't use diesel_async?

malodyets 1 day ago||

With Diesel async integrating everything with the pooling is a bit hairy. With sqlx everything just works.

Onavo 3 days ago||

It's nice seeing more Django/Prisma style ORMs where the non-SQL source code is the source of truth for the schema and migrations are automatically generated.

xpe 1 day ago||

I wish the following three paragraphs were widely read and understood by all software developers, especially web developers:

> The common wisdom is to maximize productivity when performance is less critical. I agree with this position. When building a web application, performance is a secondary concern to productivity. So why are teams adopting Rust more often where performance is less critical? It is because once you learn Rust, you can be very productive.

> Productivity is complex and multifaceted. We can all agree that Rust's edit-compile-test cycle could be quicker. This friction is countered by fewer bugs, production issues, and a robust long-term maintenance story (Rust's borrow checker tends to incentivize more maintainable code). Additionally, because Rust can work well for many use cases, whether infrastructure-level server cases, higher-level web applications, or even in the client (browser via WASM and iOS, MacOS, Windows, etc. natively), Rust has an excellent code-reuse story. Internal libraries can be written once and reused in all of these contexts.

> So, while Rust might not be the most productive programming language for prototyping, it is very competitive for projects that will be around for years.

rfoo 1 day ago|

I'd add that a lot of the described advantages come from culture. For web applications manual memory management is 100% a friction instead of a relief. But the culture in the Rust community in general, at least for the past ten years or so, is to encourage a coding style with inherently fewer bugs and more reusable, maintainable code, to the point of consistently preferring that something not happen if they aren't sure they got it right (one may argue that this is counterproductive short-term).

It is this culture thing that makes adopting Rust for web apps worthwhile - it counters the drawback of manual memory management.

If you hire an engineer already familiar with Rust you are sure you get someone who is sane. If you onboard someone with no Rust background you can be pretty sure that they are going to learn the right way (tm) to do everything, or fail to make any meaningful contribution, instead of becoming a -10x engineer.

If you work in a place with a healthy engineering culture that trains people well and has good infra, it doesn't really matter, you may as well use C++. But for those of us not so lucky, Rust helps a lot, and it is not about memory safety, at all.

xpe 4 hours ago||

I haven’t worked at a place that checks the above boxes for making C++ a great choice for bulletproof code. There seems to be large variation in C++ styles and quality across projects. But it seems to me that for orgs that indeed do C++ well, thanks to the supporting aspects above, moving to Rust might make things even smoother.

the__alchemist 1 day ago||

Et tu, toasty?

As time passes, the more I feel like a minority in adoring rust while detesting Async. I have attempted it a number of times, but it seems incompatible with my brain's idea of structure. Not asynchronous or concurrent programming, but Async/Await in rust. It appears that most of the networking libraries have committed to this path, and embedded is moving in its direction.

I bring this up because a main reason for my distaste is Async's incompatibility with non-Async. I also bring this up because lack of a Django or SQLAlchemy-style ORM is one reason I continue to write web applications in Python.

jasdfuwjass 1 day ago||

> I bring this up because a main reason for my distaste is Async's incompatibility with non-Async. I also bring this up because lack of a Django or SQLAlchemy-style ORM is one reason I continue to write web applications in Python.

So you use gevent/greenlet?

littlestymaar 1 day ago||

Async code is not incompatible with blocking code. In Rust it's quite straightforward to make the two interoperate: calling blocking code from async is done with spawn_blocking, and the reverse (async from blocking code) is done with block_on.

the__alchemist 1 day ago||

I think this is core to the disconnect: Non Async/await does not imply blocking.

trevyn 1 day ago|||

Non-async functions are absolutely blocking. The question is if they’re expected to block for a meaningful amount of time, which is generally suggested by your async runtime.

It’s really not that bad, you might just need a better mental model of what’s actually happening.

LtdJorge 1 day ago||

Depends, on Linux you can call set_nonblocking on a TcpListener and get a WouldBlock error whenever a read would block. That's called non-blocking.
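
A small std-only sketch of that behaviour, for anyone who wants to try it. Note that for a listener the call that would block is `accept` rather than a read; the helper name `try_accept_once` is made up for this example.

```rust
use std::io::ErrorKind;
use std::net::TcpListener;

/// Polls a listener once without blocking. Returns Ok(true) if a connection
/// was accepted and Ok(false) if accepting would have blocked.
fn try_accept_once(listener: &TcpListener) -> std::io::Result<bool> {
    listener.set_nonblocking(true)?;
    match listener.accept() {
        Ok((_stream, _addr)) => Ok(true),
        // No pending connection: the call returns immediately instead of waiting.
        Err(e) if e.kind() == ErrorKind::WouldBlock => Ok(false),
        Err(e) => Err(e),
    }
}
```

Calling this on a freshly bound listener with no clients returns right away with `Ok(false)`; the caller then decides when to poll again, which is the by-hand scheduling an event loop would otherwise do.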

jasdfuwjass 1 day ago||

Doesn't this miss the forest for the trees? The entire point is to drive with epoll.

LtdJorge 1 day ago||

Well, yes. But it means you can do sync non-blocking IO by hand.

littlestymaar 1 day ago|||

If it's neither blocking nor async then it's a completely regular function and you don't even have to call it with spawn_blocking; there's nothing that prevents calling a normal function from an async one.

And in the opposite situation, if you call an async function then you are doing IO so your function must be either async or blocking, there's no third way in this direction, so when you're doing IO you have to make a choice: you either make it explicit (and thus declare the function async) or you hide it (by making a blocking call).

A blocking function is just a function doing IO that hides it from the type system and pretends to be a regular function.

tempest_ 1 day ago||

A blocking function is one that blocks the event loop from switching to another task. It doesn't matter what it is doing, only that it is doing something and not hitting another await to release the loop to work on another task. A simple function with a while loop can block the event loop if it doesn't contain any awaits.

littlestymaar 1 day ago||

This is an implementation detail that can leak from single-threaded event loops (JavaScript typically) but this isn't true of multithreaded event loops, which can even have a preemption mechanism for long running tasks (for instance in Rust async-std has one IIRC).

There's a fundamental difference between CPU heavy workload that keep a thread busy and a blocking syscall: if you have as many CPU heavy tasks as CPU cores then there's fundamentally not much to do about it and it means your server is under-dimensioned for your workload, whereas a blocking syscall is purely virtual blocking that can be side-stepped.

LtdJorge 1 day ago||

Rust executors don't have real preemption, sadly. I'd love to have in Rust what the BEAM has for Erlang, block all you want, the rest of the processes (tasks) still run in time.

Also, the IO and the execution being completely tied (the executor provides the IO) is a wrong choice in my opinion. Hopefully in the future there is a way to implement async IO via Futures without relying on the executor, maybe by std providing more than just a waker in the passed-in context.

littlestymaar 22 hours ago||

> Also, the IO and the execution being completely tied (the executor provides the IO) is a wrong choice in my opinion.

It's more a consequence of having let tokio becoming the default runtime instead of having the foundational building blocks in the standard library than a language issue. But yes, the end result is unfortunate.

didip 1 day ago||

I think the custom schema definition file is not needed. Just define it in plain Rust. Not sure what the win is for this tool.

Ciantic 1 day ago||

It is nice to see more ORMs, but inventing a new file format and language `toasty` isn't my cup of tea. I'd rather define the models in Rust and let the generator emit more Rust files.

Creating your own file format is always difficult. Now, you have to come up with syntax highlighting, refactoring support, go to definition, etc. When I prototype, I tend to rename a lot of my columns and move them around. That is when robust refactoring support, which the language's own LSP already provides, is beneficial, and this approach throws them all away.

BluSyn 1 day ago||

My experience with Prisma, which has a very similar DSL for defining schemas, has changed my mind on this. Makes me much more productive when maintaining large schemas. I can make a one line change in the schema file and instantly have types, models, and up/down migrations generated and applied, and can be guaranteed correct. No issues with schema drift between different environments or type differences in my code vs db.

Prisma is popular enough that it also has LSP and syntax highlighting widely available. For a simple DSL this is actually very easy to build. Excited to have something similar in the Rust ecosystem.

simonask 1 day ago|||

I mostly agree with this, but the trouble is (probably) that proc-macros are heavy-handed, inflexible, and not great for compile times.

In this case, for example, it looks like the generated code needs global knowledge of related ORM types in the data model, and that just isn't supported by proc-macros. You could push some of that into the trait system, but it would be complex to the point where a custom DSL starts to look appealing.

Proc-macros also cannot be run "offline", i.e. you can't commit their output to version control. They run every time the compiler runs, slowing down `cargo check` and rust-analyzer.

trevyn 1 day ago||

You can absolutely do global knowledge in proc macros via the filesystem and commit their output to version control: https://github.com/trevyn/turbosql

satvikpendem 1 day ago||

Looks similar to Prisma Client Rust but because Prisma and its file format are already established unlike toasty files, might be easier to use that. However, this is by Tokio and PCR is relatively unknown with development being not too fast, so your mileage may vary. I've been using diesel (with diesel_async) so far.

Sytten 3 days ago||

For me diesel hits the right balance since it is more of a query builder and it is close to the SQL syntax. But sometimes it doesn't work because it is very strongly typed; right now I use sea-query for those scenarios and I built the bridge between the two.

Ideally I would use something akin to Go Jet.

aabhay 1 day ago||

Interesting take!

In my experience, Dynamo and other NoSQL systems are really expressive and powerful when you take the plunge and make your own ORM. That’s because the model of nosql can often play much nicer with somewhat unique structures like

- single table patterns
- fully denormalized or graph style structures
- compound sort keys (e.g. category prefixed)

Because of that, I would personally recommend developing your own ORM layer, despite the initial cost

smt88 1 day ago||

Why does a NoSQL or denormalized database need an ORM?

Developing your own ORM is almost always a waste of time and a bad idea.

aabhay 1 hour ago||

True, but there are benefits in some instances as well. For example, we store all rows as entity properties, not entities themselves. So a row would be the user’s email, one row for user name, etc. which makes it possible to do razor sharp queries over exactly what is needed. So while that doesn’t imply a standard ORM, if you want a `User` object you must write an ORM layer
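
The row-per-property layout described above could be folded back into objects along these lines. This is a sketch under assumptions: the `(entity_id, property, value)` row shape, the property names, and the `User` fields are all hypothetical, since the actual schema is not shown.

```rust
use std::collections::HashMap;

#[derive(Debug, Default, PartialEq)]
struct User {
    email: Option<String>,
    name: Option<String>,
}

/// Folds (entity_id, property, value) rows into one User per entity id.
fn assemble_users(rows: &[(&str, &str, &str)]) -> HashMap<String, User> {
    let mut users: HashMap<String, User> = HashMap::new();
    for (entity_id, property, value) in rows {
        let user = users.entry(entity_id.to_string()).or_default();
        match *property {
            "email" => user.email = Some(value.to_string()),
            "name" => user.name = Some(value.to_string()),
            _ => {} // unknown properties are ignored in this sketch
        }
    }
    users
}
```

Fields are `Option`s because a narrow query may fetch only some of an entity's property rows.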

fulafel 11 hours ago||

Do you find that you value the relational model that an ORM constructs on top of a non-relational DB? Or do you use it more like an "OM" without the R?

aabhay 1 hour ago||

That’s a great point. We don’t really use the R part so much. However, you can’t always avoid it. That said, if your concepts in the table themselves can be atomic or isolated then yes your object model can just be a wrapper of sorts that bundles convenience functionality around the row data.

colesantiago 3 days ago|

I don't get the pent up anger with ORMs, I used it for my SaaS on Flask that I run and own for 4 years bringing in over $2M+ ARR with no issues.

Great to see some development in this for Rust, perhaps after it becomes stable I may even switch my SaaS to it.

jeremyloy_wt 1 day ago||

The second that you would benefit from using a DBMS specific feature, the ORM begins getting in the way. It is highly unlikely that an ORM provides support, much less a good abstraction, over features that only 1/N supported DBMS have.

Your code ends up using the driver raw in these cases, so why not just use the driver for everything? Your codebase would be consistent at that point

fiedzia 1 day ago|||

>The second that you would benefit from using a DBMS specific feature, the ORM begins getting in the way.

You can extend diesel (and probably many other orms, Diesel is just particularly easy here) to support any db feature you want.

> It is highly unlikely that an ORM provides support, much less a good abstraction, over features that only 1/N supported DBMS have.

That depends on orm flexibility and popularity. It may not provide support OOTB, but can make it easy to add it.

> Your code ends up using the driver raw in these cases, so why not just use the driver for everything? Your codebase would be consistent at that point

Main point of using orm for me is that I have type verification, raw (as in text) breaks too easily.

simonask 1 day ago||

You can extend diesel in theory, but can you really in practice? In my experience, it's very hard to work with once you get into the weeds. It's a big mess of very complicated generic signatures.

Might have improved since last I checked, but I was pretty confused.

fiedzia 23 hours ago||

I've added some sql functions, and support for decimal type for mysql (It didn't have it at some point). Wasn't complicated.

rtpg 1 day ago||||

I have found that ORM arguments in context don’t stick very well to Django’s ORM, but see the argument applying well to most all the others.

Case in point Django is really good about DB-specific functionality and letting you easily add in extension-specific stuff. They treat “you can only do this with raw” more or less as an ORM design API issue.

My biggest critique of Django’s ORM is its grouping and select clause behavior can be pretty magical, but I’ve never been able to find a good API improvement to tackle that.

OJFord 1 day ago|||

Here's one: https://stackoverflow.com/questions/65596920/use-django-subq...

globular-toast 1 day ago|||

Django's ORM is the worst for object-relational impedance mismatch, though. Django is great if you're happy with thinly-veiled database tables. But it absolutely sucks if what you want is real objects representing business entities.

The simplest example is you can't build a Django object with a collection on it. Take the simplest toy example: a todo list. The natural model is simple: a todo list has a name and a list of items. You can't do that in Django. Instead you have to do exactly what you would do in SQL: two tables with item having a foreign key. There's no way to just construct a list with items in it. You can't test any business rules on the list without creating persistent objects in a db. It's crazy.

So yeah, Django lets you do loads with the relational side, but that's because it's doing a half-arsed job of mapping these to objects.

rtpg 1 day ago||

I mean first of all you could "just" use an array field for your list of items. Single model.

But then you have actual properties on your todo list. So even in your object model you already have two classes, and your todo list has a name and a list of items.

So there's not one class, there's two classes already.

As to "having a list", Django gives you reverse relations so you can do `my_list.items.all()`. Beyond the fact that your persistence layer being a database meaning that you need to do _something_, you're really not far off.

One could complain that `my_list.save()` doesn't magically know to save all of your items in your one-to-many. But I think your complaint is less about the relational model and much more about the "data persistence" question. And Django gives you plenty of tools to choose how to resolve the data persistence question very easily (including overriding `save` to save some list of objects you have on your main object! It's just a for loop!)

globular-toast 1 day ago||

Using an array is just giving up on a relational database. In fact what you'd do is use a JSON field, but at that point you don't need an ORM, just use an object database.

You can only do `my_list.items.all()` if you've already saved the related records in the db. And if you do something like `my_list.items.filter(...)` well that's another db query. A proper ORM should be able to map relationships to objects, not these thinly veiled db records. See how SQLAlchemy does it to see what I mean. In SQLAlchemy you can fully construct objects with multiple layers of composition and it will only map this to the db when you need it to. That means you can test your models without any kind of db interaction. It's the whole point of using an ORM really.

rtpg 11 hours ago||

I mean if you think SQLAlchemy does the job for you that's great! My general contention is more "there are good ORMs". I believe Django is the good one, but if you think SQLAlchemy works well for you, go for it!

globular-toast 8 hours ago||

They are all useful tools, but I think it's important to keep them in context. I feel like what most people want is the automatic SQL generation from their general purpose language of choice. That and a migration framework. But none of them should be considered a no brainer because they all come with considerable downsides. One of the most difficult things I've found in complex, long running projects is people clinging on to the ORM long after it's ceased to be useful. SQLAlchemy at least lends itself better to proper architecture with its data mapper, but Django really doesn't like being relegated to a lower level.

viraptor 1 day ago|||

Because you only need the specific features in a tiny amount of cases, while 99% is some flavour of SELECT * ... LEFT JOIN ... (If it's not, then sure, ORM would be annoying)

Making that 99% smaller, simpler and automatically mapping to common types makes development a lot easier/faster. This applies to pretty much any higher level language. It's why you can write in C, but embed an ASM fragment for that one very specific thing instead of going 100% with either one.

jruz 1 day ago|||

You’re probably making so much money that you don’t care about your database bill or query performance. An ORM is basically a no-code tool for databases; if that solves your problem, great, but that’s not something that would scale beyond basic use.

kyleee 3 days ago||

Has it benefited you? Have you moved to a different underlying SQL software without having to make any changes to your codebase? Or some other benefit?

carlgreene 1 day ago||

For me it’s speed of development. I’m frankly not very good at SQL, but an ORM in a familiar syntax to the language I use most (Typescript) increases my dev speed tremendously.

I also have a relatively successful saas that uses Prisma and it’s been phenomenal. Queries are more than fast enough for my use case and it allows me to just focus on writing more difficult business logic than dealing with complex joins
