
Posted by larelli 6 hours ago

Composition Shouldn't be this Hard (www.cambra.dev)
68 points | 46 comments
BoppreH 4 hours ago|
I agree that this is our profession's Achilles heel. But I see one more cause: binaries and processes are black boxes unless they deliberately implement an interconnect feature. Try to extract the list of playlists from a Spotify client, for example.

We lucked into filesystems that have open structures (even if the data is opaque). Perhaps we should be pushing for "in-memory filesystems" as a default way of storing runtime data, for example.
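
A minimal sketch of that "in-memory filesystem" idea: a process mounts its runtime state at paths, procfs-style, so other programs can discover and read it without a bespoke interconnect feature. All names here (`RuntimeFS`, `expose`, `read`, `ls`) and the music-player example are invented for illustration.

```python
# Toy "in-memory filesystem" for runtime data: a process exposes its state
# as a path-addressable tree instead of an opaque heap. Hypothetical API.

class RuntimeFS:
    def __init__(self):
        self._tree = {}

    def expose(self, path, value):
        """Mount a value (or a callable producing one) at a /-separated path."""
        parts = path.strip("/").split("/")
        node = self._tree
        for p in parts[:-1]:
            node = node.setdefault(p, {})
        node[parts[-1]] = value

    def read(self, path):
        """Read a leaf; callables are evaluated on access, like procfs files."""
        node = self._tree
        for p in path.strip("/").split("/"):
            node = node[p]
        return node() if callable(node) else node

    def ls(self, path="/"):
        """List entries under a directory path."""
        node = self._tree
        for p in path.strip("/").split("/"):
            if p:
                node = node[p]
        return sorted(node)

# A hypothetical music player exposing its playlists to the outside world:
fs = RuntimeFS()
fs.expose("/player/playlists/road-trip", ["Track A", "Track B"])
fs.expose("/player/status/uptime", lambda: 42)
print(fs.ls("/player/playlists"))
print(fs.read("/player/status/uptime"))
```

The point is discoverability: any tool that can walk paths can extract the playlist list, no deliberate export feature required.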

athrowaway3z 4 hours ago||
> I think I’ve found a model that can break out of this tradeoff. Implementing it is more than I can do alone

I think anything that can change this has to be simple enough that it'd be more effective to just explain the system and implement it than to wax lyrical about the general outline of part of the problem. Especially since the real target audience for an initial release, by necessity, needs to understand it.

There are some big leaps we could make by making code flatter: things like having the frontend and backend handlers in the same file, under the same compiler/type checker. But somebody will want to interact with a system outside the 'known world', and then you're writing bindings and https://xkcd.com/927/
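
As a toy illustration of that "one file, one type checker" idea (all names and the transport are invented, and JSON stands in for HTTP):

```python
# Frontend and backend share one request/response contract, defined once,
# so a single type checker sees both sides and they can't silently drift.
from dataclasses import dataclass, asdict
import json

@dataclass
class CreateUser:
    name: str
    email: str

@dataclass
class UserCreated:
    user_id: int
    name: str

def handler(req: CreateUser) -> UserCreated:
    # "Backend": the one place the contract is implemented.
    return UserCreated(user_id=1, name=req.name)

def client_create_user(name: str, email: str) -> UserCreated:
    # "Frontend": serializes over the same types; json stands in for the wire.
    wire = json.dumps(asdict(CreateUser(name=name, email=email)))
    return handler(CreateUser(**json.loads(wire)))

print(client_create_user("ada", "ada@example.com"))
```

The moment either side needs to talk to something outside this file, you're back to writing bindings, which is the xkcd/927 trap the comment describes.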

At the end of the day I think the core tension is that once the speed of light is noticeable to your use case, things become distributed, which creates the desire for separate rates of change. I'm not sure what would 'solve' that.

AI will be a plus, in that a single team can be in charge of more of the parts, leading to a more coherent whole.

Hope OP builds some nice tools, but I've seen too many of these attempts fail to get excited about "I think we found it".

simianwords 5 hours ago||
This reminds me of Rama [1] from Red Planet Labs

[1] https://redplanetlabs.com/programming-model

> What is Rama? Rama is a platform for building distributed backends as single programs. Instead of stitching together databases, queues, caches, and stream processors, you write one application that handles event ingestion, processing, and storage.
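
This is not Rama's actual API, but the "single program" idea it describes can be caricatured in a few lines: ingestion, stream processing, and an indexed store all living in one process instead of three stitched-together systems.

```python
# Toy sketch of the "distributed backend as a single program" model.
# Not Rama's API -- every name here is invented for illustration.
from collections import defaultdict, deque

class Backend:
    def __init__(self):
        self.queue = deque()              # stands in for the message queue
        self.store = defaultdict(int)     # stands in for the database/index

    def ingest(self, event):
        """Event ingestion: append to the in-process log."""
        self.queue.append(event)

    def process(self):
        """Stream processing: fold events into the materialized store."""
        while self.queue:
            event = self.queue.popleft()
            if event["type"] == "page_view":
                self.store[event["page"]] += 1

    def query(self, page):
        """Serving: read the materialized view directly."""
        return self.store[page]

backend = Backend()
backend.ingest({"type": "page_view", "page": "/home"})
backend.ingest({"type": "page_view", "page": "/home"})
backend.process()
print(backend.query("/home"))  # 2
```

The real system's value is making this shape survive distribution; the sketch only shows why having queue, processor, and store under one roof is appealing.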

LeCompteSftware 4 hours ago||
There's a contradiction here that needs to be untangled:

  There are many examples of models that enable coherent systems within specific domains:

  - Type systems in programming languages catch many logic errors and interface misuses

  - The relational model in databases enables programmers to access incredible scale and performance with minimal effort.

  [...]

  So coherent systems are great: everyone should just buy into whatever model will most effectively do the job. Right? Unfortunately, the listed models are all domain-specific–they don’t generalize to other contexts. And most modern internet software is not domain-specific. Modern applications typically span a wide variety of domains, including web and API serving, transaction processing, background processing, analytical processing, and telemetry. That means that trying to keep a system coherent limits what that system can ultimately do. As one implements more capabilities, application requirements push us outside of a single domain, forcing us to reach for components with a different internal model. So, bit by bit, our system fragments.
The problem of course is that type systems and databases are not meaningfully "domain-specific." They aren't technical magic bullets but they separately provide real value for the use cases of "web and API serving, transaction processing, background processing, analytical processing, and telemetry." So then why hasn't the industry settled on a specific type system? Why do database vendors (and the SQL standard) keep breaking the relational model in favor of something ad hoc and irritating?

I believe the real problem is that software is symbolic and the problems it solves usually aren't. Writing an application means committing to a certain set of symbolic axioms and derivation schemas, and these are never going to encapsulate the complexity of the real world. This relates to Greenspun's 10th rule:

  Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
Or, in a modern context: C++/C# programs managing a huge amount of configuration data with a janky JSON/XML parser, often gussied up as an "entity component system" in game development or a "DSL" in enterprise. The entirely equivalent alternative is a huge amount of (deterministic!) compile-time code generation. Any specific symbolic system small enough to be useful to humans is eventually going to go "out of sync" with the real world.

The authors hint at this with the discrepancy between SQL's type system and that of most programming languages, but that's a historical artifact. The real problem is that language designers make different tradeoffs when designing their type systems, and I believe this tradeoff is essentially fundamental. Lisp is a dynamically-typed s-expression parser, and Lisp programs benefit from being able to quickly and easily deal with an arbitrary tree of whatever objects. In C#/C++ you would either have to write some painful generics boilerplate (likely codegen in C#) or box everything as System.Object / a void pointer, losing some of the type safety that Lisp actually provides. OTOH, Idris and Lean can do heterogeneous lists and trees a little more easily, but that flexibility is paid for dearly in compilation times, and AFAICT it still demands irritating "mother may I?" boilerplate to please the typechecker. There is a fundamental tradeoff that seems innate to the idea of communicating with relatively short strings of relatively few symbols.
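
The dynamic-typing side of that tradeoff is easy to show concretely (Python standing in for Lisp here): walking an arbitrary heterogeneous tree takes a few lines, with no generics boilerplate and no explicit boxing. In C++ the same function would need `std::variant`/`std::any` or a `void*`, as the comment says.

```python
# Count the non-list leaves in a nested structure of arbitrary mixed types.
# No type declarations needed; the runtime carries the tags for us.
def count_leaves(tree):
    if isinstance(tree, list):
        return sum(count_leaves(child) for child in tree)
    return 1

config = ["name", 42, [True, ["nested", 3.14], None], {"k": "v"}]
print(count_leaves(config))  # 7
```

The cost, of course, is that nothing stops you from feeding this function a tree it wasn't written for; the error surfaces at runtime instead of compile time.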

This sounds like Gödel incompleteness, and it's a related idea, but this has more to do with cognition and linguistics. I wish I were able to write a little more coherently about this... I guess I should collect some references and put together a blog post at some point.

Toutouxc 3 hours ago|
> The problem of course is that type systems and databases are not meaningfully "domain-specific." They aren't technical magic bullets but they separately provide real value for the use cases of "web and API serving, transaction processing, background processing, analytical processing, and telemetry." So then why hasn't the industry settled on a specific type system? Why do database vendors (and the SQL standard) keep breaking the relational model in favor of something ad hoc and irritating?

I'm not sure what point you're trying to make here. The list you're referring to is definitely a bit hand-wavy, but it also makes sense to me to read it as, for example, "today's relational databases (software) are almost perfectly aligned to the domain of relational databases (concept)". As in, MariaDB running on my Mac wraps an insane amount of complexity and smarts in a very coherent system that only exposes a handful of general concepts.

The concepts don't match what I'd like to work with in my Rails app, which makes the combination of both a "fragmented system", as the article calls it, but the database itself, the columns, tables, rows and SQL above it all, that's coherent and very powerful.

LeCompteSftware 3 hours ago||
It depends on what you mean by "almost perfectly aligned to the domain of relational databases" but by my standards I can't think of a single production database where that's true, in large part because it's not true for SQL itself.

- Tables are not relations. Tables are multisets, allowing duplicate rows, whereas relations always have a de facto primary key. SQL is fundamentally a table language, not a relational language.

- NULL values are not allowed in relations, but they are in SQL. In particular, there's nothing relational about an outer join.

In both cases these are basically unscientific kludges imposed by the demands of real databases solving real problems. "NULL" marks the absence of a coherent answer to a symbolic rule, requiring ad hoc, domain-specific handling. So this isn't a pedantic point: most people wouldn't want to use a database that didn't allow duplicate rows (the SQL standard committee's example was a cash register receipt with multiple identical entries that don't need to be distinguished, just counted). Nullable operations are obviously practical even if they're obviously messy. Sometimes you just want the vague structure of a table, a theory that's entirely structural and has no semantics whatsoever. But allowing that severely complicates the nice symbolic theory of relational algebra.
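
Both kludges are easy to demonstrate with stock SQL (SQLite via Python's standard library; the receipt/discount tables are made up for the demo):

```python
# Two ways SQL departs from the relational model: duplicate rows
# (tables are multisets) and NULLs manufactured by an outer join.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# No primary key: the same receipt line can appear twice, and SQL keeps both.
cur.execute("CREATE TABLE receipt (item TEXT, price REAL)")
cur.executemany("INSERT INTO receipt VALUES (?, ?)",
                [("coffee", 3.0), ("coffee", 3.0)])
rows = cur.execute("SELECT * FROM receipt").fetchall()
print(rows)  # two identical rows; a true relation could hold only one

# The outer join invents a NULL wherever no matching tuple exists.
cur.execute("CREATE TABLE discount (item TEXT, pct REAL)")
cur.execute("INSERT INTO discount VALUES ('tea', 10.0)")
joined = cur.execute("""
    SELECT r.item, d.pct
    FROM receipt r LEFT JOIN discount d ON r.item = d.item
""").fetchall()
print(joined)  # [('coffee', None), ('coffee', None)]
```

Neither behavior has a clean home in relational algebra, yet dropping either would make the database far less useful for the receipt-style cases above.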

That's the point I'm getting at: there isn't really a "domain" limitation for relational algebra, it's more that there's a fundamental tradeoff between "formal symbolic completeness" and "practical ability to deal with real problems." Eventually when you're dealing with real problems, practicality demands kludges.

AlexRexh 5 hours ago||
Oh good
jiggawatts 5 hours ago|
> "we believe advances in programming language theory and database systems have opened a path that wasn’t available before"

Which is tantamount to waving one's hands about and saying there's "New magic!(tm)"

... while standing next to a pile of discarded old magic that didn't work out.

This blog post says nothing about what makes Cambra's approach unique and likely to succeed; it is just a list of (valid) complaints about the status quo.

I'm guessing they want to build a "cathedral" instead of the current "bazaar" of components, perhaps like Heroku or Terraform, but "better"? I wish them luck! They're going to need it...