Posted by surprisetalk 4 days ago
Highly recommend people mess around with PureScript; you can feel almost instantly how much pressure the row-polymorphism tooling relieves. Truly, all we ever wanted as an industry was an easy way to bundle various tags together into our types, and row polymorphism gets you there.
I think row polymorphism is a fairly straightforward thing compared to dependent types in general, but it lets you crush a whole class of errors while paying almost nothing in expressiveness.
The crucial point is that the structural typing on which row polymorphism is based can model such open-world situations.
Also, having such a system can free you from overly nested types.
It would be great if PureScript, or row polymorphism in general, became more popular.
It's just unfortunate that Go interfaces don't support fields, only methods. TypeScript fares better with its interface type.
Why is that unfortunate? Usually when defining interfaces you care about the API surface without wanting to care about the internals of whatever eventually implements that API. If you suddenly also spec fields in the interface, wouldn't it be too easy to couple the internals to the API?
I can't say I've programmed in Go too much, so maybe I'm missing something very obvious.
I can't remember the specifics of why fields cannot be used within a Go interface, but I do remember missing it a few times while writing Go code.
Struct field access is cheap, hopping through a dynamic dispatch table is less cheap.
> I think row polymorphism is a fairly straightforward thing compared to dependent types in general, but can let you crush a whole class of errors [...]
Would you care to provide a few examples? I don't have experience with row polymorphism so I'm genuinely curious.
module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)

greet :: forall r. { name :: String | r } -> String
greet person = "Hello, " <> person.name <> "!"

greetWithAge :: forall r. { name :: String, age :: Int | r } -> String
greetWithAge person =
  "Hello, " <> person.name <> "! You are " <> show person.age <> " years old."

main :: Effect Unit
main = do
  let person = { name: "Alice", age: 30, occupation: "Engineer" }
  -- greet can accept the person record even though it has more fields
  log (greet person) -- Output: "Hello, Alice!"
  -- greetWithAge can also accept the person record
  log (greetWithAge person) -- Output: "Hello, Alice! You are 30 years old."
In practice, row polymorphism is more granular, allowing you to explicitly require certain fields while tracking all the remaining fields via a ("rest") type variable.
Example: PureScript allows you to remove specific fields from a record type. This feature is called record subtraction, and it allows more flexibility when transforming or narrowing down records (a sketch follows below).
You can also apply exact field constraints, meaning you can constrain records to have exactly the fields you specify.
Lastly, PureScript allows you to abstract over rows using higher-kinded types. You can create polymorphic functions that accept any record with a flexible set of fields and can transform or manipulate those fields in various ways. This level of abstraction is not possible in TypeScript.
These are just a few examples. In the most general sense, you can think of row polymorphism as a really robust tool that gives you a ton of flexibility regarding strictness and validation.
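To make the first two points concrete, here is a minimal sketch (it assumes the purescript-record package; dropAge and exactGreet are made-up names):

import Prelude

import Prim.Row (class Lacks)
import Record as Record
import Type.Proxy (Proxy(..))

-- Record subtraction: delete the "age" field; the rest row `r` flows through untouched.
dropAge :: forall r. Lacks "age" r => { age :: Int | r } -> { | r }
dropAge = Record.delete (Proxy :: Proxy "age")

-- Exact field constraint: a closed row with no rest variable,
-- so only records with exactly this field are accepted.
exactGreet :: { name :: String } -> String
exactGreet p = "Hello, " <> p.name <> "!"

Here dropAge { name: "Alice", age: 30 } gives { name: "Alice" }, while exactGreet { name: "Alice", age: 30 } is rejected at compile time because the row is closed.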
TypeScript does allow you to remove specific fields, if I understand you right [0]:
function removeField<T, K extends keyof T>(obj: T, field: K): Omit<T, K> {
  const { [field]: _, ...rest } = obj;
  return rest;
}
type Person = { name: string; age: number };
declare const p: Person;
const result = removeField(p, 'age'); // result is of type: Omit<Person, "age">
> PureScript allows you to abstract over rows using higher-kinded types. You can create polymorphic functions that accept any record with a flexible set of fields and can transform or manipulate those fields in various ways. This level of abstraction is not possible in TypeScript.

Again, if I understand you correctly, then TypeScript is able to do fancy manipulations of arbitrary records [1]:
// Mapped type: every string-typed field becomes a number
type StringToNumber<T> = {
  [K in keyof T]: T[K] extends string ? number : T[K]
}

// Replace each string value with its length, leaving other values alone
function stringToLength<T extends Record<string, unknown>>(obj: T): StringToNumber<T> {
  const result: Record<string, unknown> = {};
  for (const key in obj) {
    result[key] = typeof obj[key] === 'string' ? obj[key].length : obj[key];
  }
  return result as StringToNumber<T>;
}
const data = {
name: "Alice",
age: 30,
city: "New York"
};
const lengths = stringToLength(data);
lengths.name // number
lengths.age // number
lengths.city // number
[0] https://www.typescriptlang.org/play/?#code/GYVwdgxgLglg9mABA...

[1] https://www.typescriptlang.org/play/?#code/C4TwDgpgBAysBOBLA...
edit: provided links to TS playground
const r1: { a: number; b: number } = { a: 10, b: 20 };
const r2: { a: number } = r1;
const r3: { a: number; b: string } = { b: "hello", ...r2 };
console.log(r3.b) // typescript thinks it's a string, but actually it's a number
The problem in question can be "fixed" like this
const r1: { a: number; b: number } = { a: 10, b: 20 };
const r2 = r1 satisfies { a: number };
const r3: { a: number; b: string } = { b: "hello", ...r2 };
Now, TS would warn us that "'b' is specified more than once, so this usage will be overwritten". And if we remove the b property: "Type 'number' is not assignable to type 'string'".

Another "fix" would be to avoid using the spread operator and specify every property manually.
Both of these solutions are far from ideal, I agree.
---
I'm not advocating for TS in this thread, though; I genuinely want to understand what makes row polymorphism different, and after reading several articles and harassing Claude Sonnet about it, I still haven't grasped what row polymorphism allows over what TS has.
As to the differences with TS... I think they're playing in similar spaces, but the monadic do syntax in PureScript lets you use row polymorphism for effect tracking without having to play weird API tricks. In TS that's going to be more difficult.
(Short version: in PureScript I could write an API client that tracks its state in the types, so that you can make sure you authorize before calling some mechanism. In TS you would need to design the API around that concept and do things like client = client.authorize(). In PureScript you could just do "authorize" in a monadic context and have the "context" update accordingly. A rough sketch follows below.)
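A minimal sketch of that idea, using a phantom type parameter instead of the full indexed-monad machinery (Client, authorize, and fetchSecret are all hypothetical names):

import Prelude

import Data.Maybe (Maybe(..))
import Effect (Effect)

data Unauthorized
data Authorized

-- The phantom parameter tracks the auth state purely at the type level.
newtype Client auth = Client { token :: Maybe String }

authorize :: Client Unauthorized -> Effect (Client Authorized)
authorize (Client c) = pure (Client (c { token = Just "tok" }))

fetchSecret :: Client Authorized -> Effect String
fetchSecret _ = pure "secret"

-- Calling fetchSecret on a Client Unauthorized is a compile-time error.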
I implemented the system described above as a toy type checker. I found that combination of features too complicated, and it ends up unintuitive for the user: the type errors are difficult to comprehend when they occur. My implementation is here: https://gist.github.com/kccqzy/d761b8adc840333af0303e1b822d7... I mostly followed the paper, but I cannot guarantee there aren't bugs.
Not sure if I misunderstand what you mean, but this does not require subtyping. One of the key distinguishing features of row polymorphism is that exactly this can be achieved without subtyping. The extra unused fields (`y` in your example) are represented as a polymorphic type variable _instead_ of using subtyping. See for instance page 7 in these slides: https://www.cs.cmu.edu/~aldrich/courses/819/slides/rows.pdf
The main difficulty I see with row polymorphism is field shadowing. For example, if you have a record of type {a=bool, x=int, c=unit} and then set the x field to a string instead, the new type should be {a=bool, x=string, c=unit}.
I suppose if you only have syntax for creating a record with a literal, but do not have syntax for updating an existing record this is not a problem.
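For what it's worth, a type-changing field update can also be provided as a library function rather than as syntax; here is a sketch using PureScript's purescript-record package (retype is a made-up name):

import Record as Record
import Type.Proxy (Proxy(..))

-- Record.set is allowed to change the type of the field it updates.
retype :: { a :: Boolean, x :: Int, c :: Unit } -> { a :: Boolean, x :: String, c :: Unit }
retype = Record.set (Proxy :: Proxy "x") "hello"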
Can you explain that a little more? Intuitively I would imagine that those y and z fields would 'disappear' into the rest part.
Without subtyping, the rest part needs to be identical for each element of the list. In fact you cannot even express the concept of a list with different rest parts. The key thing to understand is that the rest part never really disappears. The type checker always deduces what the rest part should be in every case. In languages like Haskell you can work around this by using existential quantification but that's a whole different extension to the type system, and one that's certainly not as flexible as full subtyping.
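Concretely, in PureScript (greetAll is just an illustrative name):

import Prelude

greetAll :: forall r. Array { name :: String | r } -> Array String
greetAll = map (\p -> "Hello, " <> p.name <> "!")

-- OK: every element instantiates the rest row to the same thing (age :: Int).
ok :: Array String
ok = greetAll [ { name: "Alice", age: 30 }, { name: "Bob", age: 40 } ]

-- Rejected: the rest rows (age :: Int) and (city :: String) cannot unify.
-- bad = greetAll [ { name: "Alice", age: 30 }, { name: "Bob", city: "NYC" } ]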
Yes you can: that's just an existential type. I'm not sure what the syntax would be, but it could be something like:
List (exists a : {x=int, ...'a})
(In practice, i.e. if your language doesn't support existential types, you might need to jump through hoops like:

List ((forall a : {x=int, ...'a} -> b) -> b)

or whatever the language-appropriate equivalent is; but in that case your list will have been created with the same hoops, so it's a minor annoyance rather than a serious problem. A PureScript sketch of this encoding follows below.)

If you were to go the other direction and choose only subtyping but not row polymorphism to implement records, then you end up co-opting things like intersection types inappropriately, leading to unsoundness.
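For the curious, that CPS encoding is directly expressible in PureScript, which has rank-n types (SomeX, mkSomeX, getX are made-up names):

-- CPS encoding of "exists r. { x :: Int | r }"
newtype SomeX = SomeX (forall b. (forall r. { x :: Int | r } -> b) -> b)

mkSomeX :: forall r. { x :: Int | r } -> SomeX
mkSomeX rec = SomeX (\k -> k rec)

getX :: SomeX -> Int
getX (SomeX run) = run (\rec -> rec.x)

-- Records with different rest rows can now share one list:
mixed :: Array SomeX
mixed = [ mkSomeX { x: 1, y: "a" }, mkSomeX { x: 2, z: true } ]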
If there is one thing that is for sure, it is that we have too many names for "collection of name to value mappings".
In my book the term record is not what first comes to mind when thinking of unordered mappings. All of the usages of the word I can think of imply the possibility of access by name while retaining order. Sometimes this allows easy indexed access (database rows used to be called records) sometimes it doesn't (C structs which also used to be called record types).
Also in C++:
#include <concepts>

struct A { int x; int y; } a;
struct B { int y; int x; } b;

template<class C> concept has_x_y = requires(C c) {
    { c.x } -> std::convertible_to<int>;
    { c.y } -> std::convertible_to<int>;
};

int sum(has_x_y auto z) { return z.x + z.y; }

...

sum(a);
sum(b);
The two structs have the same field names at different positions within the struct, yet both still conform to the generic record type.
Access by name is not even truly necessary, and the difference between tuples and records is minimal: you could build something that looks exactly like field access on top of tuples with plain functions, and the result would be indistinguishable from an actual record (a sketch follows below).
Whether an order exists is then entirely incidental, and it's generally straightforward to build one for both, provided the contained data types are orderable, by ordering the fields and then using a lexicographic order.
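For instance, in PureScript (Person, name, and age are made-up names):

import Data.Tuple (Tuple(..))

-- A "record" built on a tuple: accessors are just functions.
type Person = Tuple String Int -- name first, then age, by convention

name :: Person -> String
name (Tuple n _) = n

age :: Person -> Int
age (Tuple _ a) = a

A call such as name p then plays exactly the role of field access.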
The iteration order seems arbitrary to a human, but that is exclusively because function(name, insertion_id) is not optimized for a human. It seems strange to call the collection “unordered” because of how it appears to a human.
I wonder if there's a way to efficiently implement it without resorting to monomorphization?
A function that's polymorphic can be transformed into a more primitive (say C or assembly) function that gets extra arguments that carry the "shape" of the type variable (think of sizes, pointers vs. values, etc.). Is there a similar strategy for these polymorphic records?
I see two issues:
1. The offset of any particular field in the record is unknown at compile time
2. The size of the record itself is unknown at compile time (but this should be trivial as an extra argument.)
For instance, if the "prototype" of the argument is {int foo, float bar}, and I supply {int foo, int baz, float bar}, the table will be {foo: base+0 bytes, bar: base+8 bytes}.
Depends on what you consider "efficient".
Monomorphization is necessary for the most efficient code. But you can have a vtable or a restricted type lookup.
That is not always true. Monomorphisation also leads to code size increase, because the function is compiled for each type. This may decrease cache efficiency.
But not always, and won't always increase your performance either.
[0] https://osa1.net/posts/2023-01-23-fast-polymorphic-record-ac...
void f({float x, float y} p);

becomes

void f(void* p, size_t offsets[2]); /* offsets hold the byte positions of x and y within p */
But some Portuguese readers might remember "Samad", a mock person used in some of the examples, which of course is "Damas" spelled backwards.
Relevant reddit comment: https://www.reddit.com/r/devpt/comments/qujip3/comment/hlmb9...