Posted by yakkomajuri 17 hours ago
But a part of me is reading this and thinking "friend... if PostHog was able to do what they're doing on the stack you're abandoning, do you think that stack is actually going to limit your scalability in any way that matters?" Like, you have the counterexample right there! Other companies are making the "technically worse" choice but making it work.
I love coding and I recognize that human beings are made of narratives, but this feels like 3 days you could have spent on customer needs or feature dev or marketing, and instead you rolled around in the code mud for a bit. It's fine to do that every now and then, and if this were a more radical jump (e.g. a BEAM language like Elixir or Gleam, or hell, even Golang, which has that preemptive scheduler + fast compiles/binary deploys + designed around a type system...) then I'd buy it more. And I'm not in your shoes so it's easy to armchair quarterback. But it smells a bit like getting in your head on technical narratives that are more fun to apply your creativity to, instead of the ones your company really needs.
Python didn't cause their problems, Django did. They wanted async, but chose a framework that doesn't really support it. And they weren't even running it on an async app server.
Python didn't work for them because every subsequent choice they made was wrong.
More seriously, I've worked on codebases I found OK and some I deeply disliked; I guess there's a continuum from "exciting" to "frustrating".
We have a whole posthog interface layer to mask over their constant outages and slowness. (Why don't we ditch them entirely? I, too, often ask this, but the marketing people love it)
Also, considering the project is an AI framework, do you think the language ChatGPT is built on is a worse choice than the language we use because it's in the browser?
Because language bindings aren't really what makes ChatGPT tick.
Personally I don't think there's anything wrong with scratching that itch, especially if it's going to make you/your team more comfortable long term. 3 days is probably not make-or-break.
To be honest, I never liked the way async is done in python at all.
However, I love Django and Python in general. When I need "async" in an HTTP request/response cycle, I use Celery and run the work in the background.
If the client side needs to be updated about the state of the background task, the best approach is to send the data to a websocket channel known to the client, whether it's a chat response from an LLM or importing a huge CSV file.
Simple rule for me is, "don't waste HTTP time, process quick and return quick".
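Roughly, the pattern looks like this (a sketch with made-up names like import_csv; assumes Celery and Django Channels are already set up in the project):

    # Sketch of the "process quick, return quick" pattern above. Names are hypothetical.
    from celery import shared_task
    from asgiref.sync import async_to_sync
    from channels.layers import get_channel_layer
    from django.http import JsonResponse

    @shared_task
    def import_csv(file_path: str, channel_group: str):
        rows_imported = 0  # ...do the slow work here (parse rows, write to the DB)...
        # Push the result to the websocket group the client is listening on.
        layer = get_channel_layer()
        async_to_sync(layer.group_send)(
            channel_group,
            {"type": "task.finished", "rows": rows_imported},
        )

    def start_import(request):
        # The HTTP cycle only enqueues the job and returns immediately.
        task = import_csv.delay(request.POST["file_path"], f"user-{request.user.id}")
        return JsonResponse({"task_id": task.id}, status=202)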
SSE is nice.
I use a combination of Channels and Celery for a few projects and it works great.
But I still hope that at some point they will manage to fix the DevX with Django/Python and async.
With LLMs, you can shit out working, production-ready web apps in 2 days now that are quite performant, as long as you don't care about long-term code maintainability.
The whole environment is built for async from the ground up. Thousands and thousands of hours put into creating a runtime and language specifically to make async programming feasible. The runtime handles async IO for you with preemptive scheduling. Ability to look at any runtime state on a production instance. Lovely community. More libraries than you might expect. Excellent language in Elixir.
Give it a shot.
People are reimplementing things that are first class citizens in elixir. Live content update, job runners, queues... Everything is built into the language. Sure you can do it all in typescript, but by then you'll be importing lots of libraries, reimplementing stuff with less reliability and offloading things like queues to third party solutions like pulsar or kafka.
People really should try Elixir. I think the initial investment to train your workforce pays for itself really quickly when you don't have to debug your own schedulers and integrations with third-party solutions. Plus it makes it really easy to scale after you have a working solution in Elixir.
It's interesting, for some people Elixir really clicks, others can't make heads or tails of it. I don't mind Erlang either, but I understand that that is really an acquired taste.
But your comment has convinced me to try it since I am having a bit of NextJS burnout.
What about Elixir eliminates the need for Kafka? Simple queues I understand, but Kafka?
There are probably fewer code samples, and let's be honest, this is 2025: how well do LLMs generate code for obscure languages where the training data is sparser?
I've had 3 Elixir jobs and 2 Rust jobs in the last 10 years. All were on real products, not vaporware. I learned a ton, worked with great people, and made real friends doing it.
Luck? Skill? Who knows. It's not impossible to work with the technology of your choice on problems you find interesting if you're a little intentional.
Nothing ever gets better if everybody just does what's already popular.
He spent time running benchmarks for 0-1 apps and all kinds of other metrics and found basically no appreciable difference in the speed or accuracy of AI at generating Elixir vs. Python. Maybe some difference, but honestly it just doesn't exist enough to matter.
A: Why in God's name? B: Every language, every framework, and every tech stack is 1 month to 5 years away from being legacy crap. Unless you're learning something like COBOL, it's better to be able to use a variety of languages and show that you can adapt.
Most code is boilerplate and that's where LLMs shine, I don't think this specific issue is very important.
LOL. Speaking of absolutely horrible ideas...
As an acceptor of reality, you can begin to accept that as well.
A lot of the affordances in the ecosystem have been supplanted by more modern solutions for many use cases, like Kubernetes.
Elixir also opens a number of footguns like abuse of macros; these are some of the reasons to second guess switching.
I think one of the strongest reasons for switching would be if you are willing to trade off all of this in exchange for the ability to do zero-downtime deploys, not just graceful shutdowns and rollovers: say, if you're building a realtime system with long-lived interactions, like an air traffic control system or a live conferencing system.
It can sometimes feel like an esoteric or regrettable choice for a REST API or an RPC/event-driven system. Even if you want a functional language, there may be better choices, like Kotlin.
??
Elixir is strongly but dynamically typed.
On the progress of static typing:
Any recommendations for someone looking to break into the Elixir space in a serious (job-related/production app) way?
I had to switch my project to .NET in the end because it was too hard to find/form a strong Elixir team. Still love Elixir. Indestructible, simple, and everything is easy once you wrap your head around the functional programming.
It. Just. Works.
Obviously that's not going to give you the benefit of a person who has specifically worked in the ecosystem and knows where the missing stairs are, which does definitely have its own kind of value. But overall, I think a big benefit of working in something like Elixir, Clojure, Rust, etc is that it attracts the kind of senior level people who will jump at the opportunity to work with something different.
One nice side effect of having done this is having a small rolodex of other people who are like that.
So, like, if I had a good use case for Elixir and wanted a pal to hack on that thing with, I know a handful of people who I'd call, none of whom have ever used Elixir before but I know would be excited to learn.
Conversely all the node+typescript projects, big and small, have been pretty great the last 10+ years or so. (And the C# .NET ones).
I use python for real data projects, for APIs there are about half a dozen other tech stacks I’d reach for first. I’ll die on this hill these days.
While `PydanticAI` does the best it can with a limited type system, it just can't match the productivity of TypeScript.
And I still can't believe what a mess async Python is. The worst thing we've encountered was a bug from mixing anyio with asyncio which resulted in our ECS container getting its CPU pinned to 100% [1]. And we constantly run into issues with libraries not handling task cancellation properly.
I get that python has captured the ML ecosystem, but these agent systems are just API calls and parsing json...
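To give a flavor of the cancellation problem (a toy illustration of the general pitfall, not the actual ECS bug):

    # A library coroutine that swallows CancelledError looks fine until a caller
    # actually needs to cancel it; the cancellation silently disappears.
    import asyncio

    async def bad_library_call():
        try:
            await asyncio.sleep(60)          # pretend this is slow I/O
        except asyncio.CancelledError:
            pass                             # bug: cancellation is swallowed here

    async def good_library_call():
        try:
            await asyncio.sleep(60)
        except asyncio.CancelledError:
            raise                            # clean up, then re-raise so it propagates

    async def main():
        task = asyncio.create_task(bad_library_call())
        await asyncio.sleep(0)               # let the task start and hit the sleep
        task.cancel()
        await asyncio.gather(task, return_exceptions=True)
        print(task.cancelled())              # False: the cancel request was eaten

    asyncio.run(main())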
edit: ironically I'm the author of a weird third party library trying to second guess the asyncio architecture but mine is good https://awaitlet.sqlalchemy.org/en/latest/ (but I'll likely be retiring it in the coming year due to lack of interest)
FastAPI does have a few benefits over Express: auto-enforcing JSON schemas on endpoints is huge, vs. the stupidity of having to define TS types plus a second schema that then gets turned into a JSON schema that is then attached to an endpoint. That, IMHO, is the weakest link in the TS backend ecosystem; compiler plugins to convert TS types to runtime types are really needed.
The auto-generated docs in FastAPI are also cool, along with the pages that let you test your endpoints. It's funny: Node shops set up a Postman subscription for the team and share a bunch of queries, while Python gets all that for free.
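For anyone who hasn't seen it, this is roughly all it takes (a made-up endpoint, just to illustrate); the same annotations drive the validation, the OpenAPI schema, and the /docs test page:

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class CreateUser(BaseModel):
        email: str
        age: int

    class UserOut(BaseModel):
        id: int
        email: str

    @app.post("/users", response_model=UserOut)
    def create_user(payload: CreateUser) -> UserOut:
        # The request body has already been validated against CreateUser here,
        # and the response will be validated/serialized against UserOut.
        return UserOut(id=1, email=payload.email)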
But man, TS is such a nice language, and Node literally exists to do one thing and one thing only really well: async programming.
Just define all your types as TypeBox schemas and infer the static types from those schemas. This way you write it once, it's synced, and there's no need for a compiler plugin.
https://github.com/sinclairzx81/typebox?tab=readme-ov-file#u...
The TS compiler should either have an option to pop out JSON schema from TS types or have a well defined plugin system to allow that to happen.
TS being compile-time only really limits the language. It was necessary early on to drive adoption, but nowadays it just sucks.
Very painfully.
I avoid the async libs where possible. I'm not interested in coloring my entire code-base just for convenience.
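To show what I mean by coloring (a toy example, assuming httpx as the HTTP client):

    # One async call at the bottom forces every caller above it to become async too,
    # or to bridge back to sync with asyncio.run().
    import asyncio
    import httpx  # assumed dependency, purely for illustration

    async def fetch_price(symbol: str) -> float:             # the async "color" starts here...
        async with httpx.AsyncClient() as client:
            resp = await client.get(f"https://example.com/price/{symbol}")
            return float(resp.text)

    async def portfolio_value(symbols: list[str]) -> float:  # ...so this must be async...
        prices = await asyncio.gather(*(fetch_price(s) for s in symbols))
        return sum(prices)

    def report() -> None:                                     # ...and sync callers need a bridge.
        total = asyncio.run(portfolio_value(["AAPL", "MSFT"]))
        print(f"Total: {total:.2f}")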
In my experience async is something that Node.js engineers try to develop/use when they come from Node.js, and it's not something that Python developers use at all (with the exception of Python engineers who add ASGI support to make the language enticing to Node developers).
Once you're in the situation of supporting a production system with some of the limitations mentioned, you also owe it to yourself to truly evaluate all available options. A rewrite is rarely the right solution. From an engineering standpoint, assuming you knew the requirements pretty early on, painting yourself into a bad enough corner to scrap the whole thing and pick a new language gives me significant pause for thought.
In all honesty I consider a lot of this blog post to be a real cause for concern -- the tone, the conflating arguments (if your tests were bad before, just revisit them), the premature concern around scaling. It really feels like they may have jumped to an expensive conclusion without adequate research.
In an interview, I would not advance a candidate like this. If I had a report who exhibited this kind of reasoning, I'd be drilling them on fundamentals and double-checking their work through the entire engineering process.
Moreover, having worked with Django a bit (I certainly don't have as much experience as you do), it seems to me that anything that benefits from asynchrony and is trivial in Node is indeed a pain in Django. Good observability is much harder to achieve (tools generally support Node and its asynchrony out of the box, async Python not so much). Celery is decent for long-running, background, or fire-and-forget tasks, but using it for some quick parallel work that would be a simple Promise.all() is much less performant (serialize your args, put them in Redis, wait for a worker to pick them up, etc.). And doing anything that blocks a thread for a little bit, whether in Django or Celery, is a problem, because you've got a very finite number of threads (unless you use gevent, which monkey-patches the stdlib, which is a huge smell in itself), and it's easy to run out of them... Sure, you can work around anything, but with Node you don't have to think about any of this; it just works.
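To make the contrast concrete, the fan-out that would be a one-line Promise.all() in Node ends up looking something like this with Celery (task names are made up; just a sketch):

    from celery import group, shared_task

    @shared_task
    def fetch_item_details(item_id: int) -> dict:
        return {"id": item_id}  # e.g. call an external API here

    def enrich_items(item_ids: list[int]) -> list[dict]:
        # Each argument gets serialized, pushed through the broker (e.g. Redis),
        # and picked up by whatever worker is free.
        job = group(fetch_item_details.s(item_id) for item_id in item_ids)
        # Blocking on the result from a request handler ties up one of the few
        # threads you have, which is exactly the overhead described above.
        return job.apply_async().get(timeout=10)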
When you're still small, isn't taking a week to move to Node a better choice than first evaluating a solution to each problem and then implementing those solutions, each of which can be more or less smelly (and each of which your engineers will have to learn and maintain... we use Celery for this, nginx for that, also gevent here because yada yada, etc.), which in total might take more days and put a much bigger strain on you in the long term? Whereas with Node, you spend a week and it all just works in a standard way that everyone understands. It seems to me that exploring other options first would indeed be a better choice for a bigger project, but not when the rewrite is this small.
Thank you for your answers!
There's not much software I really dislike, but Celery is one.
A nightmare within a nightmare to configure and run.
It's entirely likely that we did something wrong and misused celery. But if many people have problems with using a system correctly then it's also something worth considering.
Django is great but sometimes it seems it just tries to overdo things and make them harder
Trying to async Django is like trying to do skateboard tricks with a shopping cart. Just don't
Working with both sync Django and async FastAPI daily, it's so easy to screw up async FastAPI and bring things to a halt. If async is the huge key feature they seem to think it is for their product, then I would agree moving away from Python early, while it's still relatively easy, is the right call.
> and we had actually already written our background worker service in Node,
OK, well that's a little bizarre… why use Django to begin with if you are not going to use the huge ecosystem that comes with it? New Django has first-class support for background workers, not that Celery is difficult to get set up. It sounds like the engineering team just started building things in what they knew without any real technical planning, and the async hiccup is more or less an excuse to get things in order after the fact.
This sounds like the standard case of going with what the developers know instead of evaluating the right tool for the job.
I work on a large Django codebase at work, and this is true right up until you stray from the "Django happy path". As soon as you hit something Django doesn't support, you're back to lego-ing a solution together except you now have to do it in a framework with a lot of magic and assumptions to work around.
It's the normal problem with large and all-encompassing frameworks. They abstract around a large surface area, usually in a complex way, to allow things like a uniform API to caches even though the caches themselves support different features. That's great until it doesn't do something you need, and then you end up unwinding that complicated abstraction and it's worse than if you'd just used the native client for the cache.
I guess if you write a lot of custom code into specific hooks that Django offers or use inheritance heavily it can start to hurt. But at the end of the day, it's just python code and you don't have to use abstractions that hurt you.
Could you be more specific? Don't get me wrong, I'm well aware that npm dependency graph mgmt is a PITA, but I'm curious where you ran into a wall w/ Node.
As far as going with what you know vs choosing the best tool for the job, that can be a bit of a balancing act. I generally believe that you should go with what the team knows if it is good enough, but you need to be willing to change your mind when it is no longer good enough.
A company using 2.7 in 2022 is an indicator that the company as a whole doesn't really prioritize IT, or at least the project the OP worked on. By 2017 or so, it should have been clear that whatever dependencies they were waiting on originally were not going to receive updates to support python3 and alternative arrangements should be made.
It got this bad because the whole thing "just worked" in the background without issues. "Don't fix what isn't broken" was the business viewpoint.
"Python doesn't have native async file I/O." - like almost everybody, as "sane" file async IO on Linux is somehow new (io_uring)
Anyway ..
They claim about an 8x improvement in speed.
All-in, there's no single silver bullet to solving a given issue. Python has a lot of ecosystem around it in terms of integrations that you may or may not need that might be harder with JS. It really just depends.
Glad your migration/switch went relatively smoothly all the same.