Overall I think there is going to be a lot of "old" GPU compute hanging around, and now that writing kernels is a lot easier than it used to be, we might as well try to see what algorithms we can get working there.
I originally picked up Mojo for the SIMD, not for the GPU kernels. The SIMD usability in Mojo is outstanding.
Paper on the tool I wrote: https://doi.org/10.1093/bioadv/vbaf292
What's "alignment" in your context. In bioimaging it usually refers to aligning something to a reference atlas (like the Allen Reference Mouse Brain Atlas) or aligning two microscope channels (like the red channel and green channel)
From my experience, AI revolves largely around building up function pipelines, computing their derivatives, and passing tons of data through them, which composability and higher-order functions from functional programming make a breeze to describe (see the sketch below).
I also feel that fields other than AI are moving towards building up large functional pipelines to produce outputs, which would make Mojo suitable for those fields as well. I'm building in the CAD space, for example, and I'd love to use a "functional Mojo" language.
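As a concrete illustration of that pipeline-plus-derivative pattern, here is a minimal sketch using JAX (the model, shapes, and names are all made up for illustration):

```python
import jax
import jax.numpy as jnp

def predict(params, x):
    # one stage of the pipeline: a toy affine map plus nonlinearity
    w, b = params
    return jnp.tanh(x @ w + b)

def loss(params, x, y):
    # compose the stages into a single pipeline ending in a scalar
    return jnp.mean((predict(params, x) - y) ** 2)

grad_fn = jax.grad(loss)                   # derivative of the whole pipeline
params = (jnp.ones((3, 1)), jnp.zeros(1))
x, y = jnp.ones((8, 3)), jnp.zeros((8, 1))
grads = grad_fn(params, x, y)              # push data through, get gradients back
```

The pipeline is just composed functions, and differentiating the whole composition is a single higher-order call, which is exactly the functional-programming fit being described.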
I think that nowadays, with vibe/agentic coding, high-performance Python-like languages become ever more important. Directly using AI agents to code, say, C++ is painful, as the verbose nature of the language often causes the context window to explode.
Microsoft is invested in using AI for C++ code review, for example.
If more than a few percent of execution time is spent in Python, you are probably doing it wrong.
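Measuring is cheap enough that there's no need to guess. A minimal sketch with the standard-library profiler (`hot_loop` is a made-up stand-in for your actual workload):

```python
import cProfile
import pstats

def hot_loop():
    # stand-in for whatever dominates your runtime
    return sum(i * i for i in range(10**6))

cProfile.run("hot_loop()", "prof.out")
pstats.Stats("prof.out").sort_stats("cumulative").print_stats(5)
```

If the top entries are already NumPy/native calls, the Python layer isn't your problem.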
Personally I don't even understand why Cython is a thing; just write performance-critical functions in other languages:
<https://pypi.org/project/rustimport/>
<https://pypi.org/project/import-zig/>
Note that you can even start threads in those languages and use function calls as pseudo-RPC, all without an overly complex build system.
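For what it's worth, the workflow these import hooks advertise is tiny. A sketch of the rustimport pattern, going from its README as I remember it (the module and function names here are hypothetical, and the directive syntax may have changed):

```python
# somecode.rs (the Rust side, shown as a comment to keep this block in Python):
#
#   // rustimport:pyo3
#   use pyo3::prelude::*;
#
#   #[pyfunction]
#   fn square(x: i32) -> i32 { x * x }

import rustimport.import_hook  # installs an import hook for .rs files
import somecode                # finds somecode.rs, compiles it, imports the result

print(somecode.square(9))      # 81
```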
Also, tools like numba can beat them all with far less effort.
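A minimal sketch of that numba workflow (a toy reduction; actual speedups depend entirely on the workload):

```python
import numpy as np
from numba import njit

@njit
def dot(a, b):
    # a plain Python loop, JIT-compiled to machine code on first call
    s = 0.0
    for i in range(a.size):
        s += a[i] * b[i]
    return s

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)
print(dot(a, b))  # first call compiles; later calls run at native speed
```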
Imho, dropping into other languages should be the last resort in any project.
Already available on GCC 16.
Every program that starts with 1% Python writes more Python and gets to 20, 40, 60, and then 99% of it.
Meanwhile, Julia is more mature for the same purposes, and since last year NVIDIA has had feature parity between Python and C++ tooling on CUDA.
The Python cuTile JIT compiler allows writing CUDA kernels in straight Python.
AMD and Intel are following up with similar approaches.
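I haven't used cuTile myself, so as an existing point of reference for "CUDA kernels in straight Python": numba.cuda has supported this style for years (a toy kernel below; it needs an NVIDIA GPU and the CUDA toolkit to run):

```python
import numpy as np
from numba import cuda

@cuda.jit
def axpy(a, x, y, out):
    i = cuda.grid(1)              # absolute index of this GPU thread
    if i < out.size:
        out[i] = a * x[i] + y[i]

x = np.ones(1024, dtype=np.float32)
y = np.ones(1024, dtype=np.float32)
out = np.empty_like(x)
axpy[8, 128](np.float32(2.0), x, y, out)  # launch 8 blocks of 128 threads
```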
So will Mojo still arrive on time to gain wider adoption?
Time will tell.
I bet that’s true for a great many people. There are too many wonderful FOSS languages to bother with one you can’t fix or adapt or share.
One would want to see either a strong community build up around it, or really hard evidence of a long-term commitment to the language from Modular. And the latter will take a long time to be assured of, I think.
Also, editing tools need to catch up before very wide adoption of a language with a lot of new syntax.
- The MLIR approach, also designed by Chris Lattner while at Google, has proven quite valuable for creating Python JIT DSLs
- The Python ecosystem is now being taken seriously by the main GPU vendors, thanks to MLIR, as all their proprietary compilers are built on LLVM
- Others remember Swift for TensorFlow
For example, you need to use the Mojo MAX library to get fast, optimized LLM kernels, but at that point you can just stay with Python, which already has fast, optimized LLM kernels like FlashInfer.
Things like optimizing away object allocations, pure-function inlining, and tail-call optimization?