Posted by dima-quant 2 days ago
Because nimic code is just standard Python with type hints and ctypes shims, it is a fully valid CPython script, so you can use the Python REPL during development, drop a breakpoint in the middle of a heavy algorithmic loop and inspect the variables natively.
Zero Lock-In: You don't need a special runtime engine. If the Nim compiler is not available, your script still runs (albeit slower than standard Python due to some emulation overhead) on any machine with Python installed.
Seamless Distribution: You can use this to develop high-performance logic natively in Python, debug it with Python tooling, and then compile to a native executable or C-extension via Nim.
Why Nim? Its syntax maps well to Python, it is fairly clear how to emulate its constructs in Python, and its performance is comparable to C (as it compiles to C). A port of the "trace-of-radiance" Nim project to nimic can be found in "ndsl_raytracer" in my GitHub repo (dima-quant). With the compiled executable, the render time for a single 512x288 scene dropped from many hours in Python to just 10 minutes on a single M1 CPU core. The repo also includes a PPM-to-MP4 converter written in nimic.
Similar projects:
- Pyccel (https://github.com/pyccel/pyccel): Python extension language using accelerators.
- SPy (https://github.com/spylang/spy): a variant of Python specifically designed to be statically compilable while retaining a lot of the "useful" dynamic parts of Python.
- Codon (https://github.com/exaloop/codon): a high-performance Python implementation that compiles to native machine code without any runtime overhead.
It is still a work in progress, e.g. there is no JIT or multiprocessing support yet, and I'm not sure what functionality would be best to implement next. Any suggestions?
The transpile-to-a-language-that-transpiles-to-C approach is unusual. Any downsides to that other than slower compilation?
No, the problems are a) you get the same exact AI "voice" that is tedious to keep reading, b) it's verbose and focuses on the wrong things (the whole Module Architecture section doesn't belong there), and c) it's a sign that it's slop and not well tested.
> The benchmarks are mostly on runtime performance I assume?
You tell me! You (or Claude) make performance claims - "aiming to get C-level performance without leaving Python". Does it actually get anywhere near that claim?
I've spent a fair bit of time generating specialized straight-line code for hot Python paths, killing the per-call attribute and dict lookups the interpreter does. The lesson was that dispatch-bound code claws back most of its overhead without ever leaving CPython. Where AOT-to-native actually pulls ahead is numeric and loop-bound work, where the interpreter loop and boxing dominate. Your 512x288 render is exactly that case, which is why it looks so strong.
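To make the dispatch-bound case concrete, here is a minimal hand-written sketch (not the generator described above, and not nimic code). The names `Vec`, `norms_naive` and `norms_hoisted` are made up for illustration. The second function hoists the attribute and global lookups out of the loop, which is the kind of overhead that can be recovered without leaving CPython:

```python
import math


class Vec:
    """Hypothetical 2D vector, used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def norms_naive(vecs):
    # Every iteration looks up math.sqrt, out.append, and v.x / v.y twice.
    out = []
    for v in vecs:
        out.append(math.sqrt(v.x * v.x + v.y * v.y))
    return out


def norms_hoisted(vecs):
    # Same result, but the repeated lookups are bound to locals once.
    sqrt = math.sqrt
    out = []
    append = out.append
    for v in vecs:
        x = v.x
        y = v.y
        append(sqrt(x * x + y * y))
    return out
```

Both versions still pay for the interpreter loop and for boxing every float, which is the part that an AOT path to native code can remove.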
So the benchmark I'd want isn't render time, it's what fraction of a real module transpiles with no rewrites. That number tells me whether this is a systems language or a fast path I have to hand-shape around. Codon and Shedskin both hit that wall. Curious where Nimic draws the line.
Indeed, AOT compilation leads to great speed-ups for heavy custom numeric calculations that cannot be easily vectorized in numpy, such as the raytracing logic. In cases where most of the calculations are performed in an external module (written in e.g. C, Rust or Cython), the performance gain might be much smaller, but the added value here is that the high-performance module itself can be written in nimic, keeping the whole codebase in Python.
When starting with "pythonic" code, rewrites in nimic can be substantial, but so would be a rewrite in a systems language like C or Rust. Besides letting you optimise the fast path, nimic provides low-level functionality, such as pointers to pointers and bitwise operations, that actually executes within CPython, for example in the mp4 muxer implementation in dima-quant/ndsl_raytracer/src/nraytracer/minimp4.py
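For readers who have not seen this kind of low-level code run under CPython, here is a small sketch using only the standard `ctypes` module. It is not taken from nimic or from minimp4.py; the helper `be32` is a made-up name, and the only link to the muxer is that MP4 boxes store 32-bit fields in big-endian order:

```python
import ctypes

# A pointer to a pointer, built with plain ctypes.
value = ctypes.c_uint32(0x12345678)
p = ctypes.pointer(value)   # pointer to value
pp = ctypes.pointer(p)      # pointer to that pointer

# Writing through both levels of indirection changes the original object.
pp.contents.contents.value = 0xDEADBEEF


def be32(n: int) -> bytes:
    """Pack an unsigned 32-bit integer as big-endian bytes with shifts and masks."""
    return bytes((n >> shift) & 0xFF for shift in (24, 16, 8, 0))
```

After the write, `value.value` is `0xDEADBEEF`, and `be32(0x66747970)` yields the four bytes `b"ftyp"`.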
one thing i'd worry about with "runs unmodified in cpython" though: python ints are arbitrary precision and nim's aren't. so the same code can give you a bignum under cpython and a wraparound under the compiled path. how do you handle that, or is matching cpython semantics explicitly a non-goal for the typed subset?
It looks like (please correct me, OP, if I'm wrong) it works the other way around: you use sized int types in Nimic code, and their semantics are emulated on the Python side. See here: https://github.com/dima-quant/nimic/blob/main/src/nimic/ntyp...
So I'd say in Nimic Python you get Nim-style integer emulation when run from Python, keeping both paths consistent with each other - but breaking consistency with the rest of Python. Which is OK, I think, given it's explicitly a subset(s) of the language(s). It would be possible to make `int` transpile to some BigInt Nim implementation, but you'd need an external dependency for this, as BigInts are not in Nim's stdlib. However, in a speed-focused context, I'm not sure that defaulting to BigInt every time the compiler sees an `: int` annotation would work well. It's a hard decision to make. Curious what the OP's opinion is here?
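The mismatch being discussed is easy to reproduce in plain Python. The sketch below is not nimic's implementation (its real sized types live in the linked file); `wrap_u32` and `wrap_i32` are hypothetical helpers that show what emulating fixed-width, two's-complement integers on the Python side can look like. Whether the compiled Nim path wraps or raises on overflow depends on the integer type and the overflow-check settings, so treat the wraparound here as one possible target behaviour:

```python
import ctypes


def wrap_u32(n: int) -> int:
    """Reduce an arbitrary-precision Python int to an unsigned 32-bit value."""
    return n & 0xFFFFFFFF


def wrap_i32(n: int) -> int:
    """Reduce a Python int to a signed 32-bit two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


i32_max = 2**31 - 1

plain = i32_max + 1                           # Python int: simply keeps growing
emulated = wrap_i32(i32_max + 1)              # wraps to the most negative int32
via_ctypes = ctypes.c_int32(i32_max + 1).value  # ctypes does no overflow checking
```

Here `plain` is 2147483648, while `emulated` and `via_ctypes` are both -2147483648, which is the divergence raised in the question above.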