I mean, I'm not sure whether LLVM parses the assembly (I strongly suspect it does; I remember GCC inline assembly allowing stuff like referencing variables in asm). If it does, shouldn't LLVM figure out that the asm modifies things it's not supposed to?
If your asm clobbers a register the compiler has stored something in, your code certainly won't work right.
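A minimal sketch of what declaring a clobber looks like, assuming GCC/Clang extended asm on x86-64 with the default AT&T syntax. The helper name `times_three` is made up for illustration, and other targets fall back to plain C++:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes x * 3 using rcx as a fixed scratch register.
std::uint64_t times_three(std::uint64_t x) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    std::uint64_t out;
    asm("mov %1, %%rcx\n\t"          // copy the input into the scratch register
        "lea (%%rcx,%%rcx,2), %0"    // out = rcx + rcx * 2
        : "=r"(out)                  // output operand
        : "r"(x)                     // input operand
        : "rcx");                    // clobber list: rcx is overwritten
#else
    std::uint64_t out = x * 3;       // portable fallback for other targets
#endif
    assert(out == x * 3);            // invariant check after the asm block
    return out;
}
```

Without the `"rcx"` entry the compiler is free to keep a live value in rcx across the block, which is the silent breakage described above.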
I mean one that infers as much context as possible and tries to help as much as possible.
This has to be assembler specific of course. For example, I use fasm which has higher level macros. An LSP could suggest struct fields and other stuff.
Inline asm should take 10x or more effort compared to writing the surrounding C++ code and should be tested with protected pages at the edges if possible. It should always have assertions before/after that check invariants too.
Also, there are a lot of cases where this won’t work. One example is implementing strlen using AVX-512, where you want to align the address down to a multiple of 64 and run until the end of the page, so you can do SIMD while avoiding a segfault.
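To make the alignment trick concrete, here is a rough sketch of just the address arithmetic, with no actual AVX-512 loads. The 4096-byte page size and all the names (`kVec`, `kPage`, `align_down`, and so on) are assumptions for illustration:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kVec = 64;     // bytes in one 512-bit vector
constexpr std::uintptr_t kPage = 4096;  // assumed page size
static_assert(kPage % kVec == 0, "aligned blocks must tile a page");

// Round an address down to the previous multiple of 64.
inline std::uintptr_t align_down(std::uintptr_t addr) {
    return addr & ~(kVec - 1);
}

// True if the 64 bytes starting at `start` all lie in one page.
inline bool block_within_one_page(std::uintptr_t start) {
    return start / kPage == (start + kVec - 1) / kPage;
}

// Bytes in the aligned block that come before `addr`; a real strlen
// would have to mask these out of its first comparison.
inline unsigned leading_bytes(std::uintptr_t addr) {
    std::uintptr_t n = addr - align_down(addr);
    assert(n < kVec);  // invariant: the offset always fits in one block
    return static_cast<unsigned>(n);
}
```

An unaligned 64-byte load starting at offset 4095 of a page would touch the next page, while the aligned block containing that byte does not. The aligned load still reads bytes outside the string itself, which is why a strict bounds checker would object to it.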
Another example is just handling loop remainders with masking in AVX-512.
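For the remainder case, the core of it is building a lane mask for the final partial vector. A small sketch, assuming byte-sized lanes (so a 64-bit mask) and a hypothetical helper name `tail_mask`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mask with the low `remaining` bits set, one bit per byte lane.
// remaining == 64 needs its own branch because shifting a 64-bit
// value by 64 is undefined behavior in C++.
inline std::uint64_t tail_mask(std::size_t remaining) {
    assert(remaining <= 64);  // precondition: at most one full vector
    return remaining == 64 ? ~std::uint64_t{0}
                           : (std::uint64_t{1} << remaining) - 1;
}
```

With AVX512BW, a mask like this would typically go to a masked load such as `_mm512_maskz_loadu_epi8`, so the disabled lanes past the end of the buffer are never read.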
Also, it is pretty naive to think an LLM got this right.
Overall it seems like a huge waste of time.
If you are writing inline asm and want to make it better, just get as many LLMs or, even better, humans to review it. LLMs are really good at finding mistakes in inline asm, with a high false positive rate though, so you have to understand the concept.
For example, one bug I had was about not consuming the inputs before writing to the outputs. The compiler can assign the same register to an input and an output unless the output is marked with & (the early-clobber modifier, as in "=&r"). It was super frustrating to debug this until I asked an LLM and it found the problem.
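A minimal sketch of that input/output overlap situation, assuming GCC/Clang extended asm on x86-64 with AT&T syntax. The function `add_two` is invented for illustration and is not the code from the bug above:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns a + b, writing the output before the
// second input has been read.
std::uint64_t add_two(std::uint64_t a, std::uint64_t b) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    std::uint64_t out;
    asm("mov %1, %0\n\t"   // out is written here...
        "add %2, %0"       // ...but input %2 is only read here
        : "=&r"(out)       // '&': out must not share a register with any input
        : "r"(a), "r"(b)
        : "cc");           // add modifies the flags
#else
    std::uint64_t out = a + b;  // portable fallback for other targets
#endif
    assert(out == a + b);       // invariant check after the asm block
    return out;
}
```

With plain `"=r"` the compiler may put `out` and `b` in the same register, in which case the `mov` destroys `b` and the result becomes `a + a`. Whether that happens depends on register allocation, so the bug can come and go with optimization level.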
I don't know how the author's proprietary LLM swarm handled the job but his stated approach sounds reasonable to me.
This doesn't sound right to me, and I've written a decent amount of inline assembly in C-like C++ code.
Are you saying this because you had unexpected memory safety bugs in inline asm?
Rust's unsafes are likely safe.
Assembly snippets in the Linux kernel are likely safe.
These statements have no bearing on whether the present asm block being compiled right now is actually for-a-fact safe.
When none of the instructions perform a memory access, that is a guarantee.
As a diagnostics tool, Fil-C finds issues that are rarely present in any code I work on. A large subset of the issues are C++-adjacent.
I still believe its ideas, if applied correctly, can secure systems where someone thought an object system hacked in a weekend in C belongs in the tool "all LLMs depend on" or whatever.
BTW I do hand-write general-purpose assembly that is not a straightforward intrinsic-equivalent and early drafts are full of all sorts of memory and register bank safety bugs.