Demystifying the (Shebang): Kernel Adventures

Posted by thunderbong 4/10/2025

Demystifying the (Shebang): Kernel Adventures(crocidb.com)

189 points | 36 comments

kazinator 4/10/2025|

Fun fact: you can stick a null byte into the shebang line to terminate it, as an alternative to the newline.

It's possible to have a scripting language support extra command line arguments after the null byte, which is less disruptive to the syntax than recognizing arguments from a second line.

I.e.

  #!/path/to/interpreter --arg<NUL>--more --args<LF>

  #!/usr/bin/env interpreter<NUL>--all --args<LF>

On some OSes, you only get one arg: everything after the space, to the end of the line, is one argument.

When we stick a <NUL> there, that argument stops there; but our interpreter can read the whole line, including the <NUL>, up to the <LF> and then extract additional arguments between <NUL> and <LF>.

https://www.nongnu.org/txr/txr-manpage.html#N-74C247FD

The interpreter could get the arguments in other ways, like from a second line after the hash bang line. But with the null hack, all the processing revolves around just the one hash bang line. You can retrofit this logic into an interpreter that already knows how to ignore the hash bang line, without doing any work beyond getting it to load the line properly with the embedded NUL, and extract the arguments. You don't have to alter the syntax to specially recognize a hash bang continuation line.
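A minimal sketch of the interpreter-side half of this trick, in Python. The function name and the whitespace-based splitting of the extra arguments are assumptions made for illustration; TXR's actual rules are in the manual linked above.

```python
def split_hashbang_null(first_line: bytes):
    """Split a hash bang line at the first NUL.

    Returns (kernel_part, extra_args): the bytes in front of the NUL, which
    is all the kernel-supplied argument will contain, and the extra
    arguments found between the NUL and the newline.
    """
    line = first_line.rstrip(b"\r\n")          # drop the <LF> (and a stray <CR>)
    kernel_part, nul, rest = line.partition(b"\0")
    # Assumption: extra arguments are separated by plain whitespace.
    extra_args = rest.decode().split() if nul else []
    return kernel_part, extra_args


kernel_part, extra = split_hashbang_null(
    b"#!/path/to/interpreter --arg\0--more --args\n"
)
print(kernel_part)  # b'#!/path/to/interpreter --arg'
print(extra)        # ['--more', '--args']
```

An interpreter that already skips the hash bang line would only need to read that line as raw bytes and run something like this on it before parsing the rest of the script.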

ElectricalUnion 4/11/2025||

Yeah, OpenBSD says no to that.

> If during parsing lines in the script, ksh finds a NUL byte on the line, it should abort ("syntax error: NUL byte unexpected").

https://www.undeadly.org/cgi?action=article;sid=202409241057...

hnlmorg 4/11/2025|||

I think the GP was talking about kernel parsing and how to stuff additional parameters into a hypothetical new scripting language. Whereas you’re talking about ksh specifically, which has its own specific parsing rules.

kazinator 4/11/2025|||

OpenBSD is one of the platforms I tested the Hash Bang Null Hack on.

ksh is being successfully run, and is able to read a line of the script and find the null byte.

However, ksh rejecting it means we couldn't use the trick with ksh, for example to get it to ignore a <CR> in the hash bang line of a script that has <CR><LF> line endings. (Something discussed in a sibling subthread, in regard to a Perl script that failed due to a trailing <CR> in the hash bang line.)

CalChris 4/10/2025|||

Less fun fact: you can't substitute a <cr><nl> for <nl>.

I had a Perl script (way) back in the day that came from a Windows system and it wouldn't work on Linux. After I figured out <cr><nl> was causing the problem, I figured out what binfmt_script (might have been in binfmt_misc) was doing wrong. binfmt_script sees "/bin/perl<cr>" and then fails to find that interpreter.

So I proposed a one-line change which fixed the glitch and posted it to LKML … and promptly got yelled at by Alan Cox for breaking compatibility. I dunno if the null byte breaks the same compatibility. Chapter and verse weren't cited.

kazinator 4/10/2025||

Null de facto works, and it's almost certainly a consequence of the kernel treating the result of extracting the argument as a C string. For instance, it might actually be scanning past the NUL and earnestly finding the newline. Even if that entire datum is copied into the argument vector and passed to the interpreter, the interpreter will only see the argument up to the null terminator, due to it being a C string.

About the only way it could break would be if the kernel used a string function to look for the newline, like a range-limited form of strchr, and then aborted the hash bang dispatch with an error upon not finding the newline, rather than accepting that the argument is delimited by a null.

I tested it on various platforms like MacOS, Solaris, some BSDs, Cygwin, Linux. Far from exhaustive but a good coverage of the modern desktop and server landscape.

The null byte would have fixed your Perl script without having to convert the line endings; the argument would have been delimited, in spite of the line ending in <CR><LF>.
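A toy Python model of the behaviour described in this subthread. It is a simplification and not the kernel's code: it assumes the line ends at the first <LF>, that only spaces and tabs are trimmed, that everything after the interpreter is one argument, and that both values end up as C strings (cut at the first NUL). The function names are made up for the example.

```python
def c_string(data: bytes) -> bytes:
    """Everything up to (not including) the first NUL, as C code would see it."""
    return data.split(b"\0", 1)[0]


def parse_hashbang(header: bytes):
    """Return (interpreter, arg) for a '#!' header, or None if it has none."""
    if not header.startswith(b"#!"):
        return None
    line = header[2:].split(b"\n", 1)[0]   # the line ends at the first <LF>
    line = line.strip(b" \t")              # only spaces and tabs are trimmed
    i = 0
    while i < len(line) and line[i] not in b" \t":
        i += 1                             # interpreter runs to the first blank
    interpreter = c_string(line[:i])
    arg = c_string(line[i:].strip(b" \t"))
    return interpreter, (arg or None)


# <CR><LF> line ending: the <CR> becomes part of the interpreter path.
print(parse_hashbang(b"#!/bin/perl\r\n"))      # (b'/bin/perl\r', None)
# NUL before the <CR>: the C string stops early and the path is clean.
print(parse_hashbang(b"#!/bin/perl\0\r\n"))    # (b'/bin/perl', None)
# The env form from the top of the thread: one argument, cut at the NUL.
print(parse_hashbang(b"#!/usr/bin/env interpreter\0--all --args\n"))
# (b'/usr/bin/env', b'interpreter')
```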

cryptonector 4/11/2025||

Shit, how did I not know this. I love it! Thanks!

spudlyo 4/10/2025||

If you found this article interesting, you might also enjoy "My Own Private Binary: An Idiosyncratic Introduction to Linux Kernel Modules"[0] and the previous discussion[1] of it on HN.

[0]: https://www.muppetlabs.com/~breadbox/txt/mopb.html

[1]: https://news.ycombinator.com/item?id=29291804

Imustaskforhelp 4/10/2025||

Read your article, it's really nice. I really feel much less demystified by this.

But can you / somebody please explain what this means

> According to the official Kernel Admin Guide:

> This Kernel feature allows you to invoke almost (for restrictions see below) every program by simply typing its name in the shell. This includes for example compiled Java(TM), Python or Emacs programs. To achieve this you must tell binfmt_misc which interpreter has to be invoked with which binary. Binfmt_misc recognises the binary-type by matching some bytes at the beginning of the file with a magic byte sequence (masking out specified bits) you have supplied. Binfmt_misc can also recognise a filename extension aka .com or .exe.

> It’s another way to tell the Kernel what interpreter to run when invoking a program that’s not native (ELF). For scripts (text files) we mostly use a shebang, but for byte-coded binaries, such as Java’s JAR or Mono EXE files, it’s the way to go!

Like, can you give me an example of what you mean? What are its use cases, if any? I read it many times, always with some enthusiasm, because the sentence ending in an exclamation point makes it feel like it's huge, yet I just can't understand its significance.

Does it mean we can have .jar files which can then run shebang-like, so we don't need #!? Can this also be used for main.go or any other language which has some issues with #!?

I see there being some interpreter for golang, rust etc. which just compiles it but it was just too complex. I am just imagining something like a simple go file which is valid golang but can be run by linux simply by ./ And it autocompiles it...

ckatri 4/10/2025||

The best and most common uses for this are Wine and qemu-static.

For example, the following (which I grabbed from Wikipedia) `:DOSWin:M::MZ::/usr/bin/wine:` will register `/usr/bin/wine` to run as the wrapper for any .exe that gets executed, with no extra config needed. It simply sees that you tried to run a PE file and will run it in wine.
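A small Python sketch of how such a rule is put together, assuming the field order `:name:type:offset:magic:mask:interpreter:flags` and the usual `/proc/sys/fs/binfmt_misc/register` location. The helper names and the `/usr/local/bin/run-hs` wrapper path are hypothetical, and `register()` is only defined here, not called, since it needs root.

```python
REGISTER = "/proc/sys/fs/binfmt_misc/register"


def binfmt_rule(name, kind, magic, interpreter, offset="", mask="", flags=""):
    """Build a binfmt_misc rule string.

    kind is 'M' to match magic bytes at an offset, or 'E' to match a
    filename extension (given in `magic`, without the dot).
    """
    if kind not in ("M", "E"):
        raise ValueError("type must be 'M' (magic) or 'E' (extension)")
    return ":".join(["", name, kind, offset, magic, mask, interpreter, flags])


def register(rule: str) -> None:
    """Write the rule to the kernel. Needs root and a mounted binfmt_misc."""
    with open(REGISTER, "w") as f:
        f.write(rule)


# The Wine rule quoted above: files starting with "MZ" go to /usr/bin/wine.
print(binfmt_rule("DOSWin", "M", "MZ", "/usr/bin/wine"))
# :DOSWin:M::MZ::/usr/bin/wine:

# An extension-based rule; the wrapper path is a made-up example.
print(binfmt_rule("hs", "E", "hs", "/usr/local/bin/run-hs"))
# :hs:E::hs::/usr/local/bin/run-hs:
```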

Imustaskforhelp 4/11/2025||

okay so now you can have some command you run which does some things with procfs or whatnot, and then you can simply run any jar or exe file as it is....

That is really nice actually!

this could theoretically be used as a scriptisto alternative but it requires running some command first. I was hoping there was a way to use something other than shebang

pkaye 4/10/2025|||

Yes, you can use binfmt_misc to allow arbitrary executable file formats to be passed to an interpreter, matched either by a filename extension or by a magic number at a specific offset within the executable.

https://en.wikipedia.org/wiki/Binfmt_misc

kreetx 4/11/2025||

Yup, this way you can skip the shebang line for your scripts in some arbitrary language.

I have it set up for Haskell (it's somewhat hackish): there is a preconfigured Haskell project in a certain location with desired dependencies, imports, etc., so when executing a .hs file this file is copied into that as Main and run. A similar setup will work for any language.

Edit: `cabal` and `stack` both have script commands now, so these would be an alternative to the above, downside being that every such script would need the shebang intro with dependencies, etc.

chrisweekly 4/11/2025|||

> "I really feel much less demystified by this."

Respectful correction: you feel less mystified, i.e. it has been demystified for you.

PS That nit aside, great question. Sorry I can't provide illumination.

brookst 4/11/2025|||

I also can’t answer, but I will adopt “thanks, you’ve made me less demystified” to use alongside “I’ll give your suggestion all the attention it merits.”

Imustaskforhelp 4/11/2025|||

Um it was 2 in the morning when I had written that comment.

Yeah, I should've said less mystified. Gotta remember it in the future!

Thanks!

ElectricalUnion 4/11/2025||

> byte-coded binaries, such as Java’s JAR

Wait, aren't JARs ZIPs (so they have the headers appended on the end of the file)? How does prefix matching help that?
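A quick Python check of what the front of a ZIP archive looks like. The central directory does sit at the end, but each member is preceded by a local file header, so an ordinary archive written by `zipfile` starts with the bytes `PK\x03\x04`. The manifest content below is just filler for the example.

```python
import io
import zipfile

# Build a tiny JAR-like archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")

data = buf.getvalue()
first4 = data[:4]                         # local file header signature
eocd_offset = data.rfind(b"PK\x05\x06")   # end-of-central-directory record

print(first4)            # b'PK\x03\x04'
print(eocd_offset > 0)   # True: the directory trailer sits near the end
```

So a prefix match can fire on a JAR, but it cannot tell a JAR from any other ZIP, and an archive with data prepended in front of it would not start with `PK` at all. Matching on the `.jar` extension avoids both problems.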

AdieuToLogic 4/11/2025||

One of my favourite old-school Perl magic spells used to portably handle broken shells is:

  #!/usr/bin/perl
  eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if 0;

See: https://www.perl.com/article/bang-bang/

lelandfe 4/10/2025||

> Since I never remember which one is which, a good way to check is using the utility `file`: `file $(which useradd)`

While we're here, can someone explain why `which` prints some locations, and for others the whole darn file? Like `which npm` prints the location; `which nvm` prints the whole darn file.

awbraunstein 4/10/2025||

`nvm` isn't a file, it is a bash function defined in some file (likely ~/.nvm/nvm.sh). So when you say `which nvm` it prints out the definition of the `nvm` function. This was set up when you added something like:

    export NVM_DIR="$HOME/.nvm"
    [ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  # This loads nvm
    [ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"  # This loads nvm bash_completion

to your bashrc.

YardenZamir 4/10/2025|||

I can't say the reason, but I can note the pattern. If it's something in your path, like a program or a script, `which` will show you where it is. If it's a sourced shell function, you will see the whole thing.

If you write a function in your current session, for example, `which` will show you the content of that command. If you write that command in a file and put that file in your path, `which` will show you where it is.

adrianmonk 4/10/2025|||

For this situation, in bash, use 'type nvm' (instead of 'which nvm' or 'file nvm'). It will tell you what 'nvm' is (executable, shell alias, shell function, shell builtin, etc.), which will probably solve the mystery.
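A small Python sketch that asks bash how it classifies a name, using `type -t` (the terse form of `type`). It assumes `bash` is on the `PATH` and returns `None` otherwise; the `nvm` function defined here is a stand-in stub, not the real nvm.

```python
import shutil
import subprocess


def bash_type(name, setup=""):
    """Return bash's `type -t` answer for `name`, or None if unavailable.

    `setup` is shell code run first, e.g. to define a function.
    """
    bash = shutil.which("bash")
    if bash is None:
        return None
    result = subprocess.run(
        [bash, "-c", f"{setup}\ntype -t {name}"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() or None


# A shell function, like the one nvm.sh defines (stubbed out here).
print(bash_type("nvm", setup="nvm() { echo stub; }"))  # function
# A shell builtin.
print(bash_type("cd"))                                 # builtin
```

For a program found on the `PATH`, `type -t` answers `file` instead, which matches the `which npm` case above.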

AlienRobot 4/10/2025|||

That sounds odd. Try using command -v instead?

cstrahan 4/10/2025||

Are you sure that what is being printed is the contents of a file? And which shell are you using?

If your which command is a shell builtin, and nvm is a function, then you’re likely seeing the content of that function.

davis 4/10/2025||

Articles like this are just such a delight. History + common software + code snippets is a great combo

fsckboy 4/11/2025|

there's some history missing. when i learned unix, any text file with the x bit set was run as a /bin/sh script, unless you put a comment as the first line and then it was considered /bin/csh

amelius 4/10/2025||

How do I fix my kernel so that I can use the setuid bit with shebang?

o11c 4/11/2025||

You can already do this. Just register the binfmt-misc with the `C` flag; binfmt-misc takes priority over builtin binfmts.

Make sure your interpreter is very carefully written!

nyrikki 4/10/2025||

You don't. Privilege escalation is trivial, so the kernel ignores suid/sgid on scripts. As an example, with a shell in the shebang:

        $ cd /tmp
        $ ln /path/to/setuid_script -i
        $ PATH=.
        $ -i

Now you have a root shell!
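Why this works can be seen from how the interpreter's argument list is put together. A minimal Python sketch, assuming the script's path reaches the kernel as the bare name `-i` (as the classic description of this attack has it) and that the script starts with `#!/bin/sh`; the function name is made up for the example.

```python
def script_argv(shebang_line, script_path, args=()):
    """Build the argv the interpreter receives for a '#!' script.

    Shape: [interpreter, optional single argument, script path, *args].
    """
    body = shebang_line[2:].strip()
    interpreter, _, optional = body.partition(" ")
    argv = [interpreter]
    if optional.strip():
        argv.append(optional.strip())   # everything after the blank: one arg
    argv.append(script_path)
    argv.extend(args)
    return argv


# The link named "-i" turns into an option for the shell.
print(script_argv("#!/bin/sh", "-i"))     # ['/bin/sh', '-i']
# With "#!/bin/sh -" the lone "-" ends option processing first.
print(script_argv("#!/bin/sh -", "-i"))   # ['/bin/sh', '-', '-i']
```

In the first case the setuid shell sees `-i` as a request for an interactive session rather than as a file to run.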

amelius 4/11/2025||

Clear, but I was asking for a fix, and implied a secure fix.

For example:

    #!(checksum) python3

could be a secure fix (though not super convenient still, but better than nothing).

nyrikki 4/11/2025||

The typical solution is to have the interpreter call a wrapper program or, better yet, use RBAC etc. to remove the need for suid in the first place.

Sudo, ..., group membership, even using udev to emit devices with those permissions automatically, etc. all work.

It is horses for courses though with different costs and benefits, but the number of use cases that require suid is tiny.

If you give me more details I can make some suggestions.

Doing a strace on Nvidia's container toolkit in rootless mode is a wrapper example IIRC.

Note that the popularity of BusyBox and containers is an example where there are serious, easy-to-miss holes: as all processes have access to proc/pid/self, it is trivial to access entry points that have a large attack surface.

I found a copy of the decades-old FAQ I remember the above vulnerability from; it covers other situations too.

http://www.faqs.org/faqs/unix-faq/faq/part4/section-7.html

I know all the above seems more complicated, but the reality is that invoking an interpreter as suid is subject to Rice's theorem: you don't control the run time (semantic) behavior, so avoiding privilege escalation is impossible.

In an executable, running suid still requires encoding run time behavior into syntactic behavior, which is still a non-trivial task. Especially if you don't leverage privilege/capabilities dropping and/or selinux/apparmor/etc.

While suid itself isn't dangerous, auditing the executable to make sure it doesn't introduce risks is hard, especially over time.

Note there are other protections that come into play that are difficult to track down, like the gnu linker explicitly nuking the LD_* env vars on suid, attempting to avoid return oriented programming attacks etc.

If you prefer the logic side consider "does halt" from the implicant form of Horn clauses.

    (∃)(∃)(q_f,x,y)

The above is derivable from clauses if and only if the machine halts. Which is just HALT in prolog.

Having potential non-finite models is what generalizes to Rice, and extends to partial and even total functions in finite time.

Perfectly implemented code, or the typical problem of perfectly modifying code in the future, is hard with executables and simply intractable for interpreters.

So what you are asking for, a secure fix, simply doesn't exist for suid interpreters. If we could do it, we would solve HALT.

fracus 4/10/2025||

The shebang seems underbaked to me. There is no way to reference a user's home directory AFAIK. I came across this annoyance when trying to make my python virtual environment portable.

AdieuToLogic 4/11/2025||

> The shebang seems underbaked to me. There is no way to reference a user's home directory AFAIK. I came across this annoyance when trying to make my python virtual environment portable.

You can use env[0] to invoke a command based on `$PATH` if desired:

  The env utility uses the PATH environment variable to 
  locate the requested utility if the name contains no `/' 
  characters, unless the -P option has been specified.

For shells which have limited shebang functionality[1], specifying the POSIX shell location and then using its builtin `exec` support can suffice.

0 - https://man.freebsd.org/cgi/man.cgi?query=env&apropos=0&sekt...

1 - http://www.acadix.biz/Unix-guide/HTML/ch02s20.html

fsckboy 4/11/2025||

i don't get it.

you need to know where the script is to read the shebang... if it's in the user's home directory, you are already there or you either found it on the path or typed it in... then with bash, tilde's everywhere else you need them?

fracus 4/11/2025|||

I don't want my user's home directory in the script and therefore, the shebang, for portability reasons. The best solution would be to be able to refer to the location of the virtual environment in the shebang relative to the location of the script instead of the current working directory.
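Since the shebang itself can't express "relative to this script", one workaround is to start under a generic `#!/usr/bin/env python3` and have the script re-exec itself under the environment next to it. A rough Python sketch, assuming a POSIX venv layout and a directory named `.venv` beside the script; both function names are made up for the example.

```python
import os
import sys


def venv_root(script_path, venv_name=".venv"):
    """Directory of the virtual environment sitting next to the script."""
    script_dir = os.path.dirname(os.path.realpath(script_path))
    return os.path.join(script_dir, venv_name)


def reexec_in_venv(script_path, argv, venv_name=".venv"):
    """Replace this process with the venv's Python, unless already in it."""
    root = venv_root(script_path, venv_name)
    python = os.path.join(root, "bin", "python")   # assumption: POSIX layout
    already_inside = os.path.realpath(sys.prefix) == os.path.realpath(root)
    if os.path.exists(python) and not already_inside:
        os.execv(python, [python, script_path, *argv])


# Intended use, at the very top of the script:
#     reexec_in_venv(__file__, sys.argv[1:])
```

The check against `sys.prefix` is there to stop the re-exec from looping once the script is running inside the environment.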

fsckboy 4/11/2025||

user's home directory is in /etc/passwd and in $HOME in env. the way it's done is, you refer to these