Posted by thunderbong 2 days ago
It's possible to have a scripting language support extra command line arguments after the null byte, which is less disruptive to the syntax than recognizing arguments from a second line.
I.e.
#!/path/to/interpreter --arg<NUL>--more --args<LF>
Or #!/usr/bin/env interpreter<NUL>--all --args<LF>
On some OSes, you only get one arg: everything after the space, to the end of the line, is one argument. When we stick a <NUL> there, that argument stops there; but our interpreter can read the whole line, including the <NUL>, up to the <LF> and then extract additional arguments between the <NUL> and the <LF>.
https://www.nongnu.org/txr/txr-manpage.html#N-74C247FD
The interpreter could get the arguments in other ways, like from a second line after the hash bang line. But with the null hack, all the processing revolves around just the one hash bang line. You can retrofit this logic into an interpreter that already knows how to ignore the hash bang line, without doing any work beyond getting it to load the line properly with the embedded NUL and extract the arguments. You don't have to alter the syntax to specially recognize a hash bang continuation line.
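A minimal sketch of the interpreter side of this, in Python. It assumes the extra arguments are plain whitespace-separated words; the function name `split_hashbang_line` is made up for illustration, and TXR's real rules are the ones in the manual linked above.

```python
def split_hashbang_line(first_line: bytes):
    """Return (the part the kernel sees, extra args hidden after the NUL)."""
    if not first_line.startswith(b"#!"):
        raise ValueError("not a hash bang line")
    # Drop the "#!" prefix and the line terminator (LF or CR LF).
    body = first_line[2:].rstrip(b"\r\n")
    # Everything before the first NUL is what the kernel passes along;
    # everything after it is only visible to code that reads the file.
    visible, nul, hidden = body.partition(b"\0")
    # Assumption: extra arguments are separated by whitespace.
    extra = hidden.decode().split() if nul else []
    return visible, extra


visible, extra = split_hashbang_line(b"#!/usr/bin/env interpreter\0--all --args\n")
print(visible)  # b'/usr/bin/env interpreter'
print(extra)    # ['--all', '--args']
```

The interpreter would then splice `extra` into its own argument list before the script name.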
> If during parsing lines in the script, ksh finds a NUL byte on the line, it should abort ("syntax error: NUL byte unexpected").
https://www.undeadly.org/cgi?action=article;sid=202409241057...
ksh is being successfully run, and is able to read a line of the script and find the null byte.
However, ksh rejecting it means we couldn't use the trick with ksh, like to get it to ignore a <CR> in the hash bang line of a script that has <CR><LF> line endings. (Something discussed in a sibling subthread, in regard to a Perl script that failed due to a trailing <CR> in the hash bang line.)
I had a Perl script (way) back in the day that came from a Windows system and it wouldn't work on Linux. After I figured out <cr><nl> was causing the problem, I figured out what binfmt_script (might have been in binfmt_misc) was doing wrong. binfmt_script sees "/bin/perl<cr>" and then fails to find that interpreter.
So I proposed a one-line change which fixed the glitch and posted it to LKML … and promptly got yelled at by Alan Cox for breaking compatibility. I dunno if the null byte breaks the same compatibility. Chapter and verse weren't cited.
About the only way it could break would be if the kernel used a string function to look for the newline, like a range-limited form of strchr, and then aborted the hash bang dispatch with an error upon not finding the newline, rather than accepting that the argument is delimited by a null.
I tested it on various platforms like macOS, Solaris, some BSDs, Cygwin, Linux. Far from exhaustive, but good coverage of the modern desktop and server landscape.
The null byte would have fixed your Perl script without having to convert the line endings; the argument would have been delimited, in spite of the line ending in <CR><LF>.
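To make the <CR> case concrete, here is a toy model in Python of how a kernel might pick the interpreter path out of the first line. This is a simplified assumption for illustration (the line ends at the first NUL or LF, then splits at the first space), not the code of any real kernel, and `kernel_view` is a made-up name.

```python
from typing import Optional, Tuple


def kernel_view(first_line: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Toy model: return (interpreter path, the single optional argument)."""
    if not first_line.startswith(b"#!"):
        raise ValueError("not a hash bang line")
    body = first_line[2:]
    # Assumption: the line is cut at the first NUL or LF, whichever comes first.
    cut = [i for i in (body.find(b"\0"), body.find(b"\n")) if i != -1]
    if cut:
        body = body[: min(cut)]
    # Only spaces and tabs are trimmed, so a trailing CR survives.
    body = body.strip(b" \t")
    path, sep, arg = body.partition(b" ")
    return path, (arg.strip(b" \t") or None) if sep else None


print(kernel_view(b"#!/bin/perl\r\n"))    # (b'/bin/perl\r', None)
print(kernel_view(b"#!/bin/perl\0\r\n"))  # (b'/bin/perl', None)
```

In the first call the <CR> ends up inside the path, which is why the interpreter is not found; in the second the NUL delimits the path before the <CR> is reached.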
But can you / somebody please explain what this means
According to the official Kernel Admin Guide:
This Kernel feature allows you to invoke almost (for restrictions see below) every program by simply typing its name in the shell. This includes for example compiled Java(TM), Python or Emacs programs. To achieve this you must tell binfmt_misc which interpreter has to be invoked with which binary. Binfmt_misc recognises the binary-type by matching some bytes at the beginning of the file with a magic byte sequence (masking out specified bits) you have supplied. Binfmt_misc can also recognise a filename extension aka .com or .exe.
It’s another way to tell the Kernel what interpreter to run when invoking a program that’s not native (ELF). For scripts (text files) we mostly use a shebang, but for byte-coded binaries, such as Java’s JAR or Mono EXE files, it’s the way to go!
Like, can you give me an example of what you mean? What are its use cases, if any? I've read it many times, always with some sort of enthusiasm, because the sentence ending in an exclamation point makes me feel like it's huge, yet I just can't understand its significance.
Does it mean we can have .jar files which can then run shebang-like, so we don't need #!? Can this also be used for main.go, or any other language which has some issues with #!?
I see there are some interpreters for golang, rust etc. which just compile the file, but they were just too complex. I am imagining something like a simple go file which is valid golang but can be run by Linux simply with ./ and it autocompiles it...
For example, the following (which I grabbed from Wikipedia) `:DOSWin:M::MZ::/usr/bin/wine:` will register `/usr/bin/wine` to run as the wrapper for any .exe that gets executed, with no extra config needed. It simply sees that you tried to run a PE file and will run it in wine.
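For anyone wondering how that string is put together: per the kernel documentation, a rule has the form `:name:type:offset:magic:mask:interpreter:flags` and is registered by writing it to /proc/sys/fs/binfmt_misc/register. A small Python sketch that only builds the string (the helper name `binfmt_rule` is made up, and nothing is written to /proc here):

```python
def binfmt_rule(name, kind, interpreter, magic="", offset="", mask="", flags=""):
    """Build a binfmt_misc rule: :name:type:offset:magic:mask:interpreter:flags"""
    fields = [name, kind, offset, magic, mask, interpreter, flags]
    # The first character of the rule defines the separator; ':' is customary.
    return ":" + ":".join(fields)


# Reproduces the Wine rule quoted above: type M means "match magic bytes".
print(binfmt_rule("DOSWin", "M", "/usr/bin/wine", magic="MZ"))
# :DOSWin:M::MZ::/usr/bin/wine:
```

Type `E` matches on a filename extension instead, in which case the magic field holds the extension.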
That is really nice actually!
This could theoretically be used as a scriptisto alternative, but it requires some command to run. I was hoping there was a way to use something other than a shebang.
I have it set up for Haskell (it's somewhat hackish): there is a preconfigured Haskell project in a certain location with the desired dependencies, imports, etc., so when executing a .hs file, the file is copied into that project as Main and run. A similar setup will work for any language.
Edit: `cabal` and `stack` both have script commands now, so these would be an alternative to the above, downside being that every such script would need the shebang intro with dependencies, etc.
Respectful correction: you feel less mystified, i.e. it has been demystified for you.
PS That nit aside, great question. Sorry I can't provide illumination.
Yeah, I should've said less mystified. Gotta remember it in the future!
Thanks!
Wait, aren't JARs ZIPs (so they have the headers appended on the end of the file)? How does prefix matching help that?
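A quick check in Python suggests why prefix matching can still work: the central directory sits at the end, but each entry is preceded by a local file header, so an ordinary archive starts with the bytes `PK\x03\x04`. This is only a sketch using the standard `zipfile` module; an archive with data prepended (a self-extracting stub, say) would not start this way.

```python
import io
import zipfile


def zip_signatures():
    """Build a tiny JAR-like archive in memory and inspect its markers."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    data = buf.getvalue()
    first_four = data[:4]                    # local file header signature
    eocd_offset = data.rfind(b"PK\x05\x06")  # end-of-central-directory record
    return first_four, eocd_offset


first_four, eocd_offset = zip_signatures()
print(first_four)       # b'PK\x03\x04'
print(eocd_offset > 0)  # True
```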
#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
if 0;
See: https://www.perl.com/article/bang-bang/
While we're here, can someone explain why `which` prints some locations, and for others the whole darn file? Like `which npm` prints the location; `which nvm` prints the whole darn file.
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion" # This loads nvm bash_completion
to your bashrc.
If you write a function in your current session, for example, `which` will show you the content of that function. If you write that command in a file and put that file in your PATH, `which` will show you where it is.
If your which command is a shell builtin, and nvm is a function, then you’re likely seeing the content of that function.
Make sure your interpreter is very carefully written!
$ cd /tmp
$ ln /path/to/setuid_script -i
$ PATH=.
$ -i
Now you have a root shell!
For example:
#!(checksum) python3
could be a secure fix (still not super convenient, but better than nothing).
Sudo, ..., group membership, even using udev to emit devices with those permissions automatically, etc. all work.
It is horses for courses though with different costs and benefits, but the number of use cases that require suid is tiny.
If you give me more details I can make some suggestions.
Doing a strace on Nvidia's container toolkit in rootless mode is a wrapper example IIRC.
Note that the popularity of BusyBox and containers is an example where there are easy-to-miss serious holes: as all processes have access to proc/pid/self, it is trivial to access entry points that have a large attack surface.
I found a copy of the decades-old FAQ I remember the above vulnerability from; it covers other situations too.
http://www.faqs.org/faqs/unix-faq/faq/part4/section-7.html
I know all the above seem more complicated, but the reality is that invoking an interpreter as suid is subject to Rice's theorem: you don't control the run-time (semantic) behavior, so avoiding privilege escalation is impossible.
In an executable, running suid still requires encoding run-time behavior into syntactic behavior, which is a non-trivial task. Especially if you don't leverage privilege/capabilities dropping and/or selinux/apparmor/etc.
While suid itself isn't dangerous, auditing the executable to make sure it doesn't introduce risks is hard, especially over time.
Note there are other protections that come into play that are difficult to track down, like the gnu linker explicitly nuking the LD_* env vars on suid, attempting to avoid return oriented programming attacks etc.
If you prefer the logic side consider "does halt" from the implicant form of Horn clauses.
(∃)(∃)(q_f,x,y)
The above is derivable from clauses if and only if the machine halts. Which is just HALT in Prolog.
Having potential non-finite models is what generalizes to Rice, and extends to partial and even total functions in finite time.
Perfectly implementing code, or the typical problem of perfectly modifying code in the future, is hard with executables and simply intractable for interpreters.
So what you are asking for, a secure fix, simply doesn't exist for suid interpreters. If we could, we would solve HALT.
You can use env[0] to invoke a command based on `$PATH` if desired:
The env utility uses the PATH environment variable to
locate the requested utility if the name contains no `/'
characters, unless the -P option has been specified.
For shells which have limited shebang functionality[1], specifying the POSIX shell location and then using its builtin `exec` support can suffice.
0 - https://man.freebsd.org/cgi/man.cgi?query=env&apropos=0&sekt...
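The lookup rule quoted from the man page can be sketched in a few lines of Python. This is an illustration of the rule only, not how env is implemented, and `resolve_like_env` is a made-up name; the standard library's `shutil.which` does a similar search.

```python
import os


def resolve_like_env(name, path=None):
    """Resolve a utility name the way the quoted rule describes."""
    # A name containing '/' is used as given; PATH is not consulted.
    if "/" in name:
        return name
    search = os.environ.get("PATH", "") if path is None else path
    for directory in search.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(resolve_like_env("./myscript"))  # ./myscript
```

So `#!/usr/bin/env interpreter` only hard-codes the location of env itself; where the interpreter lives is left to `$PATH`.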
you need to know where the script is to read the shebang... if it's in the user's home directory, you are already there or you either found it on the path or typed it in... then with bash, tilde's everywhere else you need them?