Posted by surprisetalk 6 days ago

The story of Max, a real programmer (incoherency.co.uk)
129 points | 93 comments | page 2
fithisux 6 days ago|
I think Max's brain was not polluted with terror, and he showed trust in his tools.

Today many devs (and not programmers) are always suspicious, terrified of the potential of something going wrong, because someone will point a finger even if the error is harmless or improbable.

My experience is that many modern devs are incapable of assigning significance or probabilities; they are usually not creative, fearful of "not using best practices", and do not take into consideration the anthropic aspect of software.

My 2 cents

xyzzy123 3 days ago||
For years, every external pentest of companies with old-school stuff like this on their perimeter has been finding these things and exploiting them, and there are usually several webshells and other weird stuff already on the server by the time the testers get to it. Very often the company forgot, or didn't know they had the thing.

The end state of running 15-year-old unmaintained PHP is that you accumulate webshells on your server or it gets wiped. Or you just lose it or forget about it, or the server stops running, because the same dev practices that got you the PHP mean you probably don't bother with things like backups, config management, version control, IaC etc (I don't mean the author, who probably does care about those things, I just mean in general).

If these things are not a big deal (and often they're not! and it's fun!) then absolutely go for it. In a non-work context I have no issues.

TBH I'm not 100% sure that either the PHP version _or_ the Go version of that code is free from RCE-style problems. I think it depends on server config (modern PHP defaults are probably fine), binary versions (an old exiftool would bone you), OS (Windows path handling can be surprising) and internal details about how the commands handle flags and paths. But as you point out, it probably doesn't matter.

Am I just doing the meme? :)

mrheosuper 2 days ago||
OTOH, even a small programming mistake can be exceptionally expensive. Remember CrowdStrike: a NULL dereference that cost hundreds of millions of dollars.

An unsafe string can be abused as an attack vector into your system.

s1mplicissimus 3 days ago||
> For years Imagebin was wide open to the public and anybody could upload their own images to it. Almost nobody did.

There's your explanation why it could be so simple

stavros 3 days ago|
The next version was equally open to the public.
msteffen 3 days ago||
> The reason the Go code is so much bigger is because it checks and (kind of) handles errors everywhere (?) they could occur

I’ve said before and will say again: error handling is most of what’s hard about programming (certainly most of what’s hard about distributed systems).

I keep looking for a programming language that makes error handling a central part of the design (rather than focusing on non-error control flow of various kinds), but honestly I don’t even know what would be better than the current options (Java/Python’s exceptions, or Go’s multiple returns, or Rust’s similar-seeming Result<T, E>). I know Linus likes using goto for errors (though I think it just kind of looks like try/catch in C) but I don’t know of much else.

It would need to be the case that code that doesn’t want to handle errors (like Max’s simple website) doesn’t have any error handling code, but it’s easy to add, and common patterns (e.g. “retry this inner operation N times, maybe with backoff and jitter, and then fail this outer operation, either exiting the program or leaving unaffected parts running”) are easy to express.
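
For concreteness, a minimal sketch of that retry pattern in Go; retryWithBackoff and its signature are made up for illustration, not taken from any particular library:

    package main

    import (
        "context"
        "fmt"
        "math/rand"
        "time"
    )

    // retryWithBackoff retries op up to attempts times, sleeping with
    // exponential backoff plus random jitter between tries, and stops
    // early if the surrounding context is cancelled. Assumes base > 0.
    func retryWithBackoff(ctx context.Context, attempts int, base time.Duration, op func() error) error {
        var err error
        for i := 0; i < attempts; i++ {
            if err = op(); err == nil {
                return nil
            }
            // base * 2^i plus up to one extra base of random jitter
            sleep := base<<i + time.Duration(rand.Int63n(int64(base)))
            select {
            case <-time.After(sleep):
            case <-ctx.Done():
                return ctx.Err()
            }
        }
        return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
    }

    func main() {
        calls := 0
        err := retryWithBackoff(context.Background(), 5, 100*time.Millisecond, func() error {
            calls++
            if calls < 3 {
                return fmt.Errorf("transient failure %d", calls)
            }
            return nil
        })
        fmt.Println("calls:", calls, "err:", err)
    }
Code that doesn’t want to handle errors never touches any of this; the cost only shows up where you opt in.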

rauhl 3 days ago||
Have you seen Common Lisp’s condition system? It’s a step above exceptions, because one can signal a condition in low-level code, handle it in high-level code and then resume back at the lower level, or anywhere in between which has established a restart.

https://gigamonkeys.com/book/beyond-exception-handling-condi... is a nice introduction; https://news.ycombinator.com/item?id=24867548 points to a great book about it. I believe that Smalltalk ended up using a similar system, too.

> It would need to be the case that code that doesn’t want to handle errors (like Max’s simple website) doesn’t have any error handling code, but it’s easy to add, and common patterns (e.g. “retry this inner operation N times, maybe with back off and jitter, and then fail this outer operation, either exiting the program or leaving unaffected parts running”) are easy to express

Lisp’s condition system can handle that! Here’s a dumb function which signals a continuable error when i ≤ 3:

    (defun foo ()
      (loop for i from 0
            do (if (> i 3)
                   (return (format nil "good i: ~d" i))
                   (cerror "Keep going." "~d is too low" i))))
If one runs (foo) by hand then i starts at 0 and FOO signals an error; the debugger will include the option to continue, then i is 1 and FOO signals another error and one may choose to continue. That’s good for interactive use, but kind of a pain in a program. Fortunately, there are ways to retry, and to even ignore errors completely.

If one wishes to retry up to six times, one can bind a handler which invokes the CONTINUE restart:

    (let ((j 0))
      (handler-bind ((error #'(lambda (c)
                                 (declare (ignore c))
                                 ;; only retry six times
                                 (unless (> (incf j) 6)
                                   (invoke-restart 'continue)))))
        (foo)))
If one wants to ignore errors, then (ignore-errors (foo)) will run and handle the error by returning two values: NIL and the first error.
msteffen 2 days ago||
I had heard CL’s error handling was different but didn’t understand the details. Thanks for the explanation!
immibis 2 days ago|||
Abstracting error checking pays huge dividends, then. In PHP, if something crashes, it continues running and outputs nonsense (probably alright for the simplest of sites but you should turn this off if your thing has any kind of authentication) or it stops processing the page. PHP implicitly runs one process per request (not necessarily an OS process); everything is scoped to the request, and if the request fails it can just release every resource scoped to the request, and continue on. You could do the same in a CGI script by calling exit or abort. With any platform that handles all concurrent requests in a single process, you have to explicitly clean up a bunch of stuff, flush and close the response, and so on.

There's a similar effect in transactional databases - or transactional anything. If you run into any problem, you just abort the transaction and you don't have to care about individual cleanup steps.
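
That pattern is easy to sketch with Go's database/sql, for example; the handleRequest function and the accounts table here are made up for illustration:

    package example

    import (
        "context"
        "database/sql"
    )

    // handleRequest sketches "abort the transaction instead of cleaning up step
    // by step": every statement runs inside tx, and the single deferred Rollback
    // undoes everything at once on any early return. After a successful Commit,
    // the deferred Rollback is a harmless no-op.
    func handleRequest(ctx context.Context, db *sql.DB) error {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return err
        }
        defer tx.Rollback()

        // Placeholder syntax depends on the driver; "?" is used here as an example.
        if _, err := tx.ExecContext(ctx, "UPDATE accounts SET balance = balance - 10 WHERE id = ?", 1); err != nil {
            return err // no per-step cleanup needed
        }
        if _, err := tx.ExecContext(ctx, "UPDATE accounts SET balance = balance + 10 WHERE id = ?", 2); err != nil {
            return err
        }
        return tx.Commit()
    }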

cratermoon 3 days ago|||
1. Define Errors Out of Existence: https://wiki.tcl-lang.org/page/Define+Errors+Out+of+Existenc...

2. Treat errors not as something going wrong but as incomplete actions leading to alternate valid code paths.

On the second point, make errors part of the domain, and treat them as a kind of result outside the scope of the expected. Be like jazz musician Miles Davis and instead of covering up mistakes, make something wrong into something right. https://www.youtube.com/watch?v=FL4LxrN-iyw&t=183
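
A tiny Go sketch of that second point; the LookupResult type and lookup function are invented for illustration, not from the article. The miss isn't raised as an error at all — it's just the other expected outcome of the operation:

    package main

    import "fmt"

    // LookupResult makes "not found" a normal, expected outcome of the domain
    // rather than an exception to be caught.
    type LookupResult struct {
        Value string
        Found bool
    }

    func lookup(db map[string]string, key string) LookupResult {
        v, ok := db[key]
        return LookupResult{Value: v, Found: ok}
    }

    func main() {
        db := map[string]string{"max": "real programmer"}
        for _, key := range []string{"max", "nobody"} {
            // Both branches are valid code paths; neither is "something going wrong".
            if r := lookup(db, key); r.Found {
                fmt.Println(key, "=>", r.Value)
            } else {
                fmt.Println(key, "not present; fall back to a default")
            }
        }
    }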

WorldMaker 2 days ago||
In terms of developer ergonomics, try/catch seems among the best we've come up with so far. We want to focus on the success case and leave the error case as a footnote.

That's the simplicity argument here too: sometimes we only want to write the success case, and are happy with platform defaults for error reporting. (Another thing PHP handled out-of-the-box because its domain was so constrained: it started out with strong default HTML output for error conditions that's fairly readable and useful for debugging. It's also useful for information-disclosure attacks, which is why the defaults and security best practices have shifted so much since the early days of PHP, when even phpinfo() was turned on by default and easy to run to debug some random cgi-bin server you were assigned by the hosting company that week.)

Most of the problems with try/catch aren't even really problems with that form of error handling, but with the types of the errors themselves. In C++/Java/C#/others, when an error happens we want stack traces for debugging, and stack walks are expensive and may require pulling symbol data from somewhere else, which adds even more cost. But that's not actually inherent to the try/catch pattern. You can throw cheaper error types. (In JS you don't have to throw the nice Error family that does stack traces; you could throw a cheap string, for instance. Python has some stack walking tricks that keep its Exceptions somewhat cheaper and a lot lazier, because Python expects try/except to be a common flow control idiom.)

We also know from Haskell do-notation and now async/await in so many languages (and some of Rust's syntax sugar, etc.) that you can have the try/catch syntax sugar but still power it with Result/Either monads. You can have that cake and eat it, too. In JS, a Promise is a future Either<ResolvedType, RejectedType>, but in an async/await function you are writing your interactions with it as "normal JS" try/catch. Both can and do coexist in the same language; it's not really a "battle" between the two styles, the simple conceptual model of try/catch "footnotes" and the robust type-system affordances of a Result/Either monad type.

(If there is a war, it's with Go doing a worst of both worlds and not using a true flat-mappable Monad for its return type. But then that would make try/catch easy syntax sugar to build on top of it, and that seems to be the big thing they don't want, for reasons that seem as much obstinance as anything to me.)

naruhodo 2 days ago||

    files := r.MultipartForm.File["upload"]
    for _, file := range files {
        src, err := file.Open()
        filename := fmt.Sprintf("%d%s", imgNum, filepath.Ext(file.Filename))
        dst, err := os.Create(ORIGINAL_DIR + "/" + filename)
        _, err = io.Copy(dst, src)
Hmmm... can an attacker upload a file named "../../../etc/profile.d/script.sh" or similar ideas, i.e. path traversal?
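
As far as I can tell, only the extension of the client-supplied name reaches that path (filepath.Ext won't return anything containing a separator), but a defensive version might normalise and allowlist it anyway. A rough sketch; sanitizeExt is a made-up helper, not anything from the article:

    package main

    import (
        "fmt"
        "path/filepath"
        "strings"
    )

    // sanitizeExt strips any directory components from the client-supplied
    // filename and only accepts a small allowlist of image extensions,
    // falling back to a fixed default otherwise.
    func sanitizeExt(filename string) string {
        ext := strings.ToLower(filepath.Ext(filepath.Base(filename)))
        switch ext {
        case ".jpg", ".jpeg", ".png", ".gif", ".webp":
            return ext
        default:
            return ".bin"
        }
    }

    func main() {
        fmt.Println(sanitizeExt("holiday.JPG"))                      // ".jpg"
        fmt.Println(sanitizeExt("../../../etc/profile.d/script.sh")) // ".bin"
    }
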
marcofloriano 3 days ago||
"It's so simple that nothing goes wrong."

This. The hardest part of solving a problem is to think about the problem and then come up with the right solution. We actually do the opposite: we write code and then think about it.

riskable 2 days ago|
Speak for yourself! Some of us don't think about the problem at all, write the code, then don't think about it afterwards either!

This is how "features" get added to most Microsoft products these days :thumbsup:

ajd555 3 days ago||
What a great read! And so many good insights! It almost made me want to convert a project to PHP - perhaps I will for a smaller project.

I love the simplicity and some of the great tools that PHP offers out of the box. I do believe that it only works in some cases. I use Go because I need the error handling, the goroutines and a continuously running server to listen for Kafka events. But I always, always try to keep it simple, sometimes preferring a longer function over a useless abstraction that would only add more constraints. This is a great reminder to double my efforts when it comes to KISS!

cratermoon 3 days ago||
A couple of hundred lines of code is always going to be easy to maintain unless it's purposely written in an obfuscated and confusing style. A project with only two maintainers in its lifetime isn't going to be subject to the kind of style meanderings that muck up a codebase that's gone through dozens of maintainers over its lifetime. A couple of thousand lines needs some organization; one function of two thousand lines is impenetrable.
tudorizer 3 days ago||
The gist of the article is a fun thought experiment.

Why count lines of code? Error handling is nothing to sniff at, especially in prod. Imagebin had a small handful of known users; open it up to the world and most of the error handling in Go comes in handy.

For PHP, quite a bit was left on the shoulders of the HTTP server (e.g. routing). The final result in Go is a binary which includes the server. The comparison is not fully fair, unless I'm missing something.

riskable 2 days ago|
Lines of Code has always been a tertiary indicator at best. It's supposed to only be used as a (very) rough indicator when you're trying to figure out the overall complexity of a project. As in, "there's 50 million lines of code in Windows."

Knowing a figure like that, you can reason that it's too big for a single developer. Therefore, you'll likely need at least two; and maybe a few thousand marketing people to sell it.

huhtenberg 3 days ago||
The title is wrong. It should've been

  PHP, a Real Programming Tool.
That's it. That's the story.
naikrovek 3 days ago|
It’s certainly useful, but without looking, what is the difference between these two methods in the standard library:

array_sort

sortArray

Even if you can answer that off the top of your head, consider how ridiculous it is that you needed to memorize that at some point. This is not the only example of such a thing a PHP dev needed to remember to be effective, either.

Any programming language can be wielded in a simple way. Perl, for example, is superior to PHP in every way that is important to me.

Go is as well, even though it’s slightly more verbose than PHP for the author’s imagebin tool.

We don’t do things simply, because we’ve all been taught that complexity is cool and that using new tools is better than using old tools and so on.

My employer creates pods in Kubernetes to run command line programs on a fixed schedule that could (FAR MORE SIMPLY) run in a cronjob on a utility server somewhere. And that cronjob could email us all when there is a problem. But instead I have to remember to regularly open a bookmark to an opensearch host with a stupidly complex 75-character hostname somewhere, find the index for the pod, search through all the logs for errors, and if I find any, I need to dig further to get any useful contextual information about the failure … or I could simply read an email that cron automatically delivered directly to my inbox. We stumble over ourselves to shove more things like that into Kubernetes every day, and it drives me nuts. This entire industry has lost its goddamned mind.

hnthrow90348765 3 days ago||
>This entire industry has lost its goddamned mind.

Yep, stay-with-the-fad pressures mean people need to farm experience using those fads.

It won't change until the industry is okay with slowing down.

naikrovek 2 days ago||
This industry is STILL growing so fast that there is no negative selection pressure to weed out bad ideas. There’s no punishment at all for doing things in stupid or inefficient ways. There is only reward for completing the project.

I like doing things properly and almost no one else at my enormous employer does. Certainly no one on my team does, and it is extremely stressful. I feel like I am talking to a wall when I talk to my team members. No one understands. No one wants to.

Kostarrr 3 days ago|
Ok, now I want to know: does Max's PHP code have security issues? Because especially in early, straightforward PHP, those were all over the place. I vaguely remember PHP 3 just injecting query variables into your variables? But as $_GET is mentioned, that at least is probably not the case...
Retr0id 3 days ago||
Both versions have security issues if you're sufficiently paranoid, because they shell out to exiftool on untrusted input files without any sandboxing. Exiftool has had RCE flaws in the past, and will likely have them again.

But for a service with 1 user, it's fine.
