Posted by mooreds 10 hours ago
This is how basically all of the useful tests I've written have ended up working. (Including, yes, tests for an internal programming language.) The language is irrelevant, and the target system is irrelevant. All you need to be able to do is run something and capture its output somehow.
(You're not wrong to note that the first-draft basic approach can still be improved. I've had a lot of mileage from adding stuff: producing additional useful output files (image diffs in particular are very helpful), copying input and output files around so they're conveniently accessible when sizing up failures, poking at test runner setup so it scales well with core count, more of the same so that it's easy to re-run a specific problem test in the debugger, and so on. But the basic principle is always the same: does actual output match expected output, yes (success)/no (fail).)
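For anyone who wants the basic principle above in concrete form, here is a minimal sketch in POSIX shell. It is not taken from `tc` or from any particular framework; the function name `run_golden_test` and the expected-file layout are made up for illustration. It runs a command, captures its output, and diffs it against a stored expected file.

```shell
#!/bin/sh
# Hypothetical helper (not part of tc): run_golden_test <expected-file> <command> [args...]
# Returns 0 if the command's output matches the expected file, 1 otherwise.
run_golden_test() {
    expected_file=$1
    shift

    # Capture stdout and stderr in a temp file so it can be inspected on failure.
    actual_file=$(mktemp)
    "$@" > "$actual_file" 2>&1

    # diff prints nothing and exits 0 when the two files are identical.
    if diff -u "$expected_file" "$actual_file"; then
        rm -f "$actual_file"
        echo "PASS: $*"
        return 0
    fi

    # Keep the actual output around for sizing up the failure.
    echo "FAIL: $* (actual output kept at $actual_file)"
    return 1
}
```

Usage would look something like `run_golden_test expected/hello.txt ./my_program --flag`, where both the path and the program are placeholders. Keeping the actual output on failure is a small nod to the "conveniently accessible" point above.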
In this tc framework, all that matters is the output of the command. The only part that must be customized to adapt to different languages is:
result=$(command)
It's at https://github.com/ahoward/tc/blob/main/specs/002-we-need-a/...
One file per test case, as `tc` does, works, but in my experience it tends to fall apart a bit at large scale.