
Posted by speckx 1 day ago

How NASA built Artemis II’s fault-tolerant computer (cacm.acm.org)
579 points | 216 comments | page 2
dom111 10 hours ago|
I always wondered if the "radiation hardening" approaches of the challenges like this https://codegolf.stackexchange.com/questions/57257/radiation... (see the tag for more https://codegolf.stackexchange.com/questions/tagged/radiatio...) would be of any practical use... I assume not, as the problem is on too many levels, but still, seems at least tangentially relevant!
jbritton 20 hours ago||
I wonder how often problems happen that the redundancy actually solves. Is radiation really flipping bits, and at what frequency? Can a solar flare cause all the computers to go haywire?
EdNutting 20 hours ago||
Not a direct answer but probably as good information as you can get: https://static.googleusercontent.com/media/research.google.c...

Basically, yes, radiation does cause bit flips, and more often than you might expect: still rare in the grand scheme of things, but frequent enough to matter.

And radiation in space is much “worse” (in quotes because that word is glossing over a huge number of different problems, both just intensity).

EdNutting 13 hours ago||
Typo: “both” ~ “not”
Tomte 13 hours ago|||
IEC 61508 estimates a soft error rate of about 700 to 1200 FIT (Failure in Time, i.e. 1E-9 failures/hour).

That was in the 2000s though, and for embedded memory above 65nm.

And obviously on earth.
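
To put the FIT range above into more familiar units, here is a small worked conversion. It is only a sketch: it assumes the quoted figure applies to a single unit (the comment does not say whether it is per device or per megabit) and that errors arrive at a constant rate. The function names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def fit_to_failures_per_hour(fit: float) -> float:
    """1 FIT is one failure per 1e9 hours of operation."""
    return fit * 1e-9


def mean_years_between_failures(fit: float) -> float:
    """Mean time between failures in years, assuming a constant failure rate."""
    hours_between_failures = 1.0 / fit_to_failures_per_hour(fit)
    return hours_between_failures / HOURS_PER_YEAR


for fit in (700, 1200):
    years = mean_years_between_failures(fit)
    print(f"{fit} FIT is about one soft error every {years:.0f} years per unit")
```

Under those assumptions a single unit goes roughly 95 to 163 years between soft errors, which is why the rate mostly matters once you multiply by many units or move to a harsher environment.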

kev009 12 hours ago||
Some people are claiming it's the good old RAD750 variant. Is there anything that talks about the actual computer architecture? The linked article is desperately void of technical details.
u1hcw9nx 11 hours ago|
It's a new (2002) variant of the same RAD750 architecture.

  CPUs: IBM PowerPC 750FX (single-core, 900 MHz, 32-bit, radiation hardened)
  RAM: 256 MB (per processor)
  OS: VxWorks (real-time OS)
  Network: TTEthernet (Time-Triggered Ethernet) at 1 Gbps
  Programming: MISRA C++, flight control laws from Simulink and MATLAB.
bharat1010 7 hours ago||
The part about triple-redundant voting systems genuinely blew my mind — it's such a different world from how most of us write software day to day, and honestly kind of humbling.
sebazzz 6 hours ago||
I wonder how the voting components are protected from integrity failures?
doublerabbit 6 hours ago||
The Hyperia roller coaster ride at Thorpe Park uses triple-redundant voting. Which I thought was cool.

> It’s a complex machine. There’s three computers all talking to each other for a start, and they have to agree on everything.

Primary, Real-Time Secondary and Third for regulating votes.

https://www.bbc.co.uk/news/articles/ckkknz9zpzgo
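
For anyone who has not seen the idea written down, this is a minimal sketch of generic majority voting across redundant channels. It is not how Hyperia or Orion implement it (neither source gives that detail); the function names and the use of exact equality to compare outputs are assumptions for illustration.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of channels, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # A strict majority means more than half of the channels agree.
    return value if count > len(outputs) // 2 else None


def disagreeing_channels(outputs: Sequence[Hashable], voted: Hashable) -> List[int]:
    """Indices of channels whose output differs from the voted value."""
    return [i for i, out in enumerate(outputs) if out != voted]


readings = [42, 42, 7]          # channel 2 has a corrupted result
voted = majority_vote(readings)
print(voted, disagreeing_channels(readings, voted))
```

Note that this sketch does nothing to protect the voter itself, which is the question raised above; real systems need a separate answer for that.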

starkparker 1 day ago||
Headline needs its how-dectomy reverted to make sense
arduanika 20 hours ago|
(Off-topic:) Great word. Is that the usual word for it? Totally apt, and it should be the standard.
JumpCrisscross 14 hours ago||
Does anyone know how this compares to Crew Dragon or HLS?
guenthert 11 hours ago||
Multiple and dissimilar redundancy is nice and all that, but is there a manual override? Apollo could be flown manually (and at least on Apollo 11 and 13 it had to be), but is this still possible and feasible? I'd guess so, as it's still manned by (former) test pilots, much like Apollo.
vhiremath4 16 hours ago||
> “Along with physically redundant wires, we have logically redundant network planes. We have redundant flight computers. All this is in place to cover for a hardware failure.”

It would be really cool to see a visualization of redundancy measures/utilization over the course of the trip to get a more tangible feel for its importance. I'm hoping a bunch of interesting data is made public after this mission!

PunchyHamster 6 hours ago||
I wonder how they made the voted-answer-picker fail-resistant
lrvick 12 hours ago|
NASA describes some impressive work for runtime integrity, but the lack of mention of build-time security is surprising.

I would expect to see multi-party-signed deterministic builds etc. Anyone have any insight here?

ranger207 6 hours ago|
What would the threat profile be here to require that? Regardless, I'd be a little surprised if they didn't have anything like that; provenance is very important in aerospace, with hardware tracked to the point that NTSB investigators looking at a crash can tell what ingot a bolt was made from
lrvick 1 hour ago||
In my experience government just uses RedHat, which is -not- reproducible and -not- full source bootstrapped, so a single person in the supply chain could maliciously or accidentally backdoor everything. Maybe the goal of the supply chain attacker is just embarrassing the Americans at best, or causing a material loss of life at worst.

I would -hope- NASA does not trust their OS supply chains to a single person for high risk applications, but given that even major companies I audit do this with billions of dollars on the line, it would not shock me if NASA has the same stance, which worries me a bit.

They would need to be using something like heavily customized buildroot or stagex to produce deterministic OS images.
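
To make the deterministic-build idea concrete, here is a minimal sketch of the agreement check: several independent builders each report a SHA-256 digest of the image they built, and the image is accepted only if enough of them match. It covers digest agreement only, not signature verification, and the builder names, threshold, and function names are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Mapping, Optional


def sha256_hex(artifact: bytes) -> str:
    """Hex SHA-256 digest of a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


def agreed_digest(digests_by_builder: Mapping[str, str], threshold: int) -> Optional[str]:
    """Return the digest reported by at least `threshold` builders, else None."""
    if threshold * 2 <= len(digests_by_builder):
        # Require a strict majority so two different digests cannot both qualify.
        raise ValueError("threshold must be more than half the number of builders")
    if not digests_by_builder:
        return None
    digest, count = Counter(digests_by_builder.values()).most_common(1)[0]
    return digest if count >= threshold else None


# Hypothetical builders; the third one produced a different image.
builds = {
    "alice": sha256_hex(b"image-v1"),
    "bob": sha256_hex(b"image-v1"),
    "carol": sha256_hex(b"image-v1-tampered"),
}
print(agreed_digest(builds, threshold=2))
```

This only works if the build is bit-for-bit reproducible; otherwise honest builders would disagree with each other and the check would never pass.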
