On Blaming Cosmic Rays

19 Nov 2018 Gregory J. Stein
Tags Editorial

It is a sort of running joke to blame cosmic rays for unrepeatable bugs in computer code. In reality, it's more likely that your RAM is bad or that your CPU has an implementation bug. Yet there remains something exciting, if terrifying, about the idea of high-energy particles raining down from the heavens and flipping a single bit, rendering your compiled program unexpectedly useless. As such, the legend persists.
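To see why a single flipped bit is so destructive, here is a minimal sketch (the `flip_bit` helper is hypothetical, just for illustration) of what toggling one bit in the IEEE-754 representation of a double does to its value:

```python
import struct

def flip_bit(x: float, bit: int) -> float:
    """Return x with one bit of its 64-bit IEEE-754 representation flipped."""
    # Reinterpret the float's bytes as a 64-bit unsigned integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # Toggle the requested bit and reinterpret back as a float.
    (y,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return y

print(flip_bit(1.0, 52))  # low exponent bit flipped: 1.0 becomes 0.5
print(flip_bit(1.0, 62))  # high exponent bit flipped: 1.0 becomes inf
```

A flip in the mantissa nudges the value slightly; a flip in the exponent or sign bit can change it by hundreds of orders of magnitude, which is why a single upset can derail an entire computation.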

Readers who know me well may recall that my Master’s thesis was in the field of high-peak-power lasers. Our lab studied, among other things, a process by which room-sized lasers could be used to coherently excite electrons and thereby convert light from infrared frequencies into X-rays. Since the conversion process was pretty weak, we needed extremely sensitive detectors that were capable of responding to single photons.

The long-term goal was the creation of an X-ray laser, which could be used for medical imaging; it’s probably over a decade away from practical use.

The high sensitivity of the photodetector was a mixed blessing: we needed it to detect the weak signal produced by our experiment, but it amplified other sources of noise as well. During data collection, the detector would occasionally register massive spikes, saturating parts of the output image. If we left the detector on for too long, the resulting image would fill with scattered noise that could overwhelm our signal. Thermal fluctuations, nuclear radiation, and cosmic rays all contributed to the difficulty of the experiment. There was nothing we could do to lower this noise floor: the universe was our enemy.

I did a bit of digging and found an article entitled Background Events in Microchannel Plates that suggests that cosmic rays make up only about 4% of background noise events for the detectors we used. A small factor for sure, but not insignificant either.

For us, cosmic rays were not just some laughable idea that could spontaneously cause our experiment to fail. Instead, cosmic radiation was one of the many inescapable factors that limited how precisely we could measure the output of our laser. Before we even started on the experiment, we needed to convince ourselves that the theoretical yield of our system would be above this noise floor. My experiences in that lab taught me an invaluable lesson:

Only by understanding what limits your performance can you hope to have a reasonable estimate of what is possible.

Fortunately, computers are designed to be robust to thermal noise and low levels of nuclear radiation, and the likelihood of cosmic rays flipping bits on my machine is vanishingly small. Yet every experiment I run now as a roboticist is limited by something unavoidable. If my motion-planning algorithm runs poorly, it is easy to wish for a faster processor or a better camera. But, fundamentally, these limitations encourage us to improve: to make algorithms that are faster, or more robust to motion blur.

It is worth noting that stray neutrons, knocked loose in the atmosphere by cosmic rays, have been proposed as an explanation for several otherwise-unexplainable electronics failures.

By the way, we did manage to detect a signal from our laser and complete the experiment. If you are curious, my paper is here.

What are your cosmic rays and how have you worked around them? I welcome your comments below or on Hacker News.