Welcome to
Caches to Caches

This blog is devoted to the broad interests of Gregory J Stein, which include topics such as Numerical Modeling, Web Design, Robotics, and a number of my individual hobby projects. If there's any article you would like to see, or something you've been wondering about, be sure to let me know on Twitter.


AlphaZero is incredible. If you have yet to read DeepMind's blog post about their recent paper in Science detailing the ins and outs of their game-playing AI, I recommend you do so. In it, DeepMind's scientists describe an intelligent system capable of playing the games of Go, Chess, and Shogi at superhuman levels. Even legendary chess Grandmaster Garry Kasparov says the moves selected by the system demonstrate a "superior understanding" of the games. More remarkable still is that AlphaZero, a successor to the well-known AlphaGo and AlphaGo Zero, is trained entirely via self-play: it was able to learn good strategies without any meaningful human input.

So do these results imply that Artificial General Intelligence is soon to be a solved problem? Hardly. There is a massive difference between an artificially intelligent agent capable of playing chess and a robot that can solve practical real-world tasks, like exploring a building it has never seen before to find someone's office. AlphaZero's intelligence derives from its ability to make predictions about how a game is likely to unfold: it learns to predict which moves are better than others and uses this information to think a few moves ahead. As it learns to make increasingly accurate predictions, AlphaZero gets better at rejecting "bad moves" and is able to simulate deeper into the future. But the real world is almost immeasurably complex, and, to act in it, a system like AlphaZero must choose from a nearly infinite set of possible actions at every instant in time. Overcoming this limitation is not merely a matter of throwing more computational power at the problem:

Using AlphaZero to solve real problems will require a change in the way computers represent and think about the world.
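
To make the idea of prediction-guided search concrete, below is a minimal Python sketch. It is not DeepMind's algorithm (AlphaZero proper uses Monte Carlo Tree Search guided by a deep neural network); policy_value_fn, state.apply, and state.is_terminal are hypothetical stand-ins for a learned predictor and a game interface.

    import math

    def lookahead(state, policy_value_fn, depth=3, top_k=3):
        """Toy prediction-guided search: expand only the most promising moves.

        policy_value_fn(state) stands in for a learned network returning
        (move -> prior probability, estimated value in [-1, 1]).
        """
        if depth == 0 or state.is_terminal():
            _, value = policy_value_fn(state)
            return value
        priors, _ = policy_value_fn(state)
        best = -math.inf
        # Good predictions let us ignore most moves and search deeper instead.
        for move, _ in sorted(priors.items(), key=lambda kv: -kv[1])[:top_k]:
            # Negate the result: a position good for the opponent is bad for us.
            best = max(best, -lookahead(state.apply(move),
                                        policy_value_fn, depth - 1, top_k))
        return best

The point of the sketch is that the better the learned priors, the smaller top_k can be and the deeper the same computational budget lets the search look; that leverage is exactly what accurate predictions buy AlphaZero.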

Yet despite the complexity inherent in the real world, humans are still capable of making predictions about how the world behaves and using this information to make decisions. To understand how, let's consider the way humans learn to play games.


The modern revolution in machine learning and robotics has been largely enabled by access to massive repositories of labeled image data. AI has become synonymous with big data, chiefly because machine learning approaches to tasks like object detection or automated text translation require massive amounts of labeled training data. Yet obtaining real-world data can be expensive, time-consuming, and inconvenient. In response, many researchers have turned to simulation tools, which can generate nearly limitless training data. These tools have become fundamental in the development of algorithms, particularly in the fields of Robotics and Deep Reinforcement Learning.
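
As a toy illustration of why simulation is so appealing, the Python sketch below "renders" trivially simple scenes and emits their labels for free. It is not a real simulator, just a stand-in to show that when you generate the scene yourself, ground truth costs nothing.

    import numpy as np

    def render_scene(img_size=64, rng=np.random.default_rng()):
        """Return a synthetic image and its (free) ground-truth label."""
        image = np.zeros((img_size, img_size), dtype=np.float32)
        shape = rng.choice(["square", "disc"])
        cx, cy = rng.integers(12, img_size - 12, size=2)
        yy, xx = np.mgrid[0:img_size, 0:img_size]
        if shape == "square":
            mask = (np.abs(xx - cx) < 8) & (np.abs(yy - cy) < 8)
        else:
            mask = (xx - cx) ** 2 + (yy - cy) ** 2 < 8 ** 2
        image[mask] = 1.0
        # The label comes straight from the generator: no human annotation.
        return image, {"class": shape, "center": (int(cx), int(cy))}

    # A (nearly) limitless stream of labeled training examples.
    dataset = [render_scene() for _ in range(1000)]

Real simulators like Gazebo or AirSim do the same thing with physics engines and photorealistic rendering in place of my squares and discs, but the economics are identical: the labels come along for the ride.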

This is the first post in a three-part series on the role of simulated image data in the era of Deep Learning. In this post, I discuss the significance of simulation tools in the field of robotics and the promise and limitations of photorealistic simulators.


It is a sort of running joke to blame cosmic rays for unrepeatable bugs in computer code. In reality, it's more likely that your RAM is bad or that your CPU has an implementation bug. Yet there remains something exciting — if terrifying — about the idea of light coming down from the heavens and flipping a single bit, rendering your compiled program unexpectedly useless. As such, the legend persists.

Readers who know me well may recall that my Master's thesis was in the field of high-peak-power lasers. Our lab studied, among other things, a process by which room-sized lasers could be used to coherently excite electrons and thereby convert light from infrared frequencies into X-rays. Since the conversion process was pretty weak, we needed extremely sensitive detectors that were capable of responding to single photons.

The long-term goal was the creation of an X-ray laser, which could be used for medical imaging; it's probably over a decade away from practical use.

The high sensitivity of the photodetector was a mixed blessing: we needed it to detect the weak signal produced by our experiment, but it amplified other sources of noise as well. During data collection, the detector would occasionally register massive spikes in its output, saturating parts of the resulting image. If we left the detector on for too long, the image would fill with scattered noise that could overwhelm our signal. Thermal fluctuations, nuclear radiation, and cosmic rays all contributed to the difficulties in the experiment. There was nothing we could do to lower this noise floor: the universe was our enemy.

I did a bit of digging and found an article entitled Background Events in Microchannel Plates that suggests that cosmic rays make up only about 4% of background noise events for the detectors we used. A small factor for sure, but not insignificant either.
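
As a back-of-the-envelope illustration of why exposure time mattered so much, here is a tiny Python sketch. The background event rate is made up; only the roughly 4% cosmic-ray fraction comes from the paper above.

    import numpy as np

    background_rate_per_s = 5.0   # hypothetical spurious events per second
    cosmic_ray_fraction = 0.04    # ~4%, per the paper cited above

    rng = np.random.default_rng(0)
    for exposure_s in (1, 10, 100, 1000):
        # Background events accumulate roughly as a Poisson process, so
        # longer exposures collect proportionally more noise hits.
        n_events = rng.poisson(background_rate_per_s * exposure_s)
        n_cosmic = rng.binomial(n_events, cosmic_ray_fraction)
        print(f"{exposure_s:>4} s exposure: ~{n_events} background hits "
              f"({n_cosmic} plausibly from cosmic rays)")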


As the use of machine learning systems grows beyond the academic sphere, one of the more worrying trends I have witnessed is a lack of understanding of how machine learning systems should be trained and applied. The lessons the AI community has learned over the last few decades of research are hard-earned, and it should go without saying that those who do not understand the inner workings of a machine learning tool risk having that system fail in ways that are often surprising.

This advice is not limited to AI. Using any stochastic system without an understanding of when or how it is likely to fail comes with inherent risk.

However, the potential advantages of AI are many, and using machine learning to accelerate your business, whether empowering employees or improving your product, may outweigh potential pitfalls. If you are looking to use machine learning tools, here are a few guidelines you should keep in mind:

  • Establish clear metrics for success.
  • Start with the simplest approach.
  • Ask yourself if machine learning is even necessary.
  • Use both a test and a validation dataset (see the sketch just after this list).
  • Understand and mitigate data overfitting.
  • Be wary of bias in your data.
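
To make the fourth and fifth points more concrete, here is a minimal scikit-learn sketch; the dataset is synthetic and the model arbitrary, so treat it as an illustration of the three-way split rather than a recipe.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # A synthetic dataset standing in for your real data.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    # Hold out a test set first and never touch it while iterating.
    X_trainval, X_test, y_trainval, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)
    # Split the remainder into training data and a validation set used for
    # model selection and for spotting overfitting.
    X_train, X_val, y_train, y_val = train_test_split(
        X_trainval, y_trainval, test_size=0.25, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("train accuracy:     ", model.score(X_train, y_train))
    print("validation accuracy:", model.score(X_val, y_val))   # tune against this
    print("test accuracy:      ", model.score(X_test, y_test)) # report this, once

A large gap between training and validation accuracy is the classic symptom of overfitting; the test set is touched only once, at the very end, so it remains an honest estimate of real-world performance.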

Just over two weeks ago, NVIDIA showcased vid2vid, their new technique for video-to-video translation. Their paper shows off a number of different applications, including one particularly striking example in which the researchers automatically convert sketch-like outlines of vlog-style videos from YouTube into compellingly realistic videos of people talking to the camera. The results are incredible and really need to be seen to be believed:

When most people hear the term "translation," they think of translating natural language: e.g. translating text or speech from Mandarin to English. Today I want to reinforce the idea that translation can be applied to types of data beyond language. The vid2vid paper I mentioned above is just the latest and most visually striking example of the transformative power of AI, and modern machine learning is making incredibly rapid progress in this space.

Machine learning is, of course, an incredibly powerful tool for language translation. Recently, researchers from Microsoft achieved human-level performance translating news articles from Mandarin to English.

In the remainder of this article, I will cover:

  • A brief definition of "translation" in the context of AI;
  • An overview of how modern machine learning systems tackle translation;
  • A list of application domains and some influential research for each.
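
As a small preview of the second item, most neural translation systems share an encoder-decoder structure: encode the source into an intermediate representation, then decode that representation into the target. The sketch below, a hypothetical PyTorch skeleton for token sequences, shows the pattern; the same encode-then-decode idea carries over when the "sentences" are images or video frames instead.

    import torch.nn as nn

    class Seq2Seq(nn.Module):
        """Bare-bones encoder-decoder for sequence-to-sequence translation."""

        def __init__(self, src_vocab, tgt_vocab, hidden=256):
            super().__init__()
            self.src_embed = nn.Embedding(src_vocab, hidden)
            self.tgt_embed = nn.Embedding(tgt_vocab, hidden)
            self.encoder = nn.GRU(hidden, hidden, batch_first=True)
            self.decoder = nn.GRU(hidden, hidden, batch_first=True)
            self.out = nn.Linear(hidden, tgt_vocab)

        def forward(self, src_tokens, tgt_tokens):
            # Encode the source sequence into a hidden state...
            _, state = self.encoder(self.src_embed(src_tokens))
            # ...then decode the target conditioned on that state
            # (teacher forcing during training).
            dec_out, _ = self.decoder(self.tgt_embed(tgt_tokens), state)
            return self.out(dec_out)  # per-step scores over the target vocabulary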

Finding examples of "problematic" AI is relatively easy these days. Microsoft inadvertently gave rise to an unhinged, neo-Nazi Twitter bot, while an AI beauty contest judge seems to strongly favor white women. Despite the sensational nature of these examples, they reflect a pervasive problem plaguing many modern AI systems.

Machine learning is designed to discover and exploit patterns in data so as to optimize some notion of performance. Most measures of good performance amount to maximizing accuracy, yet accuracy alone is a sufficient metric only when near-perfect performance is actually achievable. When a task is difficult enough that the system is prone to errors, AI agents may fail in ways that we, as humans, would consider unfair, or may exploit undesirable patterns in the data. Here, I discuss the issue of bias in AI and argue that great care must be taken when training a machine learning system to avoid systematic bias.

The notion of "perfect accuracy" is also simplistic in general. If an AI system is being used to screen candidates to hire, deciding how to define accuracy is already a value judgment.
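
A toy numerical example, with entirely fabricated numbers, shows how a single accuracy figure can hide exactly this kind of failure:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000

    # 90% of examples come from group A, 10% from an under-represented group B.
    # The per-group accuracies below are invented for illustration: the model
    # is right 95% of the time on A but only 60% of the time on B.
    group = rng.choice(["A", "B"], size=n, p=[0.9, 0.1])
    correct = np.where(group == "A",
                       rng.random(n) < 0.95,
                       rng.random(n) < 0.60)

    print(f"overall accuracy: {correct.mean():.3f}")   # ~0.91, looks respectable
    for g in ("A", "B"):
        print(f"  accuracy on group {g}: {correct[group == g].mean():.3f}")

The headline number looks fine, yet the model errs roughly eight times as often on the under-represented group; any evaluation that never slices performance by group would miss this entirely.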

In short, if you are a business professional looking to use some form of machine learning, you need to be aware of how bias can manifest itself in practice.

I've described at length how I use Emacs and Org as a project management tool. As part of my process, I frequently use Org as a lab notebook, in which I keep track of various bits of data and record both the code I run and the parameters I used along the way. My workflow requires (1) running code, (2) logging the results, and (3) including my own thoughts and analysis in between, a programming paradigm known more generally as literate programming.

A number of folks on Reddit and irreal.com have pointed out that I don't dive deep enough to really call the content in this post literate programming. Perhaps a more appropriate title would include Literate Scripting; regardless, the content I present here is still an integral part of my Emacs-based workflow.

Org makes it easy to asynchronously execute code in multiple programming languages (and even allows for remote code execution over ssh). For instance, on a recent project I had a few shell scripts, run only occasionally, that looped through data files I was generating on a remote machine and reported statistics about them; Org let me do all of this without ever leaving my notes. In this article, I'll go over a few use-cases that illustrate the utility of using Emacs with Org for coding projects and walk you through some of the functionality I couldn't live without.
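
To give a flavor of what this looks like in practice, here is a hedged sketch of a notebook entry; the heading, file paths, and code are made up for illustration. Pressing C-c C-c inside the source block executes it, and Org inserts a #+RESULTS: block with the output just below.

    * Statistics for today's data files
    Notes and analysis live right next to the code that produced them.

    #+BEGIN_SRC python :results output
    import glob

    # Hypothetical example: summarize whatever files today's run produced.
    files = glob.glob("results/*.csv")
    print(f"{len(files)} data files found")
    #+END_SRC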