Welcome to
Caches to Caches

This blog is devoted to the broad interests of Gregory J Stein, which include topics such as Numerical Modeling, Web Design, and Robotics, along with a number of my individual hobby projects. If there's any article you would like to see, or something you've been wondering about, be sure to let me know on Twitter.

As the use of machine learning systems grows beyond the academic sphere, one of the more worrying features I have witnessed is a lack of understanding of how machine learning systems should be trained and applied. The lessons the AI community has learned over the last few decades of research are hard-earned, and it should go without saying that those who do not understand the inner workings of a machine learning tool risk having that system fail in often surprising ways.

This advice is not limited to AI. Using any stochastic system without an understanding of when or how it is likely to fail comes with inherent risk.

However, the potential advantages of AI are many, and using machine learning to accelerate your business, whether empowering employees or improving your product, may outweigh potential pitfalls. If you are looking to use machine learning tools, here are a few guidelines you should keep in mind:

  • Establish clear metrics for success.
  • Start with the simplest approach.
  • Ask yourself if machine learning is even necessary.
  • Use both a test and a validation dataset.
  • Understand and mitigate data overfitting.
  • Be wary of bias in your data.
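The fourth guideline deserves a concrete illustration. Below is a minimal sketch of a three-way data split: the validation set is used to tune hyperparameters during development, while the test set is held out and touched only once, for the final evaluation. The function name and split fractions here are my own illustrative choices, not a prescribed recipe.

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle a dataset and split it into train/validation/test subsets.

    Tune your model against the validation set; report results on the
    test set only once, at the very end, to avoid fooling yourself.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(data)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

If validation accuracy keeps improving while test-style held-out performance stalls, that gap is your early warning sign of overfitting.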

Just over two weeks ago, NVIDIA showcased vid2vid, their new technique for video-to-video translation. Their paper shows off a number of different applications, including one particularly striking example in which the researchers automatically convert sketchy outlines of vlog-style videos from YouTube into compellingly realistic videos of people talking to the camera. The results are incredible and really need to be seen to be believed:

When most people hear the term "translation" they think of translating natural language: e.g. translating text or speech from Mandarin to English. Today I want to reinforce the idea that translation can be applied to different types of data beyond language. The vid2vid paper I mentioned above is just the latest and most visually striking example of the transformative power of AI, and modern machine learning is making incredibly rapid progress in this space.

Machine learning is, of course, an incredibly powerful tool for language translation. Recently, researchers from Microsoft achieved human-level translation performance on translating news articles from Mandarin to English.

In the remainder of this article, I will cover:

  • A brief definition of "translation" in the context of AI;
  • An overview of how modern machine learning systems tackle translation;
  • A list of application domains and some influential research for each.

Finding examples of "problematic" AI is relatively easy these days. Microsoft has inadvertently given rise to an unhinged, neo-nazi Twitter Bot while an AI beauty contest judge seems to strongly favor white women. Despite the sensational nature of these examples, they reflect a pervasive problem plaguing many modern AI systems.

Machine learning is designed to discover and exploit patterns in data so as to optimize some notion of performance. Most measures of good performance involve maximizing accuracy, yet optimizing for aggregate accuracy alone is appropriate only when near-perfect accuracy is actually achievable. When a task is difficult enough that the system is prone to errors, AI agents may fail in ways that we, as humans, consider unfair, or may exploit undesirable patterns in the data. Here, I discuss the issue of bias in AI and argue that great care must be taken when training a machine learning system to avoid systematic bias.
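To make the point concrete, here is a tiny sketch showing how a single aggregate accuracy number can hide a large disparity between subgroups. The numbers below are purely illustrative and not drawn from any real system.

```python
# Hypothetical (prediction, label) pairs for two subgroups of a dataset.
# Group A is well represented; group B is a small minority.
group_a = [(1, 1)] * 90 + [(0, 1)] * 10
group_b = [(1, 1)] * 5 + [(0, 1)] * 5

def accuracy(pairs):
    """Fraction of examples for which the prediction matches the label."""
    return sum(pred == label for pred, label in pairs) / len(pairs)

print(f"overall: {accuracy(group_a + group_b):.2f}")  # 0.86 -- looks fine
print(f"group A: {accuracy(group_a):.2f}")            # 0.90
print(f"group B: {accuracy(group_b):.2f}")            # 0.50 -- hidden by the aggregate
```

A system like this could ship on the strength of its overall accuracy while performing no better than a coin flip for group B, which is exactly why per-group evaluation matters.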

The notion of "perfect accuracy" is also simplistic in general. If an AI system is being used to screen candidates to hire, deciding how to define accuracy is already a value judgment.

In short, if you are a business professional looking to use some form of machine learning, you need to be aware of how bias can manifest itself in practice.

I've described at length how I use Emacs and Org as a project management tool. As part of my process, I frequently use Org as a lab notebook, in which I keep track of various bits of data and record both the code I run and the various parameters I use in the process. My workflow requires (1) running code, (2) logging the results, and (3) including my own thoughts and analysis in between, a programming paradigm known more generally as literate programming.

A number of folks on Reddit and irreal.com have pointed out that I don't dive deep enough to really call the content in this post literate programming. Perhaps a more appropriate title would include Literate Scripting; regardless, the content I present here is still an integral part of my Emacs-based workflow.

Org makes it easy to asynchronously execute code for multiple programming languages (and even allows for remote code execution over ssh). For instance, on a recent project of mine I had a few shell scripts that I would occasionally run that would loop through some data files I was generating on a remote machine and return some statistics about them; Org makes it possible for me to do this without having to leave my notes. In this article, I'll go over a few use-cases that illustrate the utility of using Emacs with Org for coding projects and walk you through some of the functionality I couldn't live without.
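As a sketch of what this looks like in practice, here is an Org source block of the kind I describe above. The `user@remote` host and the path are placeholders for your own machine; the `:dir` header argument tells Org to run the block over TRAMP on that remote host, and `:results output` inserts whatever the script prints directly below the block when you execute it with `C-c C-c`.

```org
#+BEGIN_SRC sh :dir /ssh:user@remote:~/data :results output
  # Count the generated data files on the remote machine
  ls *.csv | wc -l
#+END_SRC
```

The results land right next to my notes, so the code, its output, and my analysis all live in the same file.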

After making quick progress during a summer I spent doing research at Sandia National Labs before my senior year of college, I was invited (at the very last minute) to present at a conference one of my mentors was helping to organize. Rather than pay the registration fee, I signed on as a student volunteer. However, while the other 30 students were all pre-assigned specific tasks from a grid, as a late addition, I was not. Instead, I was told to "just make yourself useful," with the expectation that I wouldn't do very much.

Having students staff academic conferences is pretty common practice: they don't have to worry about paying for registration, and they have an opportunity to interact directly with the high-profile researchers organizing the conference.

The reality was entirely different.

Even with so many deep learning papers coming out this year, there were a few publications I felt managed to rise above the rest. Here are the five papers that impacted my mental models the most over the last year. For each, I state the "goal" of the paper, briefly summarize the work, and explain why I found it so interesting.

It's no secret that I've spent longer on the aesthetic of this blog than I have on the content. I first created this website during a two-week-long effort to learn about web development over two years ago. Since then, I've toiled over the design of my posts and have become obsessed with visual design. I consider ways of maximizing the signal-to-noise ratio in everything I do.

Maximizing the signal-to-noise ratio is one of Jean-luc Doumont's universal principles of design, which he outlines in his book, Trees, maps, and theorems.

As I've experimented with the layout of Caches To Caches, I've come to embrace one fundamental rule of effective communication:

Clarity is key.

This sounds simple enough, but you'd be surprised how difficult it is to get right. For example, the last version of this blog was filled with unnecessary clutter—particularly the left sidebar. By drawing the eye away from the margin notes on the right, the sidebar distracted readers from the content without contributing much. Instead, the new sidebar, revealed only in the presence of excess screen space, is isolated by color and avoids causing confusion:

Here you can see the dramatic change in the left sidebar in the new version of the site. Rather than have a table of contents which blends in with the text, it's now drawn with a dark background so as to visually isolate it from the center column. The sidebar quickly disappears as screen size shrinks, so as to prioritize the text.

Here is a collection of guidelines I've come up with to help you emphasize your content and avoid needless distractions.