What Gladwell is Missing: Institutional Memory

Malcolm Gladwell's recent article, ostensibly about how Davids can beat Goliaths, featured great praise for the full-court press as a method for less-skilled teams to beat more-skilled ones. He writes:

In the world of basketball, there is one story after another like this about legendary games where David used the full-court press to beat Goliath. Yet the puzzle of the press is that it has never become popular. People look at upsets like Fordham over UMass and call them flukes. Basketball sages point out that the press can be beaten by a well-coached team with adept ball handlers and astute passers—and that is true... Playing insurgent basketball did not guarantee victory. It was simply the best chance an underdog had of beating Goliath.

After his piece was rightly criticized all over the web, ESPN published a long and interesting email conversation between him and Bill Simmons that addressed the criticism. In it, Gladwell defended his column:

After my piece ran in The New Yorker, one of the most common responses I got was people saying, well, the reason more people don't use the press is that it can be beaten with a well-coached team and a good point guard. That is (A) absolutely true and (B) beside the point. The press doesn't guarantee victory. It simply represents the underdog's best chance of victory… I went to see a Lakers-Warriors game earlier this season, and it was abundantly clear after five minutes that the Warriors' chances of winning were, oh, no better than 10 percent. Why wouldn't you have a special squad of trained pressers come in for five minutes a half and press Kobe and Fisher?… Best case is that you rattle the Lakers and force a half-dozen extra turnovers that turn out to be crucial. And if you lose, so what? You were going to lose anyway.

Although lots of people have responded to this rebuttal, I haven't seen anyone mention what I consider to be an important reason that many teams have chosen not to implement full-court presses: Institutional Memory.

Let's imagine that the Warriors had, in fact, put on the press, and that it had worked. The Warriors won against the vaunted Lakers! How would the team respond? They would press more. When that worked to win some more games, they would probably trade some of their players who were less well suited to the press to make room for some younger, faster, fitter players. They would win a few more games than they had been before. Everything's going well, right?

Then they hit a snag: no team has won the NBA Championship by pressing. The Warriors could optimize like crazy for the press, but they'd be training and working and trading and drafting to be, at best, a pretty good team. But no NBA team wants to be a pretty good team; they all want to win championships.

The mental exercise reveals what Gladwell has missed: the Warriors aren't playing to win that game, or even the most games that season. They're playing to try to win an NBA championship. To do so, they need to spend years building up the institutional memory of how to win games in the traditional way, so that they can eventually beat every team rather than just win a few more games than they would have otherwise.

All of this leads to a simple admonishment when analyzing organizations: don't expect that the organization is optimizing for what you think they're optimizing for. And when an organization seems to be acting in ways that are surprising to you, look for metrics that may be more important to them than the ones you expect.

May 31, 2009

The Bill Mill NCAA Bracket Randomizer

Short version: check out the bracket randomizer I wrote.

The Long Version

Each year, when the NCAA basketball tournament comes around, I end up in four or five pools, with a separate bracket filled out for each. I love the games, and I love having teams to root for, but I really hate the process of guessing to fill out my brackets. I inevitably pick too many upsets, just because I want to have fun rooting for underdogs; instead I end up bored after the first two rounds.

This year, I thought I could write some software to help me pick out my brackets. If I let the computer pick reasonably but randomly for each pool, I figure that I stand a better chance of having one decent bracket instead of the assortment of crappy ones I normally end up with.

So the last two nights, I wrote myself a bracket randomizer; just push the "randomize" button at the top and watch it go.

In order to pick what team will win a given game, it first calculates the chance each team will win by plugging Ken Pomeroy's ratings into the log5 formula. Then it picks a random number and compares it to the probability of the favorite winning; if the number is lower than that, it advances the favorite. Otherwise, it advances the underdog. Rinse and repeat, and you should have a reasonable random bracket for the whole tournament.
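The pick described above can be sketched in a few lines. This is only an illustration in JavaScript, not the randomizer's own code (which lives in a Python file); the names `log5` and `pickWinner`, the shape of the team objects, and the injectable `rng` parameter are all assumptions made for the example. The log5 formula estimates the chance that a team rated `a` beats a team rated `b`, where both ratings are win probabilities between 0 and 1.

```javascript
// log5: chance that a team with win-rating a beats a team with win-rating b.
// Both ratings are assumed to lie strictly between 0 and 1.
function log5(a, b) {
  return (a - a * b) / (a + b - 2 * a * b);
}

// Hypothetical team shape: { name: "...", rating: 0.93 }.
// rng defaults to Math.random; pass a fixed function to make picks repeatable.
function pickWinner(teamA, teamB, rng) {
  rng = rng || Math.random;
  var favorite = teamA.rating >= teamB.rating ? teamA : teamB;
  var underdog = favorite === teamA ? teamB : teamA;
  var pFavorite = log5(favorite.rating, underdog.rating);
  // Advance the favorite when the random draw falls below its win probability.
  return rng() < pFavorite ? favorite : underdog;
}
```

Two evenly rated teams come out at 50/50, and the favorite's edge grows as the gap between the ratings widens, which is what keeps the random bracket "reasonable".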

The Output

Next to each team in the bracket, you'll see three numbers in parentheses. These numbers represent, respectively, the team's Pythagorean rating, adjusted offensive efficiency, and adjusted defensive efficiency.

If that's Greek to you (groan), go check out Ken's explanation of what that means.

The color of each team, once you've randomized, represents their odds of winning. Brighter green is more of a favorite, deeper red more of an underdog. It should update the colors if you manually change the teams, but it won't; I just didn't have time to get everything done that I wanted to. Similarly, it won't update future games if you change the winner of an early one.

The Code

The surprisingly difficult part of this project was creating a simple HTML bracket that looked reasonable and allowed you to click to advance a team. I didn't get everything into the page that I wanted to, simply because I spent so much time just getting that done. (Keep in mind we're talking about a 2-night hack here.)

The code to generate the bracket is contained in one super-ugly python file.

If you've got ideas for stuff to add, or want to generate a cooler looking bracket, or just check out the code, you can go get it at github. Feel free to fork and enjoy!

Mar 18, 2009

Image Programming in JavaScript: Converting to Monochrome

In part 1 of this series, we looked at how each pixel of an image is composed of three parts (red, green, and blue) and showed how to make histograms to give a summary of each. Towards the end, I showed that they can be averaged in different ways to create a single histogram. In this article, we're going to look at the idea of mixing colors in more depth and show how we can use it to turn color images into monochrome in a variety of ways.

Some Terminology

In the last article, we talked about how each pixel is composed of a red, a green, and a blue component, and how each of those has a value between 0 and 255. What I didn't tell you then was that it's often useful to consider just the red components of pixels in an image, just the green components, or just the blue components.

A channel of an image I1 is an image I2 composed entirely of one component of I1

So when we talk about the red channel of an image, we're talking about the image that results from simply dropping the green and blue components of each of its pixels. It's best to see it in action:

The first image above can be entirely reconstructed from the last three monochrome images, using the first for the red channel, the second for the green channel, and the third for the blue. By the end of the article, we'll show the code for the demo that was used to convert the color image above into each of the monochrome ones.
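As a rough sketch of what "dropping the other components" means in code, assume the pixel data is in the flat RGBA layout that a `<canvas>` uses for `ImageData.data` (four values per pixel: red, green, blue, alpha). The helper name `extractChannel` is made up for this example and is not part of the demo.

```javascript
// rgba: flat array of [r, g, b, a, r, g, b, a, ...] values.
// offset: 0 for red, 1 for green, 2 for blue.
function extractChannel(rgba, offset) {
  var channel = [];
  for (var i = 0; i < rgba.length; i += 4) {
    channel.push(rgba[i + offset]);
  }
  return channel;
}

// Two pixels: one orange-ish, one teal-ish.
var pixels = [255, 128, 0, 255,   0, 128, 128, 255];
var red = extractChannel(pixels, 0); // [255, 0]
```

Each channel ends up with one value per pixel, which is exactly the shape of a monochrome image.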

Monochrome in Black and White

Before we discuss converting a color image into monochrome, it will help to understand a bit more about what we mean by monochrome.

Conceptually, we can consider each pixel of a monochrome image to consist of just one byte of information, instead of three bytes for red, green and blue as in a color image. This byte represents the brightness of a pixel, where 0 is black (no brightness) and 255 is white (full brightness). The values in between represent the grays, from the dark low numbers to the light high numbers.

Since monochrome images are composed of values other than (r, g, b), let's update the working definition of a digital image that we used last time:

A Digital Image is a sequence of pixels, each of which is composed of one or more channels. The value of each channel represents its strength in that pixel.

So a monochrome image consists of only one channel, that representing the brightness of each pixel.

The Real World Intrudes

We've got a nice new mental model of a monochrome image, but the real world, as it tends to do, will complicate matters. We're working with color images on the <canvas> element, which means that each pixel needs to be composed of red, green, and blue. In order to display a monochrome image on a <canvas>, we'll need to convert each pixel from one channel, brightness, to three.

A convenient property of RGB images makes this transformation easy: any pixel made up of equal parts red, green, and blue will be gray. Therefore, to convert an image from monochrome to RGB, we simply use the brightness channel of each pixel in the monochrome image as all three channels in the RGB image.
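Here is a small sketch of that transformation, assuming the target is the flat RGBA layout used by canvas `ImageData` and that every pixel should be fully opaque. The function name `monoToRgba` is hypothetical; the demo code may do this differently.

```javascript
// brightness: array with one 0-255 value per pixel.
// Returns a flat RGBA array with alpha set to fully opaque.
function monoToRgba(brightness) {
  var rgba = new Uint8ClampedArray(brightness.length * 4);
  for (var i = 0; i < brightness.length; i++) {
    var v = brightness[i];
    rgba[i * 4] = v;       // red
    rgba[i * 4 + 1] = v;   // green
    rgba[i * 4 + 2] = v;   // blue
    rgba[i * 4 + 3] = 255; // alpha
  }
  return rgba;
}
```

Because red, green, and blue are always equal, every output pixel is a shade of gray.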

Mixing it Up

Reversing the process, to create monochrome images from color ones, offers us a few more options. For every pixel, our task is to take three channels and condense them into one. The obvious thing to do is to simply average them, and indeed, this is exactly what happens by default in most photo editing programs when you convert an image to monochrome.

There's no reason that we need to limit ourselves to that transformation, though. We should consider other options because the average often produces flat, uninteresting pictures. Instead, we can mix the channels any way we want to produce the most interesting result possible.

I've put up a demo where you can play with the mixture of colors. Simply put numbers into each of the three inputs at the bottom of the page and hit "desaturate" to create a monochrome image where the channels have been weighted proportionally to the numbers you've entered. The histogram below the desaturate button will show you the effects of your mix.

Play with the values to find the result you find most pleasing, and notice how different the images that result from each mix can be.

Show Me The Goods

Here are the important parts of the code in the demo, with the fiddly bits stripped out:

function desaturate(rweight, gweight, bweight) {
  //normalize the color weights
  var scale = 1 / (rweight + gweight + bweight);
  rweight *= scale;
  gweight *= scale;
  bweight *= scale;

  each_pixel(image_data, function(r, g, b) {
    var brightness = r * rweight + g * gweight + b * bweight;

    //replace the r, g, and b values of the pixel with "brightness"
    return [brightness, brightness, brightness];
  });
}

//red channel only:
desaturate(1, 0, 0);

//green channel only:
desaturate(0, 1, 0);

//blue channel only:
desaturate(0, 0, 1);

//my favorite mix:
desaturate(5, 1, 4);

First we convert the color weights into fractions that sum to 1, then we multiply each component of each pixel by its weight and add the results to arrive at a new value for the pixel to take. Finally, we use the brightness value we've calculated as the value for all three channels to render the image in grayscale.
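To make the arithmetic concrete, here is a worked example on a single pixel. With the weights 5, 1, and 4, the normalized weights are 0.5, 0.1, and 0.4, so a pixel of (200, 100, 50) mixes to 100 + 10 + 20 = 130. The helper `mixPixel` below is a hypothetical standalone version of the same calculation; unlike the demo code, it rounds the result to a whole number and assumes the weights do not sum to zero.

```javascript
// Mix one pixel's channels into a single brightness value.
// Dividing by the weight total is equivalent to normalizing the weights first.
function mixPixel(r, g, b, rweight, gweight, bweight) {
  var total = rweight + gweight + bweight;
  var brightness = (r * rweight + g * gweight + b * bweight) / total;
  return Math.round(brightness);
}

var favorite = mixPixel(200, 100, 50, 5, 1, 4); // 130
var redOnly = mixPixel(200, 100, 50, 1, 0, 0);  // 200
```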

Conclusion

In this article, we've extended our definition of an image to include images with channels other than just red, green, and blue. We learned how to convert monochrome images for display in RGB, and looked at a couple of different ways to convert back from RGB to monochrome. Finally, we looked at code to do the conversions we'd spent the whole article talking about.

Hopefully you have a pretty good grasp on what an image on a <canvas> is by now. If you want to download the code for the demos I've shown so far and play with it, go ahead and check it out at github.

Any comments, criticisms, or thoughts on what you'd like to see me write about, you can let me know by sending me an email.

Mar 05, 2009

Image Programming in JavaScript: The Histogram

Recently, I've spent a lot of my time taking photographs. When I get home from taking pictures, I immediately pop open Lightroom to import the images, pick out my favorites, do some adjustments on them, and publish them.

As a programmer, though, it was bothering me that I didn't know exactly what was happening behind the scenes when I adjusted my images. What's really happening when I adjust the "saturation" slider for a photo? How does sharpening work, and what does its "amount" mean?

Sure, I could go read books to find out, but there's only one way to really know what's happening: write a program to do it. This article represents part 1 of what will hopefully become a series on programming the digital image with JavaScript.

Wait, JavaScript?

Sure thing! With the recent adoption of the <canvas> element into modern browsers, JavaScript has gained the ability to load, display, and manipulate images at the pixel level. In addition, Jacob Seidelin recently created the Pixastic library to do the heavy DOM and canvas lifting for us.

As much as possible, I'll be using Pixastic as a base because JavaScript is a reasonably enjoyable programming language, the framework is new and simple, and I can show neat demos right in the browser. I'll also be using jQuery, because it makes writing cross-browser JavaScript much more pleasant.

Jumping Right In: What Is a Digital Image?

In the spirit of YAGNI, we're going to accept a superficial answer to this question, at least for now. We're basically going to pretend that all images are in color and represented in the same color space in the same way. Our provisional definition of a digital image is this:

A Digital Image is a sequence of pixels, each of which is represented by a 3-byte tuple (red, green, blue). The value of each element represents the strength of that color in that pixel.

This means that each element of a pixel has a value between 0 and 255, where 0 represents absence and 255 full strength. For example, the tuple (255, 0, 0) would represent a pure red pixel, the tuple (0, 255, 0) a pure green one, and (0, 0, 255) a pure blue one.

Since we're mixing light [1], the colors are additive, which means that they get lighter when mixed. Thus (255, 255, 0) represents a mixture of red and green which produces yellow, (0, 255, 255) represents green and blue combined to form cyan, and so on as you can see in the chart to the right.

You can think of it as if you're in a dark room shining colored flashlights on a wall; if you don't shine any lights, the wall remains black. If you shine all three colors on it, you get white. Thus, it makes sense that (0, 0, 0) represents black and (255, 255, 255) represents white.

OK, I get it. So what's a Histogram?

There are of course lots of colors in between the pure ones I talked about above, represented by all the possible color tuples with values between 0 and 255. In order to understand an image at a glance, it's often helpful to see just how often each color occurs in that image. The histogram allows us to do just that.

To the right is the basic schematic of a histogram. The y-axis represents the frequency of each color value, and the values themselves run along the x-axis. The left side of the histogram shows darker colors and the right side lighter.

The histogram of an image is a chart of how often each possible value or range of values for a color occurs in that image

Below is an image next to the histograms for its red, green, and blue values, respectively.

We can see that there are a lot of light blues, presumably in the sky, a lot of midrange reds and greens, and not a whole lot of dark colors, though there is a spike at pure black. I won't go over what a histogram means for your photography; you should read what a better photographer has to say about that.

On To The Source

There are 2 major steps in creating histograms: gathering the data and drawing the histogram. To gather the data, we'll initialize three arrays with 256 slots, one array slot for each of the color values. Then we'll just loop through each pixel in the image and add one in the appropriate histogram slot for each color. That's it!

//build a 256-slot array with every slot set to default_value
function array256(default_value) {
  var arr = [];
  for (var i=0; i<256; i++) { arr[i] = default_value; }
  return arr;
}

var rvals = array256(0);
var gvals = array256(0);
var bvals = array256(0);

each_pixel(image_data, function(r, g, b) {
  rvals[r]++;
  gvals[g]++;
  bvals[b]++;
});

Where each_pixel is simply a function that loops through the image and passes the red, green, and blue value of each pixel of its first argument to the function passed as its second argument. What we have at the end of this code is three arrays, each containing the count of each possible value of one color in the image.
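The article doesn't show each_pixel itself, so the following is only a guess at how such a helper could look, not the demo's actual implementation. It assumes the image data has a flat RGBA `data` array like a canvas `ImageData` object, and that a callback may return an `[r, g, b]` array to overwrite the pixel (as the desaturate example in the follow-up article does). The name `eachPixelSketch` is hypothetical.

```javascript
// imageData: any object with a flat RGBA `data` array, such as a canvas ImageData.
// callback(r, g, b) may return [r, g, b] to overwrite the pixel in place.
function eachPixelSketch(imageData, callback) {
  var data = imageData.data;
  for (var i = 0; i < data.length; i += 4) {
    var result = callback(data[i], data[i + 1], data[i + 2]);
    if (result) {
      data[i] = result[0];
      data[i + 1] = result[1];
      data[i + 2] = result[2];
    }
  }
}
```

Skipping the fourth value of each pixel leaves the alpha channel untouched, so transparency is preserved whether the callback only counts values or rewrites them.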

To simplify the display of these histograms, we'll draw on a <canvas> 256 pixels wide, so that each possible color value occupies one pixel. Since each histogram gets only 100 pixels of height, and any histogram value could be greater than 100, we'll scale each value as the percentage of the maximum value in the histogram.

//get a reference to the canvas to draw on
var ctx = $("#colorhistcanvas")[0].getContext("2d");
var rmax = Math.max.apply(null, rvals);
var bmax = Math.max.apply(null, bvals);
var gmax = Math.max.apply(null, gvals);

function colorbars(max, vals, color, y) {
  ctx.fillStyle = color;
  jQuery.each(vals, function(i,x) {
    var pct = (vals[i] / max) * 100;
    ctx.fillRect(i, y, 1, -Math.round(pct));
  });
}

colorbars(rmax, rvals, "rgb(255,0,0)", 100);
colorbars(gmax, gvals, "rgb(0,255,0)", 200);
colorbars(bmax, bvals, "rgb(0,0,255)", 300);

You can see this code in action at the top half of the histogram demo page. Note that the histograms on that page are being generated by JavaScript when you load the page, so you can look into the source and see exactly how the process works.

A Mean Feat

Most of the time, the three histograms are more information than we need. Instead, we want to be able to tell at a glance whether we've overexposed or underexposed the shot, and a single histogram can give us all the information we need. In this case, all we need is an average of the three histograms.

The obvious way to average the three histograms is to weight them all equally, sum each value, and divide by three. You'll see this histogram under "Average" on the demo page.

However, not all colors appear equally bright to human eyes, so the equally weighted histogram is commonly replaced with one more heavily weighted towards green, which appears brightest. A commonly given figure is 30% red, 59% green, and 11% blue, the results of which you can see in the "weighted average" histogram on the demo.
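Both averages can be expressed with one helper that combines the three per-channel histograms slot by slot. This is a sketch under an assumption: it averages the histogram counts directly, whereas the demo might instead compute a brightness for each pixel and build the histogram from that. The function name `combineHistograms` is made up for the example, and the tiny two-slot arrays stand in for the real 256-slot ones.

```javascript
// Combine three equally sized histograms using the given channel weights.
// The weights are normalized by their total, so 30/59/11 and 0.30/0.59/0.11 are equivalent.
function combineHistograms(rvals, gvals, bvals, wr, wg, wb) {
  var total = wr + wg + wb;
  var combined = [];
  for (var i = 0; i < rvals.length; i++) {
    combined[i] = (rvals[i] * wr + gvals[i] * wg + bvals[i] * wb) / total;
  }
  return combined;
}

// Equal weighting: a plain average of the three counts in each slot.
var average = combineHistograms([3, 0], [6, 0], [9, 3], 1, 1, 1); // [6, 1]

// Green-heavy weighting with the commonly quoted 30/59/11 split.
var weighted = combineHistograms([3, 0], [6, 0], [9, 3], 30, 59, 11);
```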

Conclusion

Hopefully, this article lays a solid base on which I can begin building up to show more complex and interesting image transformations with JavaScript and Pixastic. It should have given you a basic understanding of how an image is formed and how to build several different types of histograms from it. For a more detailed understanding, I encourage you to study the code given in the demo page; it should all be pretty simple.

If you have any questions or comments, please feel free to drop me an email.

1: As opposed to pigment, which is governed by subtractive color

Feb 26, 2009

Cognitive Bias

A few months ago, I was fortunate enough to be asked to give a presentation to Ignite Baltimore. At the time, I had just discovered the excellent Overcoming Bias, and I wanted to try my hand at showing some basic cognitive biases. I used the story of cholera outbreaks in London in the 1850s, as told by The Ghost Map, to try and describe a few of the cognitive biases that affect us in our daily lives:

Feb 24, 2009
