Are no nudes good news?
This article first appeared in Personal Computer World magazine, September 1997.
IF THERE'S ONE SUBJECT guaranteed to get people hot under the collar, it's censorship of the Web. Most of us will be aware that there is material out there which we would not wish our children to see, and much that we would probably rather not look at ourselves. And most companies do not relish the thought of their employees spending the day engaged in browsing of the sweaty-palmed variety. Systems like NetNanny and CYBERSitter which automatically bar access to designated Web resources are a partial answer, but now software is becoming available which attacks the problem in a quite different way, by automatically analysing images for pornographic content. Automatic flesh detection is here.
Pictures must be very boring if you're a computer. Thanks to our miraculous visual system, humans see images in their entirety, somehow recognising the features in the image effortlessly. Software, however, must scan through the image pixel by pixel, usually working from top to bottom, and left to right, laboriously searching for patterns among the colours. Here's part of one row of pixels from an image I took from the Web, stored as a GIF file. The value of each pixel is the index into the image's colour palette:
7 4 4 10 14 15 15 15 11 5 4 7 1 7 5 7 11 13 15
Could you ever possibly recognise this as part of the Mona Lisa's left eye? Imagine if I were to describe a complete image to you in this fashion -- a huge rectangle of numbers. If you were forbidden to reconstruct the image again by plotting each pixel with the corresponding colour from the palette, how could you determine what the image was? It's an immensely hard problem, and one which has taxed image processing researchers for decades. However, there have recently been some interesting developments.
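To make the idea concrete, here is a small sketch of how a program turns those palette indices back into colours. The palette values below are invented for illustration (a real GIF stores up to 256 red-green-blue entries), but the row of pixels is the one quoted above.

```python
# Illustrative sketch: how software "sees" a palette-based image such as a GIF.
# The palette entries here are invented for demonstration; a real GIF palette
# holds up to 256 (red, green, blue) triples.

palette = {
    1:  (40, 30, 25),     # dark brown
    4:  (70, 55, 45),
    5:  (85, 65, 50),
    7:  (110, 85, 65),
    10: (150, 120, 95),
    11: (165, 135, 105),
    13: (195, 165, 130),
    14: (210, 180, 145),
    15: (225, 195, 160),  # pale flesh tone
}

# The row of pixel values quoted in the article: each is a palette index.
row = [7, 4, 4, 10, 14, 15, 15, 15, 11, 5, 4, 7, 1, 7, 5, 7, 11, 13, 15]

# To reconstruct the colours, look each index up in the palette.
colours = [palette[p] for p in row]
print(colours[0])   # the RGB triple for the first pixel: (110, 85, 65)
```

Plotting each triple at the right position would redraw the image; forbidden to do that, the software is left staring at the bare numbers.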
The Banbury-based company Microtrope recently released a program called ImageCensor, which it claims can scan an arbitrary image displayed on a PC and determine whether or not it is pornographic. If it thinks it is, the program can store a thumbnail of the screen to a log file, together with the time and date, sound an audible alarm, and lock the computer until released by an administrator's password.
Having spent a few days evaluating ImageCensor, viewing many images from the Web (purely in the interests of research, you understand) I'm impressed by ImageCensor's performance. It does a very good job of spotting pornographic images. Microtrope's Philip Harris is understandably reluctant to describe in detail the algorithms used in ImageCensor, but he did confirm my suspicion that the program works by grabbing all the screen pixels every so often and analysing the overall colour balance of the image, looking for tones that match those of skin. It sounds like a remarkably simple approach, but it does seem to work very well. There must be more to it than that, because it rarely decided that an innocent head-and-shoulders portrait was pornographic, for example, but when faced with the real Mr and/or Ms McCoy in action, it got it right almost every time.
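Since Microtrope won't reveal its algorithms, the following is only my guess at the colour-balance idea, reduced to a minimal sketch: classify each pixel as skin-toned or not using a crude rule of thumb, then flag the image if the skin-toned proportion passes a threshold. The thresholds and the skin rule here are invented for illustration, not Microtrope's.

```python
# A minimal sketch of the colour-balance approach (my guess -- Microtrope's
# actual algorithm is unpublished). An image is a list of (r, g, b) pixels;
# we count those falling in a crude "skin" range and flag the image if the
# proportion exceeds a threshold.

def looks_like_skin(r, g, b):
    # Rule of thumb: skin tones tend to have red dominant over green, and
    # green over blue, in a mid-to-bright range. Entirely heuristic.
    return r > 95 and g > 40 and b > 20 and r > g > b and (r - b) > 15

def flag_image(pixels, threshold=0.4):
    skin = sum(1 for (r, g, b) in pixels if looks_like_skin(r, g, b))
    return skin / len(pixels) > threshold

# Toy example: a 10-pixel "image", 6 of them skin-toned.
pixels = [(220, 170, 140)] * 6 + [(30, 60, 200)] * 4
print(flag_image(pixels))   # True: 60% of pixels are skin-toned
```

A scheme this naive would of course also flag an innocent portrait, which is why ImageCensor's resistance to that mistake suggests something cleverer under the bonnet.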
Microtrope is not alone in developing systems which try to find out what's in an image. Margaret Fleck and David Forsyth, respectively at the University of Iowa and the University of California at Berkeley, have developed a system they call "The Naked People Skin Finder". Despite the implausibility of the name, this is serious stuff, and Fleck and Forsyth have had their work featured at international computer conferences.
Their approach is more complex than Microtrope's, in that they use a two-stage system: first they use a series of filters to analyse the colour balance of an image, trying to identify areas which have the appropriate tones and shininess for skin; then they attempt to match groups of such areas against known human limb articulations, stored in a pre-computed database. The filtering and matching algorithms are sophisticated, and require significant processing power. You can find the details at www.cs.uiowa.edu/~mfleck.
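One step sits between the two stages: before any limb matching can happen, the individual skin-classified pixels must be grouped into contiguous regions. Fleck and Forsyth's real pipeline is far more elaborate, but the grouping step is standard image processing, and can be sketched as a flood fill over a boolean skin mask:

```python
# Sketch of the intermediate grouping step: given a 2-D mask of booleans
# (True = pixel passed the skin filter), collect 4-connected skin pixels
# into regions. The geometric limb matching that follows in Fleck and
# Forsyth's system is beyond a sketch like this.

def skin_regions(mask):
    """Return a list of regions, each a set of (row, col) coordinates."""
    rows, cols = len(mask), len(mask[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                region, stack = set(), [(r, c)]
                while stack:
                    y, x = stack.pop()
                    if (y, x) in seen or not (0 <= y < rows and 0 <= x < cols):
                        continue
                    if not mask[y][x]:
                        continue
                    seen.add((y, x))
                    region.add((y, x))
                    stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
                regions.append(region)
    return regions

mask = [[True,  True,  False],
        [False, False, False],
        [False, True,  True]]
print(len(skin_regions(mask)))   # 2 separate skin regions
```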
While the recent hullabaloo about Web pornography has brought research into automatic picture recognition into the limelight, people have been working for some time on the general problem of querying databases of images. One established system is IBM's QBIC, developed in the early 1990s. QBIC lets a user set up a query by sketching where edges should roughly appear in the image, and what the overall colour composition should be, as percentages of red, green and blue. QBIC can then search a database for the best matches.
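The colour-composition half of such a query is easy to sketch. The following is in the spirit of QBIC rather than IBM's actual algorithm: summarise each image as the fractions of brightness contributed by its red, green and blue channels, then rank the database by distance from the requested mix.

```python
# A sketch of colour-composition matching in the spirit of QBIC (not IBM's
# actual algorithm): summarise each image as its red/green/blue fractions
# and rank database images by squared distance from the query.

def composition(pixels):
    # Fraction of total brightness contributed by each channel.
    total = sum(r + g + b for (r, g, b) in pixels) or 1
    return (sum(p[0] for p in pixels) / total,
            sum(p[1] for p in pixels) / total,
            sum(p[2] for p in pixels) / total)

def best_match(query, database):
    # query: desired (red, green, blue) fractions, e.g. (0.6, 0.3, 0.1).
    # database: dict mapping image name to its pixel list.
    def dist(comp):
        return sum((a - b) ** 2 for a, b in zip(query, comp))
    return min(database, key=lambda name: dist(composition(database[name])))

db = {
    "sunset": [(200, 80, 40)] * 10,   # predominantly red
    "forest": [(40, 180, 60)] * 10,   # predominantly green
}
print(best_match((0.6, 0.3, 0.1), db))   # "sunset"
```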
A more recent approach is "Multiresolution Image Querying", based on the technique of wavelet analysis for image encoding and compression (see Futures, PCW May 1997). Numinous Technologies Inc, based in Seattle, are developing such a system. At numinous.com you can find an interactive Java applet demo which lets you do a rough freehand sketch, and then searches for a match in an image database. The results are fast, and surprisingly accurate. I see in the Numinous system the seeds of the Web search interface of the future: freehand sketches combined with natural language queries. It's an exciting prospect.
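The building block of such systems is the wavelet transform itself. A one-dimensional Haar transform (the simplest wavelet, shown here purely to illustrate the idea) repeatedly replaces pairs of values by their average and their difference: the coarse averages capture a rough sketch of the signal, while the fine details can be truncated for compression or fast matching. Image systems apply the same trick along both rows and columns.

```python
# The simplest wavelet decomposition: the 1-D Haar transform. Each pass
# replaces pairs of values with their average (coarse sketch) and half
# their difference (detail); passes repeat on the averages until one
# overall average remains.

def haar(values):
    # values: a list whose length is a power of two.
    out = list(values)
    n = len(out)
    while n > 1:
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(n // 2)]
        dets = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(n // 2)]
        out[:n] = avgs + dets
        n //= 2
    return out

signal = [9, 7, 3, 5]
print(haar(signal))   # [6.0, 2.0, 1.0, -1.0]: overall average, then details
```

Comparing just the largest wavelet coefficients of a rough sketch against those of each database image is what makes the matching both fast and tolerant of imprecise drawing.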
However, and I assure you that I am not making this up for journalistic effect, as I was writing this article on my PC, with ImageCensor running silently in the background, my soundcard suddenly hooted. ImageCensor had singled out the Numinous pages as pornographic. While they do contain some shades of brown which might resemble a deep suntan, it does seem somewhat ironic. Could Numinous sue for loss of business? Or for defamation?
As the marriage of PC and TV moves inexorably closer, the Web will become as mainstream as the daily newspaper, and perhaps even replace it. We shall have to decide how much trust we wish to place in programs that claim to discern, on our behalf, the "meaning" of information. We might already be standing at the top of a slippery slope, down which we slide at our peril.
Toby Howard teaches at the University of Manchester.