The Gestural Interface

Toby Howard

This article first appeared in Personal Computer World magazine, February 1998.

OVER THE LAST FEW YEARS, Futures has reported on a number of promising new ideas for improving the human-computer interface. It's a fast-moving field, and technologies not so long ago considered sci-fi, such as continuous speech recognition, or face and fingerprint analysis, are already becoming mainstream. Now there's a new approach: instead of manipulating keyboards and mice, we shall simply make gestures to our computers. "Hand-waving" will take on a whole new meaning.

At this summer's SIGGRAPH conference in Los Angeles, the most prestigious annual event in the computer graphics world, gesture technology was making a big splash. In the Electric Garden, the showcase for new research projects, several systems based on gesture analysis were on display.

Sony were demonstrating a gestural interface for their PlayStation fighting game 'Tenshindo'. Developed in association with Pasadena's Holoplex company, the system uses video cameras to capture the movements of the two players' bodies. The video images are processed in real time to extract silhouettes which are then matched holographically -- exactly how isn't revealed -- against a set of standard moves which a game character can perform. The closest match is then used to control the character.
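
Exactly how the matching works hasn't been revealed, but the general recipe the description suggests (extract a binary silhouette from each video frame, then pick the closest of a set of stored template poses) can be sketched in a few lines of Python. Everything below is illustrative: the function name and the overlap-based scoring are inventions for the sake of the sketch, not Sony's or Holoplex's actual software.

    import numpy as np

    def closest_move(silhouette, templates):
        """Pick the stored pose that best matches a binary silhouette.
        'templates' maps a move name to a binary image of the same size.
        Purely illustrative; not Sony's or Holoplex's actual matcher."""
        best_name, best_score = None, -1.0
        for name, template in templates.items():
            overlap = np.logical_and(silhouette, template).sum()
            union = np.logical_or(silhouette, template).sum()
            score = overlap / union if union else 0.0   # intersection-over-union
            if score > best_score:
                best_name, best_score = name, score
        return best_name

    # Hypothetical usage, with 64x64 binary masks for each standard move:
    # move = closest_move(player_mask, {"kick": kick_mask, "jump": jump_mask})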

According to those who tried the system at the Electric Garden, the results were amazing: as their clumsy kicks and jumps were instantly transformed into the lightning-fast movements of the game characters, the players felt involved with the game action in a way they'd never experienced before. Although the gesture-controlled PlayStation exists only in prototype form, Sony hopes to have it on the market by early 1998.

Research into gestural interfaces has a long pedigree. The first attempts to capture gestures used a glove, wired up with sensors to detect the orientations of the hand and fingers. First developed in the late 1970s at the University of Illinois, the "dataglove" became commercially available in the mid-1980s as a product from VPL, one of the pioneering Californian Virtual Reality companies. The VPL glove used optical fibres to measure finger bending, and an electromagnetic sensing system for hand orientation.

The "datasuit" soon followed, essentially a pair of instrumented long-johns, which allowed the positions of all the user's limbs to be tracked. (A redesigned version of the dataglove, using conductive ink and ultrasonic position sensing, was later marketed by Mattel as the "PowerGlove", for use with Nintendo video games. Although no longer manufactured, the PowerGlove made quite an impact and is still in widespread use by DIY Virtual Reality enthusiasts.)

But datagloves and datasuits come with an inherent drawback: because they have to be worn on the body, and connected to computers by wires or, more recently, radio links, these devices encumber the user and hinder free expression. The ultimate gestural interface, if it is ever to succeed, must leave the user unencumbered.

Although work on unencumbered gestural interfaces has only recently been making news, its gestation goes back to the late 1960s. Artist and VR pioneer Myron Krueger effectively invented the idea in the mid-1970s with his "Videoplace" installation, in which a participant faces a large video projection screen. Underneath the screen a camera records the participant's moving image, which is processed in real time to extract a silhouette that is then displayed on the screen. Other computer-generated images can be displayed alongside it, and the system can detect interactions between parts of the participant's silhouette and the other on-screen graphics.
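
Stripped of Krueger's dedicated hardware, the interaction test at the heart of Videoplace amounts to checking whether the participant's silhouette overlaps the screen region occupied by each computer-generated object. A minimal Python sketch of that idea, with hypothetical names throughout, might look like this:

    import numpy as np

    def touched_objects(silhouette, objects):
        """Return the names of on-screen objects overlapped by the silhouette.
        'silhouette' is a binary image; 'objects' maps a name to an
        (x, y, width, height) rectangle in the same pixel coordinates."""
        touched = []
        for name, (x, y, w, h) in objects.items():
            if silhouette[y:y + h, x:x + w].any():   # any silhouette pixel inside?
                touched.append(name)
        return touched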

Krueger later extended the idea to create a "Videodesk", which allows users to interact with word processing and drawing programs using only hand and finger movements. Similar experiments have been conducted at Xerox's EuroPARC, resulting in the "Digital Desk". Here, a computer display is projected onto the surface of the desk, and a camera mounted directly above the desk monitors the user's hands and fingers as they point to parts of the image and make control gestures.

Gestural interfaces also have huge potential for manipulating 3D graphics. While it's certainly possible to use a conventional mouse to navigate through a scene, or manipulate the objects within it, it's always awkward. And at SIGGRAPH's Electric Garden, ATR Telecommunications Research Laboratories was demonstrating its multiple-camera approach to recognising palm orientations and finger bending. With their system, an operator can control virtual objects in a 3D space simply by pointing at an object to select it, and then flexing their hands to change its shape.
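
Point-to-select usually comes down to casting a ray from the hand along the pointing direction and choosing the object nearest that ray. The sketch below shows one common way to do it, with made-up names and object centres standing in for real geometry; ATR's recogniser, working from multiple camera views, is of course far more sophisticated.

    import numpy as np

    def pick_object(hand_pos, point_dir, objects):
        """Select the object whose centre lies nearest the pointing ray.
        'objects' maps a name to a 3D centre position; a real system would
        test against each object's full geometry rather than its centre."""
        p = np.asarray(hand_pos, dtype=float)
        d = np.asarray(point_dir, dtype=float)
        d /= np.linalg.norm(d)                     # unit pointing direction
        best_name, best_dist = None, float("inf")
        for name, centre in objects.items():
            v = np.asarray(centre, dtype=float) - p
            t = max(v.dot(d), 0.0)                 # ignore objects behind the hand
            dist = np.linalg.norm(v - t * d)       # perpendicular distance to ray
            if dist < best_dist:
                best_name, best_dist = name, dist
        return best_name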

With the news that future versions of Netscape's Communicator and Microsoft's Internet Explorer will have VRML display and interaction capabilities built directly in, consumers are soon going to demand better interfaces to 3D graphics, and gesture recognition offers at least a partial solution. In fact, products are already starting to appear: the General Reality Company has announced a new wireless dataglove together with Java software for platform-independent gestural control of 3D applications over the Internet.

Only time will tell if gestural interfaces will take off. I rather hope they will: next time your PC freezes up or crashes, wouldn't it be wonderful to make a suitable gesture to it, knowing that for once it might understand exactly what you mean?

Toby Howard teaches at the University of Manchester.