3D Modelling and Technological Alchemy

Toby Howard

This article first appeared in the Times Higher Education Supplement, June 1997.

IT'S USUALLY GRATIFYING when undergraduates come to the front at the conclusion of a lecture, but recently I found I had a small-scale rebellion on my hands. I'd been describing how the computer-generated creatures in Jurassic Park were created, but some sceptical students didn't believe me. They suggested I'd got my video clips mixed up, and that what I claimed to be huge meshes of textured polygons were in fact giant hydraulic robots with rubber skins. But I hadn't got my videos mixed up: the modelling and the computer graphics really were that convincing. From the millions of pixels, life -- of a sort -- had emerged.

As computer graphics prepares to enter its fifth decade, it's been suggested that all the fundamental problems have essentially been solved. If you've seen Jurassic Park, or almost anything from Hollywood recently, you may agree. If a computer can recreate those prehistoric creatures so realistically, and make them blend so seamlessly with the real world of the human actors, what more is there to say?

Traditionally, creating images using computers has involved a fusion of three quite separate processes: "modelling", "viewing" and "rendering". Modelling is the process of describing what should be in the picture (the shape of a pterodactyl's skeleton, for example), and usually involves constructing precise geometrical descriptions. The "viewing" process then applies the laws of perspective in an attempt to bridge the dimensional gap that arises from using a flat, 2D, display to draw a 3D object. Finally, rendering determines how the geometry is actually displayed (the texture and gloss of reptilian skin in moonlit rain, perhaps). Much computer graphics research over the past three decades has concentrated on rendering, with astounding results. Any modern PC can now create synthetic images which are almost indistinguishable from photographs of the real world.
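
To give a flavour of the mathematics, here is a minimal sketch, in Python, of the perspective calculation at the heart of "viewing". The pinhole-camera set-up and the names are purely illustrative, not how any particular graphics system does it.

```python
# A minimal sketch of the "viewing" step: perspective projection of a
# 3D point onto a 2D image plane. Assumes (illustratively) a pinhole
# camera at the origin, looking along the negative z-axis.

def project(point, focal_length=1.0):
    """Map a 3D point (x, y, z) to 2D screen coordinates."""
    x, y, z = point
    if z >= 0:
        raise ValueError("point is behind the camera")
    # Similar triangles: distant points shrink towards the centre.
    scale = focal_length / -z
    return (x * scale, y * scale)

# The same corner of an object, moved further away, lands nearer the
# centre of the picture -- which is all perspective is.
print(project((1.0, 1.0, -2.0)))   # (0.5, 0.5)
print(project((1.0, 1.0, -10.0)))  # (0.1, 0.1)
```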

Research into modelling has often played second fiddle to interest in rendering. But the ubiquity of graphics software and the increasing power of computers are now leading to renewed interest in creating and manipulating models, and in particular three-dimensional models of real-world objects.

The applications of 3D computer modelling are legion, bounded only by your imagination and, of course, your budget. But, it seems, everybody's doing it: archaeologists are arranging virtual field trips; doctors are rehearsing surgery with virtual body-parts; engineers are experimenting with virtual wind tunnels, all at a fraction of the cost of the real thing, and without anything physical to construct or damage.

3D modelling is a three-stage process: first, obtain your data and get it into the computer; second, use the computer to interact with it, just as if you were manipulating a real object, but without annoyances like gravity and material strength; and third, output or store the modified data. Or, the data might subsequently be made 'real' again, by automatically fabricating an object using, for example, a numerically-controlled milling machine.

This is fine in principle, but there is a major problem. Using current methods, creating 3D models is an extremely time-consuming, unreliable, and labour-intensive business. Models are often made of huge numbers of polygons, usually triangles, linked together into a mesh, rather like chicken-wire, and capturing fine geometrical detail can require an enormous mesh. For example, Viewpoint Datalabs, one of the leading suppliers of off-the-shelf 3D model data, will sell you a detailed model of a bee (with hair) made from 129,802 polygons (without hair, it's 44,036 polygons).
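
To make the idea concrete, here is a minimal sketch of how such a mesh is typically stored: a list of shared vertices, plus triangles given as triples of indices into that list. The tetrahedron is purely illustrative; a model like Viewpoint's bee would hold tens of thousands of such triangles.

```python
# A polygon mesh as it is commonly stored: shared vertices, and
# triangles that refer to them by index.

vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]

triangles = [
    (0, 1, 2),
    (0, 1, 3),
    (0, 2, 3),
    (1, 2, 3),
]

# Sharing vertices is what keeps the "chicken-wire" consistent: moving
# one vertex moves every triangle that references it.
vertices[3] = (0.0, 0.0, 2.0)
```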

Marshalling all these tiny shapes together to make the overall object is a daunting task, and there are a number of ways to go about it. If your model is very simple, you might be able to do it by hand, sketching out shapes on graph paper and reading off the coordinates. This is hard work. A more practical solution is to use a Computer Aided Design (CAD) package, of which there are many hundreds available. The best systems, such as AutoCAD, can greatly simplify the process of constructing models. However, the dimensionality problem remains: you're trying to build solid 3D objects, but can only ever see their flat images on your screen.

If you have an existing object whose geometry you wish to capture, a quite different approach is to "scan" it, like the Star Trek transporter scans its passengers, and read off a stream of numbers that describe its shape. Machines that do this can be very effective, and capturing the shape of 3D objects using laser scanners is an established technology. Cyberware, a leading US modelling company, sells turnkey systems, where the object is placed on a rotating platform, and illuminated by a low-power laser scanned rapidly across it. Two video cameras record the spots of laser light visible on the object, rather like following the path traced out by twirling sparklers on Guy Fawkes night, and software digitises the images to extract the shape information. Laser scanning allows the automatic capture of complex geometry to reasonably high resolution, and has wide application. Such a system can scan an entire human body to a resolution of 2 mm in 17 seconds, and marine biologists at Purdue University have used this technique to accurately measure fish skeletons to distinguish between species.
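
The geometry at the heart of such scanners is simple triangulation. Here is a minimal sketch of the principle, assuming, purely for illustration, that the laser and a single camera sit a known baseline apart and that both viewing angles to the bright spot can be measured; real systems such as Cyberware's use more elaborate arrangements.

```python
import math

# Triangulation: the laser sits at the origin, the camera at
# x = baseline, and each sees the bright spot on the object at a
# measurable angle from the baseline. Plane trigonometry does the rest.

def triangulate(baseline, laser_angle, camera_angle):
    """Return the (x, y) position of the laser spot (angles in radians)."""
    # The third angle of the triangle, then the law of sines.
    spot_angle = math.pi - laser_angle - camera_angle
    distance_from_laser = baseline * math.sin(camera_angle) / math.sin(spot_angle)
    x = distance_from_laser * math.cos(laser_angle)
    y = distance_from_laser * math.sin(laser_angle)
    return (x, y)

# With a 1 m baseline and both angles at 60 degrees, the spot lies
# directly above the midpoint of the baseline.
print(triangulate(1.0, math.radians(60), math.radians(60)))
```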

But what if you don't have the actual object you want to model, or if, like the Taj Mahal, it won't fit on the scanner platform? Here, photogrammetry, a long-established technique used in map-making, architecture, medicine and forensics, is especially useful. This involves photographing terrain or objects with calibrated cameras, and then measuring features directly from the photographs. Computer vision researchers have long been working on automating the process, to extract 3D structure from video images, and recent work at the University of California at Berkeley has resulted in a system which can semi-automatically recreate 3D architectural scenes from photographs. However, there is still no known general method for automatically deriving an accurate geometrical model of a scene from arbitrary photographs of it. Many researchers see this as the Holy Grail.

There are times when geometric modelling isn't appropriate. How do you write down a set of numbers which describe a flower, for example? Or a cloud, fireworks, a forest, or a snowstorm? For "fuzzy" objects like these, methods which can algorithmically generate geometry are necessary, such as particle systems, fractals, iterated function systems, and the marvellously-named technique of "blobby modelling".
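
As a flavour of the algorithmic approach, here is a bare-bones particle system of the sort used for fireworks: each particle is born with a random velocity, drifts under gravity, and dies after a short lifetime. Every parameter is illustrative.

```python
import random

GRAVITY = -9.8
DT = 0.05  # simulation time step, in seconds

def burst(n=100):
    """Create n particles at the origin with random velocities."""
    return [
        {
            "pos": [0.0, 0.0, 0.0],
            "vel": [random.uniform(-5, 5), random.uniform(-5, 5),
                    random.uniform(0, 10)],
            "life": random.uniform(1.0, 2.0),  # seconds remaining
        }
        for _ in range(n)
    ]

def step(particles):
    """Advance every particle one time step, discarding dead ones."""
    for p in particles:
        for axis in range(3):
            p["pos"][axis] += p["vel"][axis] * DT
        p["vel"][2] += GRAVITY * DT  # gravity acts on the z-component
        p["life"] -= DT
    return [p for p in particles if p["life"] > 0]

particles = burst()
while particles:
    particles = step(particles)  # a renderer would draw each position here
```

Nobody writes down the geometry of the firework; it emerges from the rules.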

But regardless of how you've created your model, you will probably wish to interact with it in some way: to edit its shape, change its surface properties, and so on. This is the point at which, in most state-of-the-art systems, all the 3D information you've taken such pains to capture gets squashed down to a 2D image on the screen. Armed with a mouse rolling on a flat mat, we must again struggle to bridge the 2D/3D divide.

The ideal situation, of course, is to manipulate the 3D model in true 3D. Not only do we replace the flat screen with a stereoscopic display, so that we feel we are sharing the same space as the model, but we replace the desk-bound mouse with true 3D input devices, which we can hold in our hands and wave about in space. We're now in the realm of Virtual Reality, or VR.

Unfortunately, VR has got itself a bit of a bad name. It has suffered to some extent from the same problems which plagued Artificial Intelligence in the 1970s, when research results simply failed to live up to the much-hyped promises. Just as people became disillusioned with the disappointing behaviour of programs intended to engage them in believable conversation, they remain unconvinced by the poor-quality imagery of VR headsets which, according to the media, promised the "ultimate virtual experience". "Phooey", most people said when they had a try.

It is certainly true that current affordable VR headsets do not offer wonderful image quality. But it can only get better, and the psychologically engaging interaction techniques pioneered by VR research groups worldwide really do promise exciting new ways to work with 3D models. There is also much scope in using large-screen stereoscopic displays, where multiple participants, unencumbered by special head-mounted displays, can work cooperatively in shared virtual environments across the Internet.

Perhaps the grandest challenge of all is to be able to scan an environment and automatically create a faithful representation of it inside a computer, where we can explore and manipulate it virtually, then export it from the machine, and make it real again.

Such technological alchemy is currently beyond our grasp, but I look forward to the day when it arrives, and with it the opportunity to once again attempt to convince my sceptical students.

Toby Howard teaches at the University of Manchester.