The iPhone looks, feels, and acts like no other phone, thanks to its device-sized screen and multitouch responsiveness to one- and two-fingered gestures. Sweep a finger to the right to unlock its touch screen. Another finger sweep scrolls through all of its features, from applications to photos. Spread two fingers apart or draw them together on a list, photo, or newspaper article to zoom in or out. Press on the screen for a few seconds and the application icons start shaking as though they have come unglued, allowing a user to rearrange them. The judicious combination of simple finger gestures and screen animations, representing physical metaphors, makes the iPhone's interface feel like a physical object that one can intuitively manipulate.
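The spread-and-pinch zoom described above can be reduced to a simple ratio. The following is a minimal sketch, not Apple's implementation: it assumes the touch screen reports each finger as an (x, y) point, and the function name `pinch_scale` is hypothetical.

```python
import math


def pinch_scale(start_a, start_b, now_a, now_b):
    """Return the zoom factor implied by two fingers moving apart or together.

    Each argument is an (x, y) point. A result above 1.0 means the fingers
    spread apart (zoom in); below 1.0 means they were drawn together (zoom out).
    """
    start_distance = math.dist(start_a, start_b)
    current_distance = math.dist(now_a, now_b)
    if start_distance == 0:
        # Both fingers started on the same point; treat as no zoom (assumption).
        return 1.0
    return current_distance / start_distance


# Fingers start 5 units apart and end 10 units apart, so content doubles in size.
zoom = pinch_scale((0, 0), (3, 4), (0, 0), (6, 8))
```

Using the ratio of distances rather than their difference keeps the gesture independent of where on the screen the fingers land.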
With the iPhone, Apple successfully brought together decades of research and dreams. However, one could imagine the iPhone, upon its introduction, faring poorly relative to the Nokia N95 smartphone. The iPhone features an inferior camera, a slower processor, a worse keyboard, no Flash player, and not even a way to attach a lanyard. Yet the iPhone has been a huge hit. It boasts simple, integrated functions that introduced millions of consumers to a gestural, multitouch interface. Unlike previous technology deployments, the iPhone does not ask consumers what it could be useful for; instead, it presents a suite of scenarios that a person typically encounters several times a day, and its designers carefully matched those use scenarios to user actions. So, while one must wrestle with different ways to use the myriad functions on the Nokia N95, the iPhone user enlarges a roadmap by simply spreading two fingers apart.
Buoyed by the iPhone's success, companies are competing to show compelling multitouch scenarios that enable users to input, process, and display information in innovative ways, often involving finger and hand gestures. Jeff Han's company, Perceptive Pixel, has created a seven-and-a-half-foot diagonal multitouch monitor, most prominently seen on CNN and known as its "Magic Wall." (Perceptive Pixel has also sold its multitouch monitors to a number of unspecified U.S. government agencies.) Han's wall-size monitors use a variation of the typical arrangement of a camera and projector mounted behind the display surface, with the camera mapping "frustrated" light in the projection surface to detect touch inputs.
In Perceptive Pixel's videos of its wall-size multitouch monitors, Han and others use finger and hand gestures to type on a virtual keyboard, call up a document, and scroll through its content; change the composition of a human face; and manipulate 3D objects by turning them any way a user desires. In one sequence, a user pulls up a distant image from Google Earth and zooms in until it is revealed to be a street-level view of midtown Manhattan, which the user pans and tilts; the user can then transform the Google Earth image into a computerized scale model of buildings and other street-level features.
Microsoft Surface uses five cameras and a projector mounted beneath a table display surface, aided by a Windows Vista PC, to produce up to 52 points of touch on its 30-inch diagonal tabletop screen. As well as recognizing gestural input, Surface recognizes objects with RFID tags placed on them, allowing a user to transfer music files between an MP3 player and a smartphone by placing each device on the computer's tabletop surface. Surface uses the presence of the object and the RFID tag to identify the type of device and its capabilities. The music files appear on the tabletop surface near each device, enabling a user to drag material from one device to the other.
The UnMouse Pad is a multitouch pad, similar to a mouse pad, created by Ken Perlin, a computer scientist at New York University, and several colleagues. One can use any object, such as a small block of wood, like a mouse on the UnMouse Pad, but more interestingly one can use multiple fingers to write and draw on the pad, creating the corresponding content on a computer screen.
An undergraduate student of Perlin's, for example, has used the UnMouse Pad for animation; one of the student's hands moved the "paper" around on a screen while his other hand drew on it. "You rethink how you interact with information," says Perlin. "It's much more human friendly."
The UnMouse Pad is part of Perlin's focus on creating innovative but low-cost technology. "Rather than use a large number of wires and expensive circuitry, we use a sparse set of force-sensing wires on the surface, one wire every quarter of an inch, and that's sufficient," he explains. "Our approach allows us to measure continuous position in the spaces between the wires, using simple and low-cost electronics."
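One way to picture how a sparse set of wires could still yield a continuous position is a force-weighted average of neighboring readings. The sketch below illustrates that general idea only; it is not the UnMouse Pad's actual method. It assumes a single touch along one axis, that force is shared between adjacent wires in proportion to proximity, and the names `estimate_position` and `WIRE_PITCH_INCHES` are hypothetical.

```python
WIRE_PITCH_INCHES = 0.25  # one wire every quarter of an inch, per the article


def estimate_position(readings, pitch=WIRE_PITCH_INCHES):
    """Estimate a touch position in inches from per-wire force readings.

    `readings` lists the force sensed on each wire, in order along one axis.
    The estimate is the force-weighted centroid of the wire positions, so a
    touch between two wires lands between them. Returns None if nothing is
    pressed. Assumes a single touch (illustrative simplification).
    """
    total_force = sum(readings)
    if total_force <= 0:
        return None
    centroid_index = sum(i * force for i, force in enumerate(readings)) / total_force
    return centroid_index * pitch


# Wire 2 feels three times the force of wire 3, so the touch sits a quarter
# of the way from wire 2 toward wire 3.
position = estimate_position([0, 0, 3, 1, 0])
```

The same calculation run on a second, perpendicular set of wires would give the other coordinate.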
Fortunately for Perlin and other researchers, there is now a large arsenal of relatively inexpensive motion-sensing equipment for gestural and multitouch input.
The most ubiquitous touch screens, such as those on ATMs and airline kiosks, measure a finger's position as it presses one layer of transparent indium tin oxide, charged from the horizontal edges of the display, against a second layer that is charged from the vertical edges. Except for light pens that use the onscreen image to find where the tethered stylus is, most touch sensing today requires calibration. One can imagine a calibration-free technology that could be integrated into flat-panel displays. The approach could use the thin-film circuitry in the display as an array of antennas. Modified driver chips could be used to measure the changes to the electric field in patterns at different pixel locations in the display. And software could sort through the changes in the high-frequency electrical environment of the local pixel circuits to locate where and how many fingers are touching the screen.
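The software step in that imagined design, finding where and how many fingers are touching, can be sketched with a standard grouping technique. This is an illustration under stated assumptions, not a description of any shipping display: it assumes the driver chips deliver a 2D grid of normalized field-change values, uses a hypothetical threshold of 0.5, and substitutes a plain 4-connected flood fill for whatever signal processing a real system would need. The name `find_touches` is hypothetical.

```python
def find_touches(grid, threshold=0.5):
    """Return one (row, column) centroid per touch found in `grid`.

    `grid` is a list of rows of field-change readings. Cells at or above
    `threshold` count as touched; adjacent touched cells (up, down, left,
    right) are grouped into a single touch.
    """
    if not grid:
        return []
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    touches = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < threshold:
                continue
            # Flood-fill outward from this cell to collect one touch region.
            seen[r][c] = True
            stack = [(r, c)]
            cells = []
            while stack:
                y, x = stack.pop()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            center_row = sum(y for y, _ in cells) / len(cells)
            center_col = sum(x for _, x in cells) / len(cells)
            touches.append((center_row, center_col))
    return touches


sample = [
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.9, 0.0],
]
touches = find_touches(sample)  # two separate regions, so two fingers
```

The number of fingers is simply `len(touches)`, and each centroid gives an approximate location in pixel-grid coordinates.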
The iPhone's commercial success, approximately 10 million units sold and counting, has been a giant step forward for the mainstreaming of multitouch technology. "The iPhone is important," Perlin notes, "because it tells people that using multiple fingers to interact with computers via hand gestures is a natural, wonderful thing."
Bill Buxton, a senior researcher for Microsoft Research and a pioneer in human-computer interaction and computer graphics, envisions a future of multitouch devices with their own specially designed operating systems and applications. "One solution I see," Buxton says, "is that we will start building new classes of computational devices that are not constrained by the legacy applications that were designed for a very different style of interaction."
Buxton believes future technology will create new relationships between typical consumer devices and multitouch screens. "What is really fascinating to me is when you combine the ability of not just the sense of touch of my fingers, but when different objects, a phone or a camera, makes a relationship with the use of my hands and gestures," he says. "This will lead to a convergence of multitouch surfaces and what is known as tangible computing."
New multitouch surfaces also mean new finger and hand gestures, and along with the development of multitouch operating systems and applications, Apple, Microsoft, and Perceptive Pixel are investigating how to best use multiple fingers and hands for multitouch input. "A lot of our research is coming up with gestures and manipulation metaphors," according to Han, such as how a CAD designer could manipulate multiple parts of an engine with only his or her hands.
As people get their hands and fingers into the act of manipulating everything, they should temper their expectations of how novel these actions will be as new expressive multitouch, gestural, and bimanual interfaces are developed. Researchers can learn from Apple, which chose a set of multifinger tracking techniques for the iPhone and built an interface that turns the act of showing a photo to another person into a visual spectacle. Researchers must develop scenarios of use that people actually want to perform, and those scenarios can be more than simple, ergonomic, and productive; they can also be memorable, socially positive, and robust for both novice and expert use.
"The way it will end up is we will get better, as a species, for having our computer interfaces match the richness that is already built into our brains and body," Perlin says. "You watch people use sign language, gesturing, manipulating tools, or playing guitar in the real world. We have the potential of writing software that captures the richness of human interaction that has evolved over millions of years. When it becomes successful, it won't seem exotic. It will be as astonishing to people that they used to rely on a mouse and keyboard as it was to people when they used to program through punch cards."
Hands and eyes are the special connections that the human brain has to the physical world. In the next few years, many great advances are possible, such as 3D multitouch interfaces. Three dimensions can make a huge difference. For instance, George Miller's famous paper about short-term memory holding seven, plus or minus two, things demonstrated that people remember many more items in a random list of numbers, letters, and words in 3D than in 2D.
Although coordinated multitouch input, such as a two-handed, chordic stenographer keyboard, might be great for words (and words are powerful), a picture is truly worth a thousand words. Multitouch interfaces offer more exciting ways to search for and display images. They might also drive faster use of information. These interfaces are demonstrating natural 3D metaphors, allowing coordinated manipulation to replace what would be multiple actions with a cursor control. In the near future, humans can look forward to merging simple hand gestures with rich feedback in a 3D interface to create display and control surfaces that are simple to use, increase productivity, and produce more socially positive experiences.
©2008 ACM 0001-0782/08/1200 $5.00