Human-computer interfaces (HCIs) controlled by the eyes are not a novel concept. Research and development of systems that enable people incapable of operating keyboard- or mouse-based interfaces to use their eyes to control devices goes back at least to the 1970s. However, mass adoption of such interfaces has thus far been neither necessary nor pursued by system designers with any particular ardor.
"I'm a little bit of a naysayer in that I think it will be very, very hard to design a general-purpose input that's superior to the traditional keyboard," says Michael Holmes, associate director for insight and research at the Center for Media Design at Ball State University. "When it comes to text, it's hard to beat the speed of a keyboard."
However, one class of user interfaces (UIs) in particular has proven problematic for the creation of comfortably sized keyboards: the interfaces on mobile phones, which are becoming increasingly capable computational platforms as well as communications devices. It might stand to reason that eye-based UIs on mobile phones could provide users with more options for controlling their phones' applications. In fact, Dartmouth College researchers led by computer science professor Andrew Campbell recently demonstrated with their EyePhone project that they could modify existing general-purpose HCI algorithms to operate a Nokia N810 smartphone using only the device's front-facing camera and computational resources. The new algorithms' accuracy rates, however, also demonstrated that the science behind eye-based mobile control needs more refinement before it is ready for mass consumption.
Eyes As an Input Device
Perhaps the primary scientific barrier to reaching a consensus approach to eye-controlled mobile interfaces is the idea that designing such an interface runs counter to the purpose of the eye, according to Roel Vertegaal, associate professor of human-computer interaction at Queen's University.
"One of the caveats with eye tracking is the notion you can point at something," Vertegaal says. "We didn't really like that. People tend not to point with their eyes. The eyes are an input device and not an output device for the body."
Vertegaal says this basic incompatibility between the eyes' intended function and the demands of using them as output controllers presents issues including the Midas Touch, postulated by Rob Jacob in a seminal 1991 paper entitled "The Use of Eye Movements in Human-Computer Interaction Techniques: What You Look At is What You Get."
"At first, it is empowering to be able simply to look at what you want and have it happen, rather than having to look at it (as you would anyway) and then point and click it with the mouse or otherwise issue a command," Jacob wrote. "Before long, though, it becomes like the Midas Touch. Everywhere you look, another command is activated; you cannot look anywhere without issuing a command."
Another issue caused by trying to make the eyes perform a task for which they are ill-suited is the lack of a consensus on how best to approach designing an eye-controlled interface. For example, one of the most salient principles of mainstream UI design, Fitts's Law, essentially states that the time to move a hand toward a target is affected by both the distance to the target and the size of the target. However, Fitts's Law has not been universally accepted among eye-tracking researchers. Many contend that the natural accuracy limitations of the eye in pointing to a small object, such as a coordinate on a screen, limit its applicability.
A lack of consensus on the scientific foundation of eye control has led to disagreement on how best to approach discrete eye control of a phone. The Dartmouth researchers, for example, used blinks to control the phone in their experiment. However, Munich-based researcher Heiko Drewes found that designing a phone to follow gaze gestures (learned patterns of eye movement that trigger specific applications) rather than blinks resulted in more accurate responses from the phone.
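To make the Fitts's Law relationship concrete, here is a minimal Python sketch. It uses the Shannon formulation that is common in HCI work, MT = a + b * log2(D / W + 1), where D is the distance to the target and W is the target's width. The function name and the default constants a and b are illustrative assumptions; in practice the constants are fitted from experimental data for a particular pointing device, and none of the values below come from the studies discussed here.

```python
import math


def fitts_movement_time(distance: float, width: float,
                        a: float = 0.1, b: float = 0.15) -> float:
    """Predict movement time (seconds) with the Shannon form of Fitts's Law.

    distance: distance from the starting point to the target center
    width:    target size along the axis of motion (same units as distance)
    a, b:     hypothetical regression constants, normally fitted per device
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    # Index of difficulty in bits: grows with distance, shrinks with width.
    index_of_difficulty = math.log2(distance / width + 1)
    return a + b * index_of_difficulty


# Halving the target width makes the same movement predictably slower.
large_target = fitts_movement_time(distance=70, width=10)
small_target = fitts_movement_time(distance=70, width=5)
```

For a target 10 units wide at a distance of 70 units, the index of difficulty is log2(8) = 3 bits. The objection raised by eye-tracking researchers is that the eye's limited pointing accuracy puts a floor on how small a usable target can be, so the model may not transfer cleanly from hand movement to gaze.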
"I tried the triggering of commands by blinking, but after several hundred blinks my eye got nervousI had the feeling of tremor in my eyelid," Drewes says. "I did no study on blinking, but in my personal opinion I am very skeptical that blinking is an option for frequent input. Blinking might be suitable for occasional input like accepting an incoming phone call."
However, Drewes believes even gaze gestures will not provide sufficient motivation for mass adoption of eye-controlled mobile phones. "The property of remote control and contact-free input does not bring advantage for a device I hold in my hands," he says. "For these reasons I am skeptical regarding the use of gaze gestures for mobile phones.
"In contrast, I see some chances for controlling a TV set by gaze gestures. In this case the display is in a distance that requires remote control. In addition, the display is big enough that the display corners provide helping points for large-scaled gesture, which are separable from natural eye movements."
Vertegaal believes the most profound accomplishment of the Dartmouth EyePhone work may be in the researchers' demonstration of a mobile phone's self-contained image and processing power in multiple realistic environments, instead of conducting experiments on a phone tethered to a desktop in a static lab setting. Dartmouth's Campbell concurs to a large degree.
"We did something extremely simple," Campbell says. "We just connected an existing body of work to an extremely popular device, and kind of answered the question of what do we have to do to take these algorithms and make them work in a mobile environment. We also connected the work to an application. Therefore, it was quite a simple demonstration of the idea."
Specifically, Campbell's group used eye-tracking and eye-detection algorithms originally developed for desktop machines and USB cameras. In detecting the eye, the original algorithm produced a number of false positives for eye contours, caused by the slight movement of the phone in the user's hand. Interestingly, the false positives, all of which were based on coordinates significantly smaller than true eye contours, seemed to closely follow the contours of the user's face. To overcome them, the Dartmouth researchers created a filtering algorithm that identified the likely size of a legitimate eye contour. The new eye-detection algorithm achieved accuracy rates ranging from 60% when a user was walking in daylight to 99% when the phone was held steady in daylight. The blink-detection algorithm's accuracy ranged from 67% to 84% in daylight.
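The size-based filtering step described above can be sketched roughly as follows. This is not the Dartmouth group's code: the Contour type, the function name, and the pixel thresholds are all hypothetical, chosen only to illustrate the idea of rejecting candidate contours whose dimensions fall outside the range expected for a real eye.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Contour:
    """Bounding box of a candidate contour, in pixels (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def filter_eye_candidates(
    contours: List[Contour],
    min_width: int = 20,   # assumed thresholds; a real system would tune
    max_width: int = 80,   # these for camera resolution and the typical
    min_height: int = 10,  # distance between the phone and the face
    max_height: int = 40,
) -> List[Contour]:
    """Keep only contours whose bounding box is plausibly eye-sized."""
    return [
        c for c in contours
        if min_width <= c.width <= max_width
        and min_height <= c.height <= max_height
    ]


candidates = [
    Contour(x=5, y=8, width=4, height=3),      # small false positive
    Contour(x=60, y=40, width=42, height=20),  # plausible eye contour
    Contour(x=90, y=12, width=6, height=5),    # small false positive
]
eyes = filter_eye_candidates(candidates)
```

Because the false positives reported in the article were much smaller than true eye contours, the lower bounds do most of the work here; the upper bounds are an extra guard added for this sketch.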
Campbell believes the steady progress in increasing camera resolution and processing capabilities on mobile phones will lead to more accuracy over time. "Things like changes in lighting and movement really destroy some of these existing algorithms," he says. "Solving some of these context problems will allow these ideas to mature, and somebody's going to come along with a really smart idea for it."
However, veterans of eye-tracking research do not foresee a wave of eyes-only mobile device control anytime soon, even with improved algorithms. Instead, the eye-tracking capabilities on mobile devices might become part of a more context-aware network infrastructure. A phone with eye-gaze context awareness might be able to discern things such as the presence of multiple pairs of eyes watching its screen and provide a way to notify the legitimate user that others are reading over his or her shoulder. An e-commerce application might link a user's gaze toward an LED-enabled store window display to a URL on the phone that offers more information about a product or a coupon for it. Campbell says one possible use for such a phone might be in a car, such as a dash-mounted phone that could detect the closing of a drowsy driver's eyes.
Ball State's Holmes says such multimodal concepts are far more realistic than an either/or eye-based input future. "Think about how long people have talked about voice control of computers," he says. "While the technology has gotten better, context is key. In an open office, you don't want to hear everybody talk to their computer. Voice command is useful for things like advancing slide show, but for the most part voice control is a special tool. And while I can see similar situations for eye gaze control, the notion that any one of these alternative input devices will sweep away the rest isn't going to happen. On the other hand, what is exciting is we are moving into a broader range of alternatives, and the quality of those alternatives is improving, so we have more choices."
Further Reading
Drewes, H.
Eye Gaze Tracking for Human Computer Interaction. Dissertation, Ludwig-Maximilians-Universität, Munich, Germany, 2010.
Jacob, R. J. K.
The use of eye movements in human-computer interaction techniques: what you look at is what you get. ACM Transactions on Information Systems 9, 2, April 1991.
Majaranta, P. and Räihä, K.-J.
Twenty years of eye typing: systems and design issues. Proceedings of the 2002 Symposium on Eye Tracking Research & Applications, New Orleans, LA, March 25-27, 2002.
Miluzzo, E., Wang, T. and Campbell, A. T.
EyePhone: activating mobile phones with your eyes. Proceedings of the Second ACM SIGCOMM Workshop on Networking, Systems, and Applications on Mobile Handhelds, New Delhi, India, August 30, 2010.
Smith, J. D., Vertegaal, R., and Sohn, C.
ViewPointer: lightweight calibration-free eye tracking for ubiquitous handsfree deixis. Proceedings of the 18th Annual ACM Symposium on User Interface Software and Technology, Seattle, WA, Oct. 23-26, 2005.
©2010 ACM 0001-0782/10/1200 $10.00