Computer scientists have invented a technique that automatically creates 3-D models of landmarks and geographical locations, using ordinary two-dimensional pictures available through Internet photo sharing sites like Flickr.
The technique creates the models using millions of images, processing them on a single personal computer in less than a day.
It was devised by a team of researchers from the University of North Carolina at Chapel Hill and the Swiss university ETH-Zurich, led by Jan-Michael Frahm, research assistant professor of computer science in the UNC College of Arts and Sciences.
To demonstrate their technique, the researchers used the 3 million images of Rome available online to reconstruct all of the city's major landmarks. It took less than 24 hours on a single PC using commodity graphics hardware. They also reconstructed the landmarks of Berlin in the same manner. Their work can be viewed at Building Rome on a Cloudless Day, the project's website.
Frahm says the process provides a far richer experience and is an improvement of more than a factor of 1,000 over current commercial systems, such as Microsoft PhotoSynth, and alternative techniques developed by other researchers.
"Our technique would be the equivalent of processing a stack of photos as high as the 828-meter Dubai Towers, using a single PC, versus the next best technique, which is the equivalent of processing a stack of photos 42 meters tall—as high as the ceiling of Notre Dame—using 62 PCs," he says. "This efficiency is essential if one is to fully utilize the billions of user-provided images continuously being uploaded to the Internet."
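The stack analogy can be turned into a rough per-PC comparison that shows where a figure of "more than a factor of 1,000" could come from. The short Python sketch below is only a back-of-the-envelope illustration, not the researchers' own calculation. It assumes that stack height is proportional to the number of photos and that both systems take a comparable amount of time, neither of which is stated in the quote. The function name `per_pc_improvement` is hypothetical.

```python
def per_pc_improvement(new_stack_m, new_pcs, old_stack_m, old_pcs):
    """Ratio of photo-stack height handled per PC (new technique vs. old).

    Assumes stack height is proportional to photo count and that both
    runs take a comparable amount of time (an assumption, not a claim
    made in the article).
    """
    new_per_pc = new_stack_m / new_pcs   # meters of photos per PC, new technique
    old_per_pc = old_stack_m / old_pcs   # meters of photos per PC, next best technique
    return new_per_pc / old_per_pc


# Figures taken from the quote: 828 m on 1 PC versus 42 m on 62 PCs.
factor = per_pc_improvement(828, 1, 42, 62)
print(f"Per-PC improvement: roughly {factor:.0f}x")
```

Under those assumptions the ratio works out to about 1,200, which is in line with the "more than a factor of 1,000" that Frahm cites.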
View a video on the construction of 3-D models from photos of landmarks including the Trevi Fountain and the Brandenburg Gate.
One advantage of the 3-D models over a video of a landmark is that the Internet photo collections used to construct them show the scene at different times and under different lighting and weather conditions, potentially creating a richer experience for viewers, Frahm says. If video is available, however, the technology can use it as well, and doing so shortens the processing time needed to reconstruct the models.
Frahm says eventually the models could be embedded, for example, into common consumer applications such as Google Earth or Bing Maps, allowing users to explore cities from the comfort of their homes. Other applications could prove useful to travelers.
"You might be able to take a picture with your cell phone of a monument that would not only give you information about that monument, identifying it from the image, but could also tell you your location more precisely than even GPS," Frahm says.
He also notes that the technology could be a building block for disaster response software. For example, an aircraft could be sent to take video of the aftermath of a hurricane, and the resulting 3-D model could be used to assess damage from a remote location, saving time and money.
Frahm collaborated on the project with Marc Pollefeys, professor of computer science at ETH-Zurich and an adjunct professor at UNC, and Svetlana Lazebnik, assistant professor of computer science at UNC. They recently presented a paper on their research titled "Building Rome on a Cloudless Day" at the 11th European Conference on Computer Vision.
Video of the 3-D models and the processing technique: http://www.youtube.com/watch?v=4cEQZreQ2zQ
Video highlight: Rome landmarks (the Coliseum, Trevi Fountain and the Pieta): http://www.youtube.com/watch?v=4cEQZreQ2zQ#t=0m56s
Video highlight: Berlin landmarks (Brandenburg Gate, Berlin Cathedral, Ishtar Gate): http://www.youtube.com/watch?v=4cEQZreQ2zQ#t=2m36s
A screenshot of a 3-D model of the exterior of the Coliseum, Rome, Italy. Credit: Jan-Michael Frahm, UNC-Chapel Hill.