Rebuilding the World’s Landmarks in Six Days


Red dots on the map represent locations of landmarks depicted
in photos from Yahoo's 100-million-image dataset

The Department of Computer Science and URC Ventures have built a 3D reconstruction of the world’s landmarks using computer vision and 3D modeling techniques. Using Yahoo’s publicly available collection of 100 million crowd-sourced photos and a single PC, the 3D Computer Vision research group and URCV created a new software process able to build 12,903 3D models of some of the world’s greatest landmarks in just six days. Unlike maps and aerial images, these models can be used directly for VR applications such as virtual tourism. A demonstration was presented during the 2015 CVPR Computer Vision Conference in Boston, Massachusetts.

Graduate students Jared Heinly and Johannes L. Schönberger and professors Enrique Dunn and Jan-Michael Frahm of the 3D Computer Vision Group created software that improves upon earlier projects, including Building Rome on a Cloudless Day.

The group’s previous projects have built 3D models of landmarks from entire cities based on datasets of up to a few million images, but reconstructing the 3D models of the landmarks of the entire world requires the ability to process orders of magnitude more data.

The focus of the new framework is to enable processing of datasets of arbitrary size. The software streams the images one at a time, assigning each to a cluster of related images. Because each image is analyzed only once, the streaming process scales to much larger collections. The key to the new algorithm is deciding efficiently which images to remember and which to discard because their information is already represented.
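The single-pass idea described above can be illustrated with a small sketch. This is not the group's software: the article does not say how images are compared, so the squared descriptor distance, the `match_threshold` value, and the `max_kept_per_cluster` limit below are all hypothetical stand-ins chosen only to show the "remember or discard" decision.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    representative: tuple                 # descriptor of the first image in the cluster
    size: int = 1                         # how many images were assigned here
    kept_ids: list = field(default_factory=list)  # images remembered for later modeling


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def stream_cluster(images, match_threshold=0.25, max_kept_per_cluster=3):
    """Assign each (image_id, descriptor) pair to a cluster in one pass.

    The matching rule is an assumption for illustration: an image joins the
    nearest cluster whose representative lies within match_threshold.
    """
    clusters = []
    for image_id, descriptor in images:   # every image is visited exactly once
        best, best_dist = None, match_threshold
        for cluster in clusters:
            dist = squared_distance(descriptor, cluster.representative)
            if dist <= best_dist:
                best, best_dist = cluster, dist
        if best is None:
            # Nothing similar has been seen yet: start a new cluster.
            clusters.append(Cluster(tuple(descriptor), kept_ids=[image_id]))
        else:
            best.size += 1
            if len(best.kept_ids) < max_kept_per_cluster:
                best.kept_ids.append(image_id)   # remember this image
            # Otherwise the image is discarded: the cluster already covers it.
    return clusters


photos = [
    ("a", (0.0, 0.0)), ("b", (0.1, 0.0)), ("c", (5.0, 5.0)),
    ("d", (0.0, 0.1)), ("e", (0.05, 0.05)), ("f", (5.1, 5.0)),
]
for cluster in stream_cluster(photos):
    print(cluster.size, cluster.kept_ids)
```

In this toy run, photo "e" is counted toward the first cluster but not stored, which mirrors the idea of dropping images whose information is already represented. Memory use is bounded by the number of clusters times the per-cluster limit rather than by the size of the dataset.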

To test this method, the researchers applied the framework to Yahoo’s publicly available collection of 100 million crowd-sourced photos, which contains images geographically distributed throughout the entire world. The program took 4.4 days to stream and cluster the entire 14-terabyte dataset on a single computer before building the 3D models of the landmark sites found in it.
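To put those figures in perspective, the short calculation below derives the average processing rate they imply. It uses only the numbers quoted above (100 million images, 14 terabytes, 4.4 days); the one added assumption is that a terabyte means 10^12 bytes, and the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def streaming_stats(total_images, dataset_bytes, streaming_days):
    """Return average images per second, milliseconds per image, and bytes per image."""
    seconds = streaming_days * SECONDS_PER_DAY
    images_per_second = total_images / seconds
    ms_per_image = 1000.0 / images_per_second
    bytes_per_image = dataset_bytes / total_images
    return images_per_second, ms_per_image, bytes_per_image


# Assumption: 1 terabyte = 10**12 bytes (decimal definition).
rate, ms_each, avg_bytes = streaming_stats(100_000_000, 14 * 10**12, 4.4)
print(f"{rate:.0f} images per second")
print(f"{ms_each:.1f} ms per image")
print(f"{avg_bytes / 1000:.0f} KB per image on average")
```

Under these assumptions the single computer handled roughly 260 images every second, or a little under 4 milliseconds per image, with an average file size of about 140 KB.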

Stacked on top of each other, Frahm says, these photos would reach the middle of Earth's stratosphere, twice as high as an airplane's cruising altitude.

Reichstag building in Berlin: red dots indicate locations from which
photographs were taken. The algorithm recognizes the landmark in the
photos, categorizes them appropriately, and constructs a sparse
3D model (shown by the black dots)

After the data association process is complete, sparse 3D models are built using the images. This process takes less than a day, bringing the total process to slightly more than five days for 100 million images.

The 3D Computer Vision Group partnered with URCV to reconstruct the data in ultra-high resolution from Creative Commons-licensed images. URCV used the algorithm’s output to construct the 3D models via its world-scale stereo modeling technology. Model results are based on URCV’s novel accuracy-driven view selection for precision scene reconstruction. To further improve the realism of the 3D scene models, URCV applies a robust consensus-based depth map fusion along with an appearance correction. The world-scale stereo relies on a scalable, efficient, multi-threaded implementation for faster modeling.
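The article does not describe how URCV's depth map fusion works, so the sketch below shows only the general idea behind consensus-based fusion, not URCV's method. It assumes the depth maps are already aligned to one reference view, uses the per-pixel median as the consensus value, and keeps a pixel only when enough maps agree with it. The `tolerance` and `min_agreeing` parameters are hypothetical.

```python
from statistics import median


def fuse_depths(depth_maps, tolerance=0.05, min_agreeing=2):
    """Fuse aligned depth maps (2D lists, None = no estimate) pixel by pixel.

    A pixel is kept only if at least min_agreeing estimates lie within a
    relative tolerance of the median; the result is the mean of those estimates.
    """
    rows, cols = len(depth_maps[0]), len(depth_maps[0][0])
    fused = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            samples = [m[r][c] for m in depth_maps if m[r][c] is not None]
            if not samples:
                continue                      # no view saw this pixel
            consensus = median(samples)
            agreeing = [d for d in samples
                        if abs(d - consensus) <= tolerance * consensus]
            if len(agreeing) >= min_agreeing:
                fused[r][c] = sum(agreeing) / len(agreeing)
            # Otherwise the pixel stays None: too little agreement to trust it.
    return fused


views = [
    [[2.00, 5.0, None]],
    [[2.02, None, None]],
    [[9.00, 5.1, 3.0]],   # 9.00 is an outlier; 3.0 has no second opinion
]
print(fuse_depths(views))
```

In this toy input the outlier depth of 9.00 is ignored because the other two views agree, and the last pixel is left empty because only one view estimated it.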


Pantheon in Rome: this 3D reconstruction was made by
URCV using data from the 3D Computer Vision group

David Boardman, CEO of URCV, described industry applications: “For example, imagine imagery streaming in from UAV[s], planes, cell phones, truck-mounted cameras, and hard hat cameras enabling the reconstruction of a construction [or] mining [site] at any point in time. Or imagine imagery from the millions of self-driving cars in the future being leveraged to create up-to-the-second street maps. Think of the lives that would be saved if First Responders could see an up-to-the-minute model of an emergency scene before arriving.”

Heinly said that the 3D Computer Vision Group is planning an open-source release of the streaming software to the research community later this year.