
    Vision Methods for Open-Universe Datasets

    Principal Investigator: Svetlana Lazebnik
    Funding Agency: US Department of the Interior  
    Agency Number: N11AP20010

    Abstract:

    The world is undergoing a digital imaging revolution, but our technological ability to acquire and store massive visual datasets is currently advancing much faster than our ability to make sense of them. The image collections that arise in today’s real-world application domains are huge, constantly evolving, unorganized, and loosely annotated. By contrast, existing recognition approaches remain trapped in a “closed universe” of small, static, cleanly annotated datasets containing just a few object classes. The proposed research will overcome these limitations and bring recognition into the era of “open-universe” photo collections by developing revolutionary techniques for three key areas of functionality: image search, interpretation, and prediction.

    (1) The PI will develop new high-efficiency techniques for similarity-based search and organization of datasets consisting of millions or billions of images. These techniques will work by compressing high-dimensional image descriptors to compact binary codes in a locality sensitive fashion, i.e., images that are similar either in terms of appearance or semantics should map to binary codes that have a low Hamming distance. The resulting binary code representation will reduce storage requirements and retrieval time by orders of magnitude over existing approaches. The PI has already developed a binary coding technique competitive with the state of the art and demonstrated its promise for the application of reconstructing landmarks from Internet photo collections consisting of almost three million images. Proposed work includes the development of new methods to greatly improve precision and recall at ever smaller code sizes, and to incorporate noisy and incomplete semantic supervisory information into the code learning process.
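    To make the locality-sensitive binary coding idea concrete, the sketch below uses classic random-hyperplane hashing (one simple instance of the family of techniques described above, not the PI's specific method): each bit records which side of a random hyperplane a descriptor falls on, so nearby descriptors tend to agree on most bits and thus have a low Hamming distance. The function names and dimensions are illustrative assumptions.

    ```python
    import numpy as np

    def learn_random_hyperplanes(dim, n_bits, seed=0):
        # One random hyperplane (normal vector) per output bit.
        rng = np.random.default_rng(seed)
        return rng.standard_normal((n_bits, dim))

    def binary_code(x, planes):
        # Keep only the sign of each projection: a compact binary code.
        return (planes @ x > 0).astype(np.uint8)

    def hamming(a, b):
        # Number of differing bits between two binary codes.
        return int(np.count_nonzero(a != b))
    ```

    Because each bit is a one-bit quantization of a random projection, the expected Hamming distance between two codes grows with the angle between the original descriptors, which is what makes brute-force search over compressed codes a faithful stand-in for search over the full descriptors.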

    (2) The PI will design novel methods for image interpretation in “open-universe” datasets where users may continuously add new target object classes, exemplars, and annotations. The proposed image parsing framework will be able to efficiently segment all the objects in an image and identify their categories by taking advantage of fast similarity-based search techniques developed in (1). Apart from its scalability, a key innovation of the proposed framework is that it will perform comprehensive scene understanding by labeling each image region with multiple label types, including class identity, geometric surface orientation, material, motion type, etc.
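    A minimal sketch of the nonparametric flavor of such a framework, assuming regions are represented by feature vectors and annotations are transferred from retrieved neighbors (the feature dimensions, label types, and function names are illustrative, not the proposal's actual design): each label type is voted on independently by the query region's nearest training regions.

    ```python
    import numpy as np
    from collections import Counter

    def transfer_labels(query, train_feats, train_labels, k=3):
        """Label a query region by k-nearest-neighbor label transfer.

        train_labels holds one dict per training region, with several
        label types (e.g. {'class': ..., 'surface': ...}); each label
        type receives an independent majority vote from the k neighbors.
        """
        dists = np.linalg.norm(train_feats - query, axis=1)
        nearest = np.argsort(dists)[:k]
        result = {}
        for label_type in train_labels[0]:
            votes = Counter(train_labels[i][label_type] for i in nearest)
            result[label_type] = votes.most_common(1)[0][0]
        return result
    ```

    Because the training set is only consulted at query time, new classes and annotations added by users are usable immediately, with no retraining step; the fast binary-code search from (1) is what keeps the neighbor lookup tractable at scale.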

    (3) The PI will develop scalable, dynamic methods for learning predictors for open-universe collections based on the framework of prediction with expert advice, which has not been previously applied to vision problems. The methods will be based on online maintenance of multiple diverse predictors, or experts, that specialize on different feature types or parts of the data space. Unlike standard prediction techniques, which require complete supervision, the proposed techniques will work in the much more realistic scenario where supervisory information (or feedback) is not given for every observation. For applications like target or anomaly detection, feedback may even be asymmetric, i.e., an analyst would manually examine only the images tagged by the system as positives. Active and multitask learning scenarios will also be considered. The proposed research directions have many potential applications relevant to the Department of Defense, including registration and querying of unorganized photos acquired by observers on the ground, change detection for incremental dataset updating, target and anomaly detection, automatic analysis of surveillance footage, and autonomous vehicle and robot navigation.
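    The prediction-with-expert-advice framework can be illustrated with the classic weighted majority algorithm of Littlestone and Warmuth, shown here as a minimal sketch (the expert functions and parameter values are illustrative assumptions, and skipping updates when no label arrives is one simple way to model the partial-feedback setting described above, not the proposal's specific mechanism):

    ```python
    class WeightedMajority:
        """Online prediction with expert advice (weighted majority).

        Experts are callables mapping an observation to a {0, 1} vote.
        Wrong experts have their weights multiplied by beta < 1; when
        feedback is unavailable (label is None), weights are unchanged.
        """
        def __init__(self, experts, beta=0.5):
            self.experts = experts
            self.beta = beta
            self.weights = [1.0] * len(experts)

        def predict(self, x):
            self._votes = [e(x) for e in self.experts]
            score1 = sum(w for w, v in zip(self.weights, self._votes) if v == 1)
            score0 = sum(w for w, v in zip(self.weights, self._votes) if v == 0)
            return 1 if score1 >= score0 else 0

        def update(self, label):
            if label is None:           # no feedback for this observation
                return
            for i, v in enumerate(self._votes):
                if v != label:
                    self.weights[i] *= self.beta
    ```

    The appeal of this family of algorithms for open-universe data is the regret guarantee: the combined predictor's mistake count is provably close to that of the best single expert in hindsight, even though which expert is best may change as the collection evolves.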
