From Large Scale Image Categorization to Entry-Level Categories

Vicente Ordonez, Jia Deng, Yejin Choi, Alexander C. Berg, Tamara L. Berg

University of North Carolina at Chapel Hill, Stanford University, Stony Brook University

Abstract

Entry-level categories - the labels people will use to name an object - were originally defined and studied by psychologists in the 1980s. In this paper we study entry-level categories at a large scale and learn the first models for predicting entry-level categories for images. Our models combine visual recognition predictions with proxies for word "naturalness" mined from the enormous amounts of text on the web. We demonstrate the usefulness of our models for predicting the nouns (entry-level words) that people associate with images. We also learn mappings between concepts predicted by existing visual recognition systems and entry-level concepts that could be useful for improving human-focused applications such as natural language image description or retrieval.

Paper

Vicente Ordonez, Jia Deng, Yejin Choi, Alexander C. Berg, Tamara L. Berg. From Large Scale Image Categorization to Entry-Level Categories. IEEE International Conference on Computer Vision (ICCV), 2013. [Paper (PDF)] [BibTeX] [Slides]

Resources

Ground-truth entry-level categories for 447 ImageNet categories: [click here]
Text-based entry-level translations for 7404 ImageNet categories: [click here]
Results for predicting entry-level categories for 1000 images: [Dataset A][Dataset B]