We have introduced a system for cataloging the visual information on the Web. The system automatically collects images and videos and catalogs them using both textual and visual information. We also developed an easy-to-use Web application that supports both browsing and searching for images and videos. In its initial implementation, the system has cataloged, and provides search over, more than half a million images and videos.
In future work, we will use additional visual features, such as texture, shape, and spatial layout, to further enhance the content-based components of the system. In particular, we are porting the VisualSEEk [11] system for joint feature/spatial querying to this application. We are also incorporating automated techniques for detecting faces [12] and text in images and videos.
We will also investigate new techniques for exploiting text and visual features, both independently and jointly, to improve the cataloging of images and videos and their automatic mapping into subject and type classes. For example, better use of the text in the parent Web pages may provide more information about the images and videos [9]. In addition, several recent approaches for learning from visual features are promising for detecting homogeneities within subject classes and improving the automated classification system. Finally, we will further expand and define the image and video subject taxonomy.
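As a rough illustration of how text associated with an image or video could be mapped into subject and type classes, consider the following minimal sketch. It is not the system's implementation: the key-term table, the extension table, and the function names are hypothetical, and a simple majority vote over matched terms stands in for whatever mapping procedure is actually used.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical key-term table (term -> subject class); illustrative only.
KEY_TERMS = {
    "lion": "animals/cats",
    "tiger": "animals/cats",
    "sunset": "nature/sunsets",
    "beach": "travel/beaches",
}

# Hypothetical mapping from file extension to type class.
TYPE_BY_EXT = {
    "gif": "image", "jpg": "image", "jpeg": "image",
    "mpg": "video", "mov": "video", "avi": "video",
}


def extract_terms(text):
    """Split text into lowercase alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def classify(url, parent_text=""):
    """Return (type_class, subject_class) for a media URL.

    Terms come from the URL path and from any text taken from the
    parent Web page (for example, the hyperlink or alt text).
    The subject is the class with the most matching key terms,
    or None when no key term matches.
    """
    path = urlparse(url).path
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    type_class = TYPE_BY_EXT.get(ext, "unknown")

    terms = extract_terms(path) + extract_terms(parent_text)
    votes = Counter(KEY_TERMS[t] for t in terms if t in KEY_TERMS)
    subject_class = votes.most_common(1)[0][0] if votes else None
    return type_class, subject_class
```

In this sketch, calling `classify("http://example.com/animals/lion_01.jpg", "A lion resting")` draws terms from both the URL path and the parent-page text, so additional page text simply contributes more votes toward a subject class.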