next up previous
Next: Subject Classification and Indexing Up: Searching for Images and Previous: Introduction

Image and Video Collection Process

 

The image and video collection process is conducted by several autonomous Web agents, or spiders. The agents traverse the Web by following the hyperlinks between documents. They detect images and videos, download and process them, and add the new information to the catalog. The overall collection process, illustrated in Figure 1, is divided among three distinct spiders: (1) Spider 1 assembles lists of candidate Web pages that may include images, videos or hyperlinks to them, (2) Spider 2 extracts the URLs of the images and videos, and (3) Spider 3 retrieves and analyzes the images and videos.

 



Figure 1:   Image and video gathering process via three spiders.

Image and Video Detection

The first phase of the process consists of two spiders that traverse the Web looking for images and videos, as illustrated in Figure 2. Starting from seed URLs, Spider 1 follows a breadth-first search across the Web. It downloads pages via the Hypertext Transfer Protocol (HTTP) and passes the Hypertext Markup Language (HTML) code to Spider 2. In turn, Spider 2 detects new URLs, encoded as HTML hyperlinks, and adds them back to the queue of Web pages to be downloaded by Spider 1. In this sense, Spider 1 is similar to many of the conventional spiders or robots that follow hyperlinks in some fashion across the Web [7].
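The queue-driven traversal described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `crawl_breadth_first`, the `max_pages` limit, and the two callables `fetch_page` (standing in for Spider 1's HTTP download) and `extract_links` (standing in for Spider 2's hyperlink detection) are all hypothetical names introduced for this example.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl_breadth_first(
    seeds: Iterable[str],
    fetch_page: Callable[[str], str],
    extract_links: Callable[[str, str], Iterable[str]],
    max_pages: int = 100,
) -> List[str]:
    """Visit pages breadth-first and return their URLs in visiting order.

    fetch_page(url) plays the role of Spider 1's download step and
    extract_links(url, html) plays the role of Spider 2; both are
    assumptions supplied by the caller, not part of the original system.
    """
    queue = deque()
    seen = set()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()              # FIFO order yields breadth-first search
        html = fetch_page(url)             # Spider 1: download the page
        visited.append(url)
        for link in extract_links(url, html):  # Spider 2: detect new URLs
            if link not in seen:           # never queue the same URL twice
                seen.add(link)
                queue.append(link)
    return visited
```

Passing the download and link-extraction steps in as functions keeps the sketch free of network access, so the traversal order can be checked against a small in-memory link graph.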

 



Figure 2:   Spider 1 and Spider 2 traverse the Web and assemble lists of URLs of images and videos.

Spider 2 detects all hyperlinks in the Web documents and converts the relative URLs to absolute addresses. By examining the types of the hyperlinks and the filename extensions of the URLs, Spider 2 assigns each URL to one of several categories: image, video or HTML. The mapping between filename extensions and Web object type is given by the Multipurpose Internet Mail Extensions (MIME) content type labels, as illustrated in Table 1.

 

Extension                                  Type
.gif                                       CompuServe image format
.jpg, .jpeg, .jpe, .jfif, .pjpeg, .pjp     JPEG image format
.qt, .mov, .moov                           QuickTime video format
.mpeg, .mpg, .mpr, .mpv, .vbs, .mpegv      MPEG video format
.avi                                       Microsoft video format
.htm, .html                                Hypertext Markup Language
Table 1:   MIME mapping between extensions and object types.
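As a rough sketch of the Spider 2 step, the code below collects hyperlinks from an HTML page, converts relative URLs to absolute ones, and assigns each URL a category from its filename extension using the entries of Table 1. The names `LinkCollector`, `classify_url` and `categorize_links` are hypothetical, only `<a href>` and `<img src>` links are examined, and the fallback category `"other"` is an assumption added for URLs whose extension is not in the table.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Extension-to-category mapping taken from Table 1.
EXTENSION_TYPES = {}
for _ext in (".gif", ".jpg", ".jpeg", ".jpe", ".jfif", ".pjpeg", ".pjp"):
    EXTENSION_TYPES[_ext] = "image"
for _ext in (".qt", ".mov", ".moov", ".mpeg", ".mpg", ".mpr", ".mpv",
             ".vbs", ".mpegv", ".avi"):
    EXTENSION_TYPES[_ext] = "video"
for _ext in (".htm", ".html"):
    EXTENSION_TYPES[_ext] = "html"


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href=...> and <img src=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser delivers tag and attribute names in lower case.
        for name, value in attrs:
            is_link = (tag == "a" and name == "href") or \
                      (tag == "img" and name == "src")
            if is_link and value:
                # Resolve relative URLs against the page's own address.
                self.urls.append(urljoin(self.base_url, value))


def classify_url(url):
    """Return 'image', 'video', 'html' or 'other' from the URL's extension."""
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return EXTENSION_TYPES.get(extension, "other")


def categorize_links(base_url, html):
    """Return (absolute_url, category) pairs for every link in the page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [(url, classify_url(url)) for url in collector.urls]
```

In this sketch, URLs categorized as `"html"` would be returned to Spider 1's queue, while those categorized as `"image"` or `"video"` would be handed to Spider 3.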

In the second phase, the list of image and video URLs from Spider 2 is input into Spider 3. Spider 3 retrieves the images and videos, processes them and adds them to the catalog. Three important functions of Spider 3 are to

  1. extract visual features that allow for content-based techniques in searching, browsing and grouping,
  2. extract other attributes such as width, height, number of frames, type of visual data, and so forth,
  3. generate an icon, or motion icon, that sufficiently compacts and represents the visual information.
The tasks of Spider 3 are illustrated in Figure 3. The process of extracting visual features from the images and videos generates color histograms, which are discussed in Section 5. The other attributes of the images and videos populate the database tables, which are defined in Section 3.4. Finally, Spider 3 generates coarse and highly compressed versions of the images and videos to provide pictorial data in the query output.
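The three Spider 3 tasks can be illustrated for a single decoded image as follows. This is only a sketch under stated assumptions: the image is represented as a list of rows of (R, G, B) tuples, the histogram uses a uniform quantization of RGB into `levels**3` bins (an assumption made here, not necessarily the color representation of Section 5), and the icon is produced by plain subsampling with the compression step omitted. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def color_histogram(image: Image, levels: int = 4) -> List[float]:
    """Normalized histogram over levels**3 uniformly quantized RGB bins."""
    hist = [0] * (levels ** 3)
    step = 256 // levels
    count = 0
    for row in image:
        for r, g, b in row:
            # Clamp so that values near 255 stay inside the last bin.
            qr = min(r // step, levels - 1)
            qg = min(g // step, levels - 1)
            qb = min(b // step, levels - 1)
            hist[(qr * levels + qg) * levels + qb] += 1
            count += 1
    if count == 0:
        return [0.0] * len(hist)
    return [h / count for h in hist]


def subsample(image: Image, factor: int) -> Image:
    """Keep every factor-th pixel in both directions (coarse icon)."""
    return [row[::factor] for row in image[::factor]]


def process_image(image: Image, icon_factor: int = 2) -> Dict[str, object]:
    """Gather features, attributes and an icon for one image."""
    height = len(image)
    width = len(image[0]) if image else 0
    return {
        "histogram": color_histogram(image),     # visual feature
        "width": width,                          # other attributes
        "height": height,
        "icon": subsample(image, icon_factor),   # coarse version
    }
```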

 



Figure 3:   Spider 3 processes each image/video.

Image and Video Presentation

For images, the coarse versions are obtained by subsampling and compressing the originals, where the compression format, either JPEG or GIF, is chosen to match the original image format. For video, the coarse versions are generated by subsampling the original video both spatially and temporally. The temporal subsampling is achieved in a two-step process. First, one frame is kept for every second of video. Next, scene change detection is performed on the retained frames to detect the key frames of the sequence [8]. This allows for the elimination of duplicate scenes in the coarse version. Finally, the video is re-animated from the key frames and packaged as an animated GIF file. Upon retrieval from a query, the coarse videos appear to the user as animated samples of the original video.
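The two-step temporal subsampling can be sketched as below. The scene change detector of [8] is not described in this section, so the sketch substitutes a simple stand-in: a scene change is declared when the L1 distance between the normalized gray-level histograms of consecutive retained frames exceeds a threshold. The frame representation (a flat list of gray values from 0 to 255), the function names, the bin count and the threshold value are all assumptions made for illustration.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flat list of gray values in 0..255 (assumed format)


def sample_one_frame_per_second(frames: List[Frame], fps: float) -> List[Frame]:
    """Step 1: keep one frame for every second of video."""
    step = max(1, round(fps))
    return frames[::step]


def gray_histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalized gray-level histogram of a single frame."""
    hist = [0] * bins
    width = 256 // bins
    for value in frame:
        hist[min(value // width, bins - 1)] += 1
    total = len(frame)
    return [h / total for h in hist] if total else [0.0] * bins


def key_frame_indices(frames: List[Frame], threshold: float = 0.5) -> List[int]:
    """Step 2 (stand-in detector): indices of frames that start a new scene."""
    if not frames:
        return []
    keys = [0]                              # the first frame opens the first scene
    previous = gray_histogram(frames[0])
    for index in range(1, len(frames)):
        current = gray_histogram(frames[index])
        distance = sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:            # large change: a new scene begins
            keys.append(index)
        previous = current
    return keys
```

The frames selected by `key_frame_indices` would then be spatially subsampled and assembled into the animated GIF; that packaging step is left out of the sketch.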



John Smith
Fri Aug 16 11:09:46 EDT 1996