The system provides tools for content-based searching for images and videos using color histograms generated from the visual scenes. Recent research has explored several types of features for content-based visual query. Certain feature sets are suited to particular application domains, for example, the management of satellite or medical imagery. We adopted color histograms in the prototype system in order to utilize a domain-independent approach. The content-based techniques developed here for indexing, searching and navigation can be applied, in principle, to other types of features and application domains.
The color histograms describe the distribution of colors in each image or video. We define the color histograms as discrete 166-bin distributions in a quantized HSV color space [5]. The system computes a color histogram for each image and video scene and uses it to assess similarity to other images and video scenes. The color histograms are also used to automatically assign the images and videos to type classes using Fisher discriminant analysis, as described in Section 5.2.
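As a rough illustration of the histogram computation, the sketch below bins RGB pixels into a 166-bin quantized HSV histogram. The partition used here (18 hues x 3 saturations x 3 values, plus 4 gray bins) is an assumption, since this section does not spell out the quantization defined in [5]. The gray cutoff `GRAY_SAT` and the function names are hypothetical.

```python
import colorsys

# Assumed partition (not stated in this section): 18 hues x 3 saturations
# x 3 values = 162 chromatic bins, plus 4 gray bins, for 166 bins in total.
N_HUE, N_SAT, N_VAL, N_GRAY = 18, 3, 3, 4
N_CHROMATIC = N_HUE * N_SAT * N_VAL
N_BINS = N_CHROMATIC + N_GRAY  # 166
GRAY_SAT = 0.1  # hypothetical cutoff below which a pixel is treated as gray


def bin_index(r, g, b):
    """Map an RGB pixel (components in 0..255) to a bin index in 0..165."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < GRAY_SAT:
        # Gray pixels are binned by value only, after the chromatic bins.
        return N_CHROMATIC + min(int(v * N_GRAY), N_GRAY - 1)
    hi = min(int(h * N_HUE), N_HUE - 1)
    si = min(int((s - GRAY_SAT) / (1.0 - GRAY_SAT) * N_SAT), N_SAT - 1)
    vi = min(int(v * N_VAL), N_VAL - 1)
    return (hi * N_SAT + si) * N_VAL + vi


def color_histogram(pixels):
    """Return a normalized 166-bin histogram for an iterable of RGB tuples."""
    hist = [0.0] * N_BINS
    count = 0
    for r, g, b in pixels:
        hist[bin_index(r, g, b)] += 1.0
        count += 1
    if count == 0:
        raise ValueError("no pixels")
    # Normalize so that the bins sum to one.
    return [c / count for c in hist]
```

For a video scene, the same function could be applied to the pixels of a representative frame; how the frame is chosen is outside the scope of this sketch.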
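The histogram dissimilarity function measures the weighted dissimilarity between histograms. For example, the quadratic distance between query histogram $\mathbf{h}_q$ and target histogram $\mathbf{h}_t$ is given by:
$$d_{q,t} = (\mathbf{h}_q - \mathbf{h}_t)^T \mathbf{A} (\mathbf{h}_q - \mathbf{h}_t) \tag{1}$$
where $\mathbf{A} = [a_{i,j}]$ is a symmetric matrix and $a_{i,j}$ denotes the similarity between colors with indexes $i$ and $j$ such that $a_{i,i} = 1$. Note that the histograms are normalized such that $|\mathbf{h}| = 1$, where $|\mathbf{h}| = \sum_{m=0}^{M-1} h[m]$ and $M$ is the number of histogram bins.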
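The quadratic distance can be sketched in a few lines. The names `h_q`, `h_t` and `A` stand for the query histogram, the target histogram and the color similarity matrix; the 3-bin histograms and the similarity values are invented for illustration only.

```python
def quadratic_distance(h_q, h_t, A):
    """Return (h_q - h_t)^T A (h_q - h_t) for equal-length histograms."""
    diff = [q - t for q, t in zip(h_q, h_t)]
    n = len(diff)
    return sum(diff[i] * A[i][j] * diff[j] for i in range(n) for j in range(n))


# Toy similarity matrix: symmetric, ones on the diagonal, and neighboring
# bins treated as half similar. These values are made up for the example.
A = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]

h_q = [1.0, 0.0, 0.0]     # all mass in bin 0
h_near = [0.0, 1.0, 0.0]  # all mass in a bin similar to bin 0
h_far = [0.0, 0.0, 1.0]   # all mass in a bin dissimilar to bin 0

d_near = quadratic_distance(h_q, h_near, A)
d_far = quadratic_distance(h_q, h_far, A)
```

With these toy values `d_near` is smaller than `d_far`, whereas a plain squared Euclidean distance would rate both targets as equally far from the query. The cross-bin similarity terms are what make perceptually close colors count as close.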
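In order to achieve high efficiency in the color histogram query process, we decompose the color histogram quadratic formula. This provides for efficient computation and indexing. By defining $\mu_q = \mathbf{h}_q^T \mathbf{A} \mathbf{h}_q$, $\mu_t = \mathbf{h}_t^T \mathbf{A} \mathbf{h}_t$ and $\mathbf{r}_t = \mathbf{A} \mathbf{h}_t$, the color histogram quadratic distance is given as
$$d_{q,t} = \mu_q + \mu_t - 2 \mathbf{h}_q^T \mathbf{r}_t \tag{2}$$
By partitioning vector $\mathbf{r}_t$ into elements $r_t[m]$'s, the distance function can be approximated to arbitrary precision by setting the threshold $\tau$ in
$$d_{q,t} - \mu_q \approx \mu_t - 2 \sum_{m \,:\, h_q[m] \ge \tau} h_q[m] \, r_t[m] \tag{3}$$
That is, any query for the most similar color histogram to $\mathbf{h}_q$ may be easily processed by storing and indexing individually $\mu_t$ and $r_t[m]$'s, where $m \in \{0, \ldots, M-1\}$ indexes the $M$ histogram bins. Notice also that $\mu_q$ is a constant of the query. The closest color histogram $\mathbf{h}_t$ is given as the one that minimizes $d_{q,t} - \mu_q$. By using the efficient computation described in Eq. 3, we are able to greatly reduce the query processing time, as demonstrated in Section 6.3.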
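A minimal sketch of the decomposed query follows. It splits the work into an index-time step that stores one scalar and one vector per target histogram, and a query-time step that only visits query bins at or above a threshold. The names `precompute`, `score`, `closest` and `tau`, the dictionary-based index, and the toy data are assumptions made for this example and do not describe the system's actual index structure.

```python
def mat_vec(A, h):
    """Multiply matrix A (list of rows) by vector h."""
    return [sum(a * x for a, x in zip(row, h)) for row in A]


def precompute(h_t, A):
    """Index-time step: return (mu_t, r_t) with r_t = A h_t, mu_t = h_t^T r_t."""
    r_t = mat_vec(A, h_t)
    mu_t = sum(x * r for x, r in zip(h_t, r_t))
    return mu_t, r_t


def score(h_q, mu_t, r_t, tau=0.0):
    """Query-time step: mu_t - 2 * sum of h_q[m] * r_t[m] over h_q[m] >= tau."""
    return mu_t - 2.0 * sum(q * r for q, r in zip(h_q, r_t) if q >= tau)


def closest(h_q, index, tau=0.0):
    """Return the key of the indexed target with the smallest score."""
    return min(index, key=lambda name: score(h_q, *index[name], tau=tau))


# Toy data, invented for illustration.
A = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
targets = {
    "a": [0.0, 1.0, 0.0],
    "b": [0.0, 0.0, 1.0],
    "c": [0.6, 0.4, 0.0],
}
index = {name: precompute(h, A) for name, h in targets.items()}
h_q = [0.7, 0.2, 0.1]

best_exact = closest(h_q, index)             # tau = 0 keeps every bin
best_approx = closest(h_q, index, tau=0.15)  # skips the smallest query bin
```

With `tau = 0.0` every bin of a non-negative query histogram is kept, so the score equals the quadratic distance minus the query constant. Raising `tau` trades accuracy for fewer multiplications per target.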
By training on samples of the color histograms of images and videos, we developed a process of automated type assessment using Fisher discriminant analysis. Fisher discriminant analysis constructs a series of uncorrelated linear weightings of the color histograms that provide for maximum separation between training classes. In particular, the linear weightings are derived from the eigenvectors of the matrix given by the ratio of the between-class to within-class sum-of-square matrices for K classes [10]. New color histograms $\mathbf{h}_n$ are then automatically assigned to the nearest type class $k$ where
$$k = \arg\min_{i} \left\| \mathbf{W}^T (\mathbf{h}_n - \mathbf{m}_i) \right\|$$
and where $\mathbf{W}$ is the matrix of eigenvectors derived from the training classes and color histograms, and $\mathbf{m}_i$ is the mean histogram for class $i$. In Section 6.2, we show that this approach provides excellent automated classification of the images and videos into several broad type classes. We hope to further increase the number of type classes and improve the classification performance by incorporating other visual features into the process.
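The class assignment step can be sketched as a nearest-mean rule in the projected space. The sketch assumes the eigenvector matrix (here `W`) and the class mean histograms have already been obtained from training; it does not perform the eigen-decomposition itself. The class labels, the 3-bin histograms and the single discriminant direction are invented for the example.

```python
import math


def project(W, h):
    """Compute W^T h, where W is given as a list of rows (bins x discriminants)."""
    n_out = len(W[0])
    return [sum(W[m][d] * h[m] for m in range(len(h))) for d in range(n_out)]


def assign_class(h_n, W, class_means):
    """Return the label whose mean histogram is nearest to h_n after projection."""
    def dist(label):
        diff = [x - m for x, m in zip(h_n, class_means[label])]
        return math.sqrt(sum(v * v for v in project(W, diff)))
    return min(class_means, key=dist)


# Toy setup: one discriminant direction contrasting bin 0 with bin 2.
W = [[1.0], [0.0], [-1.0]]
class_means = {
    "photo": [0.6, 0.3, 0.1],    # hypothetical class label
    "graphic": [0.1, 0.3, 0.6],  # hypothetical class label
}

label = assign_class([0.5, 0.4, 0.1], W, class_means)
```

In this toy case the new histogram lies at projected distance 0.1 from the first mean and 0.9 from the second, so it is assigned to the first class.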
The user can best determine from the results of a query which images and videos are relevant and which are not. The system can use this information to reformulate the query to better retrieve the images and videos the user desires [6]. Using the color histograms, relevance feedback is accomplished as follows: let $I_r$ = {relevant images/videos} and $I_n$ = {non-relevant images/videos} as determined by the user. The new query vector $\mathbf{h}_q^{k+1}$ at round $k+1$ is generated by
$$\mathbf{h}_q^{k+1} = \left\| \alpha \mathbf{h}_q^{k} + \beta \sum_{i \in I_r} \mathbf{h}_i - \gamma \sum_{j \in I_n} \mathbf{h}_j \right\|$$
where $\|\cdot\|$ indicates normalization. The new images and videos are retrieved using $\mathbf{h}_q^{k+1}$ and the distance metric in Eq. 3. One formulation of relevance feedback assigns the values $\alpha = 0$, and $\beta = \gamma = 1$, which weights the positive and negative examples equally. The process of selecting the example images for content-based relevance feedback searching is illustrated in Figure 8(a). A simpler form of relevance feedback allows the user to select only one positive example in order to iterate the query process. In this case, $\alpha = 0$, $\beta = 1$, $\gamma = 0$ and $I_r = \{i\}$ gives the new query vector directly from the selected image/video's color histogram, $\mathbf{h}_i$ as follows,
$$\mathbf{h}_q^{k+1} = \mathbf{h}_i$$
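The relevance feedback update can be sketched as a weighted combination of the previous query, the positive examples and the negative examples, followed by normalization. The weight names `alpha`, `beta` and `gamma` are labels chosen for this sketch. Clipping negative bins to zero before normalizing is an assumption added here so that the result remains a valid histogram; the text does not say how negative values are handled.

```python
def normalize(h):
    """Scale a non-negative vector so that its entries sum to one."""
    total = sum(h)
    if total <= 0:
        raise ValueError("histogram has no positive mass")
    return [x / total for x in h]


def feedback_query(h_q, relevant, non_relevant, alpha, beta, gamma):
    """Build the next query histogram from the user's relevance judgments."""
    raw = [
        alpha * h_q[m]
        + beta * sum(h[m] for h in relevant)
        - gamma * sum(h[m] for h in non_relevant)
        for m in range(len(h_q))
    ]
    # Assumption: negative bins are clipped to zero before normalization.
    return normalize([max(0.0, x) for x in raw])


h_prev = [0.2, 0.3, 0.5]
liked = [0.5, 0.5, 0.0]
disliked = [0.0, 0.5, 0.5]

# Single positive example: the new query is the selected histogram itself.
q_single = feedback_query(h_prev, [liked], [], alpha=0.0, beta=1.0, gamma=0.0)

# Positive and negative examples weighted equally.
q_both = feedback_query(h_prev, [liked], [disliked], alpha=0.0, beta=1.0, gamma=1.0)
```

In the second call the color shared by the liked and disliked examples cancels out, leaving only the color that is unique to the liked example.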
The system also provides a tool for the user to directly manipulate the image and video color histograms to formulate the search. Using the histogram manipulation tool, illustrated in Figure 8(b), the user may select one of the images or videos from the results and display its histogram. The user can then modify the histogram by adding or removing colors. The modified histogram is then used to conduct the next search. The new query histogram $\mathbf{h}_q^{k+1}$ is generated from a selected histogram $\mathbf{h}_s$ by adding or removing colors, which are denoted in the modifications histogram $\mathbf{h}_m$:
$$\mathbf{h}_q^{k+1} = \left\| \mathbf{h}_s + \mathbf{h}_m \right\|$$
Figure 8: (a) Relevance feedback search allows the user to select both positive and negative examples; (b) histogram manipulation allows the user to add and remove colors and adjust the color distribution for the next query.
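The histogram manipulation step can be sketched as adding a signed modifications vector to the selected histogram and renormalizing. The function name, the clipping of negative bins and the toy values are assumptions for illustration.

```python
def apply_modifications(h_s, h_m):
    """Add signed modifications h_m to histogram h_s and renormalize."""
    # Assumption: removing more of a color than is present clips the bin to zero.
    raw = [max(0.0, s + m) for s, m in zip(h_s, h_m)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("modified histogram has no positive mass")
    return [x / total for x in raw]


h_s = [0.5, 0.5, 0.0]   # selected histogram
h_m = [0.0, -0.5, 0.5]  # remove the color in bin 1, add the color in bin 2
h_next = apply_modifications(h_s, h_m)
```

The resulting histogram can then be used as the query for the next search in the same way as any other query histogram.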