Unbiased Metric Learning:
On the Utilization of Multiple Datasets and Web Images for Softening Bias
Chen Fang, Ye Xu and Daniel N. Rockmore
Dartmouth College


Many standard computer vision datasets exhibit biases arising from a variety of sources, including illumination conditions, imaging systems, and the preferences of dataset collectors. Biases like these can have downstream effects on the use of vision datasets in the construction of generalizable techniques, especially when the goal is a classification system capable of generalizing to unseen and novel datasets. In this work we propose Unbiased Metric Learning (UML), a metric learning approach, to achieve this goal. UML operates in the following two steps: (1) By varying hyperparameters, it learns a set of less biased candidate distance metrics on training examples from multiple biased datasets. The key idea is to learn a neighborhood for each example that consists not only of examples of the same category from the same dataset, but also of those from other datasets. The learning framework is based on structural SVM. (2) We perform model validation on a set of weakly-labeled web images retrieved by issuing class labels as keywords to a search engine. The metric with the best validation performance is selected. Although the web images sometimes have noisy labels, they tend to be less biased, which makes them suitable as the validation set in our task. Cross-dataset image classification experiments are carried out. Results show significant performance improvement on four well-known computer vision datasets.
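The selection in step (2) can be illustrated with a minimal sketch. This is not the released UML code: it assumes the candidate metrics are already available as positive semidefinite matrices, it uses a plain k-nearest-neighbor classifier as a stand-in for validation (the abstract does not say which classifier is used), and every function name and the toy data are hypothetical. Step (1), the structural SVM learning, is not shown.

```python
from collections import Counter

def mahalanobis_sq(x, y, M):
    """Squared distance (x - y)^T M (x - y) for a metric matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))

def knn_predict(M, X_train, y_train, x, k=3):
    """Majority label among the k training examples nearest to x under M."""
    order = sorted(range(len(X_train)),
                   key=lambda i: mahalanobis_sq(x, X_train[i], M))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def validation_accuracy(M, X_train, y_train, X_val, y_val, k=3):
    """Fraction of validation examples classified correctly under M."""
    hits = sum(knn_predict(M, X_train, y_train, x, k) == y
               for x, y in zip(X_val, y_val))
    return hits / len(X_val)

def select_metric(candidates, X_train, y_train, X_web, y_web, k=3):
    """Return the index of the candidate with the best accuracy on the
    weakly-labeled web images, plus all validation scores."""
    scores = [validation_accuracy(M, X_train, y_train, X_web, y_web, k)
              for M in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Toy data (hypothetical): feature 0 carries the class, feature 1 mimics
# a dataset-specific bias that happens to correlate with the class in training.
X_train = [[0, 0], [0, 1], [0, 2], [1, 10], [1, 11], [1, 12]]
y_train = [0, 0, 0, 1, 1, 1]
X_web = [[0, 11], [1, 1]]   # web images do not share the training bias
y_web = [0, 1]

biased_metric = [[0, 0], [0, 1]]      # measures only the bias feature
less_biased_metric = [[1, 0], [0, 0]]  # measures only the class feature

best, scores = select_metric([biased_metric, less_biased_metric],
                             X_train, y_train, X_web, y_web)
```

In this toy setup the metric that relies on the bias feature fails on the web images, so the validation step picks the other candidate. This mirrors the abstract's argument that less biased web images are a suitable validation set.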






  author    = {Chen Fang and Ye Xu and Daniel N. Rockmore},
  title     = {Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias},
  booktitle = {International Conference on Computer Vision},
  year      = {2013}


Bug reports and questions
Send an email to Chen Fang: chenfang_[at]_cs_[dot]_dartmouth_[dot]_edu


This material is based upon work partly supported by AFOSR Award FA9550-11-1-0166 and the Neukom Institute for Computational Science. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.