The goal of this challenge is to advance the learning of knowledge and representations from web data. Web data contains not only a huge number of images but also rich meta information about those images, which can be exploited to learn good representations and models. We organize two tasks to evaluate the learned knowledge and representations: (1) the WebVision Image Classification Task, and (2) the Pascal VOC Transfer Learning Task. The second task builds on the first. Researchers can participate in the first task only, or in both tasks.
News: A $10,000 cash award will be given to the winners of the challenge!
The WebVision dataset is composed of training, validation, and test sets. The training set is downloaded from the web without any human annotation. The validation and test sets are human annotated; the labels of the validation data are provided, but the labels of the test data are withheld. To imitate the setting of learning from web data, participants are required to train their models solely on the training set and submit classification results on the test set. The validation set may only be used to evaluate algorithms during development (see details in the Honor Code). Each submission must provide, for each image, a list of 5 labels in descending order of confidence. Recognition accuracy is evaluated based on the label that best matches the ground truth label for the image. Specifically, an algorithm produces a label list \(c_i\), \(i=1,...,5\) for each image, and the ground truth labels of the image are \(y_j\), \( j = 1,..., n \), with n class labels. The error of this prediction is defined as: $$E = \frac{1}{n} \sum_{j=1}^n \min_{i} d(c_i, y_j).$$ Here \(d(c_i,y_j)\) is 0 if \(c_i=y_j\) and 1 otherwise. The final error of an algorithm is the average of this per-image error across all test images. For this version of the challenge, there is only one ground truth label for each image (i.e., \(n=1\)), so the error of an image is 0 if its ground truth label appears anywhere in the 5 submitted labels and 1 otherwise.
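The error measure above can be sketched in a few lines of Python. This is only an illustration of the formula, not the official evaluation code: the function names, the dictionary-based input layout, and the choice to truncate longer lists to 5 labels are assumptions made for the example.

```python
from typing import Dict, Sequence

MAX_LABELS = 5  # each submission lists 5 labels per image


def image_error(predicted: Sequence[str], truth: Sequence[str]) -> float:
    """Per-image error E = (1/n) * sum_j min_i d(c_i, y_j)."""
    if not truth:
        raise ValueError("an image needs at least one ground truth label")
    # Assumption for this sketch: labels beyond the first 5 are ignored.
    candidates = set(predicted[:MAX_LABELS])
    # min_i d(c_i, y_j) is 0 when y_j is among the candidates, else 1.
    return sum(0 if y in candidates else 1 for y in truth) / len(truth)


def challenge_error(
    predictions: Dict[str, Sequence[str]],
    ground_truth: Dict[str, Sequence[str]],
) -> float:
    """Average of the per-image error over all images with ground truth."""
    total = sum(
        image_error(predictions[image_id], labels)
        for image_id, labels in ground_truth.items()
    )
    return total / len(ground_truth)


# Hypothetical image ids and class names, for illustration only.
predictions = {
    "img1": ["a", "b", "c", "d", "e"],
    "img2": ["f", "g", "h", "i", "j"],
}
ground_truth = {"img1": ["c"], "img2": ["z"]}
print(challenge_error(predictions, ground_truth))  # 0.5
```

With a single ground truth label per image (\(n=1\)), this reduces to the familiar top-5 error: the fraction of images whose label is missing from the 5 submitted labels.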
Click here to participate in the WebVision Image Classification Track
This task is designed to verify the knowledge and representations learned from the WebVision training set on a new task. Hence, participants are required to submit results to the first task and to transfer only models learned in the first task. We choose the image classification task of Pascal VOC 2012 [link to VOC 2012 page] to test transfer learning performance. Participants may use different ways of transferring the knowledge learned in the first task to perform image classification on Pascal VOC 2012, for example, treating the learned models as feature extractors and training an SVM classifier on the extracted features. Note that the model used in this transfer learning task has to be submitted to the WebVision Image Classification task for evaluation. The evaluation protocol strictly follows that of previous Pascal VOC challenges. Participants are required to submit results in the Pascal VOC format to our server, and we perform the evaluation by submitting these results to the Pascal VOC evaluation server.
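As a rough illustration of the feature-extractor route mentioned above, the sketch below trains one binary linear SVM per class on fixed feature vectors. Everything concrete in it is an assumption made for the example: the 2-dimensional feature vectors stand in for features extracted by a model trained on WebVision, the class names and hyperparameters are placeholders, and the simple subgradient-descent trainer is used only to keep the example self-contained. In practice an established SVM library would be the more likely choice.

```python
from typing import Dict, List, Sequence, Set, Tuple

Model = Tuple[List[float], float]  # (weight vector, bias)


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],  # +1 or -1 per image
    epochs: int = 200,
    lr: float = 0.1,
    reg: float = 0.01,
) -> Model:
    """Subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (dot(w, x) + b)
            # Gradient step for the regularizer (shrinks the weights).
            w = [wi - lr * reg * wi for wi in w]
            # Hinge loss is active only when the margin is below 1.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def train_per_class(
    features: Sequence[Sequence[float]],
    image_labels: Sequence[Set[str]],
    classes: Sequence[str],
) -> Dict[str, Model]:
    """One binary classifier per class, since an image may show several classes."""
    models = {}
    for name in classes:
        targets = [1 if name in present else -1 for present in image_labels]
        models[name] = train_linear_svm(features, targets)
    return models


def class_scores(models: Dict[str, Model], x: Sequence[float]) -> Dict[str, float]:
    """Confidence per class: the signed decision value of that class's SVM."""
    return {name: dot(w, x) + b for name, (w, b) in models.items()}


# Placeholder features; real ones would come from the frozen WebVision model.
train_features = [[2.0, 0.1], [1.5, 0.3], [0.2, 1.8], [0.1, 2.2]]
train_labels = [{"cat"}, {"cat"}, {"dog"}, {"dog"}]
models = train_per_class(train_features, train_labels, ["cat", "dog"])
scores = class_scores(models, [1.8, 0.2])
```

The per-class decision values play the role of the confidence scores that would be written out in the Pascal VOC results format.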
The PASCAL VOC Transfer Learning Evaluation Server will be released soon.