Frequently Asked Questions
- Can I use the ImageNet images or the ImageNet pretrained models?
No. The main goal of the WebVision challenge is to push the envelope of learning visual representations without human annotations, so the use of human-annotated data is strictly prohibited (text data is an exception). Therefore, ImageNet images and ImageNet-pretrained models may not be used in any form.
- Can I use external images without human annotations?
No. For fairness, the challenge is restricted to the WebVision training images only. You are not allowed to use other web image datasets such as YFCC100M, nor to crawl web images yourself.
- Can I use the text data (tags, description, caption) in the WebVision dataset?
Yes, and we encourage you to do so. It has been shown in the literature that such textual information can provide useful supervision for training models.
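As a concrete illustration of text metadata as weak supervision, the sketch below flags images whose tags or description mention the target concept, which could be used to filter noisy web labels. This is a toy example, not a prescribed method; the metadata layout (`tags`, `description` keys) is hypothetical.

```python
def matches_concept(metadata, concept):
    """Return True if the concept word appears in the image's tags or description.

    `metadata` is a hypothetical dict with optional "tags" (list of str)
    and "description" (str) fields, standing in for WebVision metadata.
    """
    text = " ".join(metadata.get("tags", []) + [metadata.get("description", "")])
    return concept.lower() in text.lower()

# Example: keep only samples whose text mentions the concept "cat".
samples = [
    {"tags": ["tabby", "cat", "pet"], "description": "my cat sleeping"},
    {"tags": ["car", "road"], "description": "highway traffic"},
]
kept = [s for s in samples if matches_concept(s, "cat")]
print(len(kept))  # → 1
```

In practice one would combine such signals with the model's own predictions rather than rely on exact string matching alone.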
- Can I use external text data, or models pretrained with external text data, with or without human annotation?
Yes, and we also encourage you to do so. This does not conflict with our goal of learning visual representations without human annotations. Therefore, WordNet, Knowledge Graph, etc. can be used. Models trained on external text data, such as Word2Vec or BERT, are also allowed. Note that the text data or models must be publicly available, and you should explicitly state in your final submission which text datasets/models are used.
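One common way such pretrained text models help is by relating concept names through embedding similarity. The sketch below uses tiny hand-made 3-d vectors as stand-ins for real pretrained embeddings (e.g. Word2Vec); the vectors and vocabulary are invented for illustration.

```python
import math

# Hypothetical 3-d word vectors standing in for pretrained embeddings;
# in practice these would be loaded from a public model such as Word2Vec.
vectors = {
    "cat":    [0.90, 0.10, 0.00],
    "kitten": [0.85, 0.20, 0.05],
    "car":    [0.00, 0.90, 0.40],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Semantically related concepts score higher than unrelated ones.
print(cosine(vectors["cat"], vectors["kitten"]) > cosine(vectors["cat"], vectors["car"]))  # → True
```

Such similarities could, for example, group visually or semantically related WebVision concepts when cleaning noisy labels.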
- Can I crawl text data according to WebVision concepts by myself, and use it as training data?
Yes. There is no restriction on non-visual data, except that the data must be publicly available so that others can reproduce the results. If you crawl text data yourself, please state this clearly in your submission and make the data publicly available before the final submission deadline. A URL should be provided in the method description part of your submission.
If you have other questions, please drop an email to webvisionworkshop AT gmail.com