WebVision: Visual Understanding by Learning from Web Data

The 4th Workshop on

Visual Understanding by Learning from Web Data 2020

June 15th, 2020

In Conjunction with CVPR 2020, Seattle, Washington, U.S.

WebVision 2020 VIRTUAL

Winner Presentation - Image Track : Winner id: #2 (09:40 - 09:49)

Team Name: fISHpAM

Team Member: Canxiang Yan, Cheng Niu, Jie Zhou

Team Affiliation: Pattern Recognition Center, WeChat AI, Tencent Inc, China

Method Description (short) : We use pretraining and ensembling to improve performance. Using WordNet, each image is mapped to several word tags (e.g., nouns and adjectives). Base models are then pretrained on these multi-label images using different network architectures.

Talk
Slides

Method Description : We use pretraining and ensembling to improve performance. Using WordNet, each image is mapped to several word tags (e.g., nouns and adjectives). Base models are then pretrained on these multi-label images using different network architectures. In total, there are 43 learned models. For ensembling, we use XGBoost, trained on part of the training set, to exploit the strengths of the learned models. Other methods include large-scale finetuning, hard sampling, and class-balanced sampling.
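The description above does not say how class-balanced sampling was implemented. As a minimal sketch of one common variant, assuming inverse-class-frequency weights (the function name `class_balanced_sample` and the toy labels are hypothetical, not taken from the team's code), each example can be drawn with probability inversely proportional to the size of its class:

```python
import random
from collections import Counter


def class_balanced_sample(labels, num_samples, seed=0):
    """Draw example indices so that every class is equally likely to be picked.

    Assumption: inverse-frequency weighting. The source does not specify
    the exact sampling scheme used by the team.
    """
    counts = Counter(labels)
    # Each example gets weight 1 / (size of its class), so the total
    # weight of every class is the same.
    weights = [1.0 / counts[y] for y in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Toy long-tailed label list: class 0 is nine times more frequent than class 1.
labels = [0] * 90 + [1] * 10
indices = class_balanced_sample(labels, num_samples=10000)
drawn = Counter(labels[i] for i in indices)
```

With these weights, `drawn` should contain roughly equal counts for both classes even though class 0 dominates the raw data. Sampling is with replacement, so rare-class examples are repeated.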

Entry Description :
entry1: ensemble all base models
entry2: ensemble all base models with large-scale finetuning
entry3: ensemble all base models with hard sampling
entry4: ensemble all base models with class-balanced sampling
entry5: ensemble all entries
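The entries above do not state how entry5 combines the other entries. A minimal sketch of one simple option, assuming plain averaging of per-class probabilities (the function `average_ensemble` and the toy numbers are hypothetical and only illustrate the idea):

```python
def average_ensemble(prob_lists):
    """Average per-class probabilities from several entries and pick the top class.

    Assumption: unweighted averaging. The source does not say how the
    entries were actually combined.
    """
    num_entries = len(prob_lists)
    num_classes = len(prob_lists[0])
    mean = [
        sum(probs[c] for probs in prob_lists) / num_entries
        for c in range(num_classes)
    ]
    best = max(range(num_classes), key=lambda c: mean[c])
    return best, mean


# Toy class probabilities for one image from four hypothetical entries.
entry_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.6, 0.1],
    [0.4, 0.4, 0.2],
]
predicted_class, mean_probs = average_ensemble(entry_probs)
```

Averaging needs no extra training data, which makes it a common baseline for a final ensemble; a learned combiner is the usual alternative when held-out data is available.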
