Rank | Team name | Top-5 accuracy (%) | Top-1 accuracy (%)
---|---|---|---
1 | Smart Image | 82.97 | 61.17 |
2 | fISHpAM | 82.01 | 59.76 |
3 | PCI-AI | 79.88 | 57.38 |
4 | AntVision | 77.37 | 53.93 |

**Smart Image**

Team members: Lingxi Xie, Xiaopeng Zhang, Bingcheng Liu, Zhao Yang, Zewei Du, Hang Chen, Longhui Wei, Yaxiong Chi (Huawei Cloud & Huawei 2012 Labs)

Method description: Our work is implemented on the Huawei ModelArts platform [1], which slightly improves accuracy while being much faster in training. On the algorithm side, the main idea is to leverage the area under the margin and knowledge distillation to handle noisy labels, together with an algorithm for learning an ensemble model. The details are as follows:

a. We use several state-of-the-art network architectures, including ResNeXt, ResNeSt, SENet, and SE-ResNeXt.
b. We use the Area Under the Margin (AUM) algorithm [2] and knowledge distillation [3] to handle noisy labels (see the sketch at the end of this entry).
c. A curriculum learning strategy is used to refine the network several times.
d. Training models at larger input resolutions improves performance.
e. During testing, we apply multi-scale and multi-crop augmentation to each test image.
f. We ensemble the different models using several strategies.

[1] What Is ModelArts? https://support.huaweicloud.com/en-us/productdesc-modelarts/modelarts_01_0001.html
[2] Identifying Mislabeled Data Using the Area Under the Margin Ranking. arXiv:2001.10528 (2020).
[3] Learning from Noisy Labels with Distillation. ICCV 2017.

Entry description:
Entry 1: model ensemble with weighted average
Entry 2: model ensemble with different weights
Entry 3: model distillation + model ensemble with weighted average
Entry 4: model distillation + model ensemble with different weights
Entry 5: model ensemble (heuristic algorithm)
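
The AUM statistic in item b is simple to track during training: for each sample, average the margin (assigned-label logit minus the largest other logit) across epochs, and treat samples whose average stays low as likely mislabeled [2]. Below is a minimal PyTorch sketch of that bookkeeping, not the team's actual code; the `AUMTracker` name, the `sample_ids` indexing, and the cleaning policy are our own assumptions.

```python
import torch

class AUMTracker:
    """Accumulates the Area Under the Margin (AUM) per training sample."""

    def __init__(self, num_samples):
        self.margin_sum = torch.zeros(num_samples)
        self.updates = torch.zeros(num_samples)

    @torch.no_grad()
    def update(self, sample_ids, logits, labels):
        # Margin = assigned-label logit minus the largest other logit.
        assigned = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
        masked = logits.clone()
        masked.scatter_(1, labels.unsqueeze(1), float("-inf"))
        margin = (assigned - masked.max(dim=1).values).cpu()
        ids = sample_ids.cpu()
        self.margin_sum[ids] += margin
        self.updates[ids] += 1

    def scores(self):
        # A low average margin over training suggests a mislabeled sample.
        return self.margin_sum / self.updates.clamp(min=1)
```

Calling `update` on every batch and then dropping, or relabeling via a distilled teacher as in [3], the lowest-scoring percentile is one plausible way to combine AUM with distillation.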

**fISHpAM**

Team members: Canxiang Yan, Cheng Niu, Jie Zhou (Pattern Recognition Center, WeChat AI, Tencent Inc., China)

Method description: We use pretraining and ensembling techniques to improve performance. Using WordNet, each image can be mapped to several word tags (e.g., nouns and adjectives). Base models are then pretrained on these multi-label images with different network architectures.
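
One plausible way to realize this image-to-tags mapping is via NLTK's WordNet interface: look up the class word's synset and collect lemma names from it and its hypernyms. The function below is an illustrative sketch under that assumption; `word_tags` and the depth cutoff are our own names, not the team's code.

```python
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

def word_tags(class_name, max_depth=2):
    """Collect lemma names of a noun synset and its hypernyms as word tags."""
    tags = set()
    frontier = wn.synsets(class_name, pos=wn.NOUN)[:1]  # most common sense
    for _ in range(max_depth + 1):
        hypernyms = []
        for syn in frontier:
            tags.update(lemma.name().lower() for lemma in syn.lemmas())
            hypernyms.extend(syn.hypernyms())
        frontier = hypernyms
    return sorted(tags)

# word_tags("goldfish") includes hypernym tags such as "cyprinid"; such
# tag sets can supervise a multi-label (e.g., BCE) pretraining objective.
```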
In total, 43 models are trained. For ensembling, we use the XGBoost tool to exploit the abilities of the learned models on a held-out part of the training set. Other techniques include large-scale finetuning, hard sampling, and class-balanced sampling.
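
The XGBoost step is essentially stacking: treat the base models' class probabilities on the held-out slice as features and fit a gradient-boosted classifier on top. A minimal sketch under that reading, with illustrative names and hyperparameters (in practice one would likely reduce the feature width, e.g., to top-k scores per model):

```python
import numpy as np
import xgboost as xgb

def fit_stacker(model_probs, labels):
    """model_probs: list of (n_samples, n_classes) arrays, one per base model.
    labels: (n_samples,) integer class ids from a held-out training slice."""
    features = np.concatenate(model_probs, axis=1)  # concatenate model outputs
    stacker = xgb.XGBClassifier(n_estimators=200, max_depth=6,
                                tree_method="hist")
    stacker.fit(features, labels)
    return stacker  # predict with stacker.predict_proba(...)
```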

Entry description:
Entry 1: ensemble of all base models
Entry 2: ensemble of all base models with large-scale finetuning
Entry 3: ensemble of all base models with hard sampling
Entry 4: ensemble of all base models with class-balanced sampling
Entry 5: ensemble of all entries

**PCI-AI**

Team members: Zhiwei Wu, Shuwen Sun, Kunmin Li, Rui Zhang, Zhenjie Huang, Yanyi Feng (PCI Technology, https://www.pcitech.com/)

Method description: Our method is based on ResNet and its variants: ResNet101 and ResNet152 [1], ResNeXt101 [2], and ResNeSt101 [3]. Due to limited resources, we use fp16 precision, a subset of the training samples, and fewer training epochs to speed up training.
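
"fp16" here most likely means mixed-precision training; PyTorch's automatic mixed precision (AMP) is one standard way to get that speed-up. The loop below is a generic sketch, not the team's pipeline; `train_amp` and its hyperparameters are illustrative.

```python
import torch
from torch import nn

def train_amp(model, loader, epochs=1, lr=0.1):
    """Minimal mixed-precision (fp16) training loop using PyTorch AMP."""
    model = model.cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    scaler = torch.cuda.amp.GradScaler()  # rescales grads against fp16 underflow
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.cuda(), labels.cuda()
            optimizer.zero_grad()
            with torch.cuda.amp.autocast():  # fp16/fp32 mixed forward pass
                loss = criterion(model(images), labels)
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
```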
We trained 8 models in total. At test time, we use multi-scale, multi-crop, and multi-model fusion.
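
Multi-scale, multi-crop testing usually means averaging the model's probabilities over several resized copies and crops of each test image. A minimal torchvision sketch under that assumption (normalization omitted for brevity; the scales and crop size are illustrative choices):

```python
import torch
from torchvision import transforms

@torch.no_grad()
def tta_predict(model, image, scales=(224, 256, 288), crop=224):
    """Average softmax outputs over multiple scales and ten crops per scale."""
    model.eval()
    probs = []
    for s in scales:
        resized = transforms.Resize(s)(image)            # PIL image in, PIL out
        for patch in transforms.TenCrop(crop)(resized):  # corners + center, flipped
            x = transforms.ToTensor()(patch).unsqueeze(0)
            probs.append(model(x).softmax(dim=1))
    return torch.stack(probs).mean(dim=0)
```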
[1] Kaiming He, Xiangyu Zhang, et al. Deep Residual Learning for Image Recognition. CVPR 2016.
[2] Saining Xie, Ross Girshick, et al. Aggregated Residual Transformations for Deep Neural Networks. CVPR 2017.
[3] Hang Zhang, et al. ResNeSt: Split-Attention Networks. arXiv:2004.08955 (2020).

Entry description:
Entry 1: fusion of the 7 models with the highest validation accuracy
Entry 2: fusion of 7 models with different weights
Entry 3: fusion of all models
Entry 4: fusion of all models with different weights
Entry 5: fusion of randomly selected models
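
The entries above differ mainly in which models are fused and with what weights: uniform weights recover a plain average, while tuned per-model weights give the "different weights" variants. A minimal sketch of such weighted fusion (the names are our own, not the team's code):

```python
import torch

@torch.no_grad()
def fuse(prob_list, weights=None):
    """Weighted average of per-model class probabilities."""
    probs = torch.stack(prob_list)      # (n_models, batch, n_classes)
    if weights is None:                 # uniform weights -> plain average
        weights = torch.ones(len(prob_list))
    weights = weights / weights.sum()
    return (weights.view(-1, 1, 1) * probs).sum(dim=0)
```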