16 October, Montreal, Canada

AIM 2021

Advances in Image Manipulation workshop

in conjunction with ICCV 2021

Join the AIM 2021 workshop online on Zoom for live talks, Q&A, and interaction.

The event starts 16.10.2021 at 8:00 EDT / 12:00 UTC / 20:00 China time.
Check the AIM 2021 schedule.
No registration required.


Call for papers

Image manipulation is a key computer vision task, aiming at the restoration of degraded image content, the filling in of missing information, or the transformation and/or manipulation needed to achieve a desired target (with respect to perceptual quality, content, or the performance of applications working on such images). Recent years have witnessed an increased interest from the vision and graphics communities in these fundamental topics of research. Not only has there been a constantly growing flow of related papers, but substantial progress has also been achieved.

Each step forward eases the use of images by people or computers for the fulfillment of further tasks, as image manipulation serves as an important frontend. Not surprisingly, then, there is an ever-growing range of applications in fields such as surveillance, the automotive industry, electronics, remote sensing, and medical image analysis. The emergence and ubiquitous use of mobile and wearable devices offer another fertile ground for additional applications and faster methods.

This workshop aims to provide an overview of the new trends and advances in those areas. Moreover, it will offer an opportunity for academic and industrial attendees to interact and explore collaborations.

This workshop builds upon the success of the Advances in Image Manipulation (AIM) workshops at ECCV 2020 and ICCV 2019, the Mobile AI (MAI) workshop at CVPR 2021, the Perceptual Image Restoration and Manipulation (PIRM) workshop at ECCV 2018, the Workshop and Challenge on Learned Image Compression (CLIC) editions at CVPR 2018, 2019, 2020, and 2021, and the New Trends in Image Restoration and Enhancement (NTIRE) editions at CVPR 2017, 2018, 2019, 2020, and 2021 and at ACCV 2016. Moreover, it relies on the people associated with the PIRM, CLIC, MAI, AIM, and NTIRE events: organizers, PC members, distinguished speakers, authors of published papers, challenge participants, and winning teams.

Papers addressing topics related to image/video manipulation, restoration and enhancement are invited. The topics include, but are not limited to:

  • Image-to-image translation
  • Video-to-video translation
  • Image/video manipulation
  • Perceptual manipulation
  • Image/video generation and hallucination
  • Image/video quality assessment
  • Image/video semantic segmentation
  • Perceptual enhancement
  • Multimodal translation
  • Depth estimation
  • Image/video inpainting
  • Image/video deblurring
  • Image/video denoising
  • Image/video upsampling and super-resolution
  • Image/video filtering
  • Image/video de-hazing, de-raining, de-snowing, etc.
  • Demosaicing
  • Image/video compression
  • Removal of artifacts, shadows, glare and reflections, etc.
  • Image/video enhancement: brightening, color adjustment, sharpening, etc.
  • Style transfer
  • Hyperspectral imaging
  • Underwater imaging
  • Aerial and satellite imaging
  • Methods robust to changing weather conditions / adverse outdoor conditions
  • Image/video manipulation on mobile devices
  • Image/video restoration and enhancement on mobile devices
  • Studies and applications of the above.

AIM 2021 challenges

Due to time and resource constraints, there are no AIM challenges this year.

Important dates

Event (all deadlines at 5 PM Pacific Time)
Paper submission deadline: August 05, 2021
Paper decision notification: August 12, 2021
Camera-ready deadline: August 16, 2021
Workshop day: October 16, 2021


Instructions and Policies
Format and paper length

A paper submission must be in English, in PDF format, and at most 8 pages (excluding references) in double-column format. The paper format must follow the same guidelines as for all ICCV 2021 submissions.
AIM 2021 and ICCV 2021 author guidelines

Double-blind review policy

The review process is double blind. Authors do not know the names of the chair/reviewers of their papers. Reviewers do not know the names of the authors.

Dual submission policy

Dual submission is allowed with the ICCV 2021 main conference only. If a paper is submitted to both ICCV 2021 and the workshop and is accepted at both, it can be published at only one of the two venues, not both.

Submission site



Accepted and presented papers will be published after the conference in the ICCV Workshops proceedings, together with the ICCV 2021 main conference papers.

Author Kit

The author kit provides a LaTeX2e template for paper submissions. Please refer to the example egpaper_for_review.pdf for detailed formatting instructions.



Organizers

  • Radu Timofte, University of Wurzburg and ETH Zurich
  • Andrey Ignatov, ETH Zurich
  • Luc Van Gool, KU Leuven & ETH Zurich
  • Ming-Hsuan Yang, University of California at Merced & Google
  • Kyoung Mu Lee, Seoul National University
  • Eli Shechtman, Creative Intelligence Lab at Adobe Research
  • Kai Zhang, ETH Zurich
  • Dario Fuoli, ETH Zurich
  • Ming-Yu Liu, NVIDIA Research
  • Andres Romero, ETH Zurich
  • Martin Danelljan, ETH Zurich
  • Egor Ershov, IITP RAS
  • Marko Subasic, University of Zagreb
  • Michael S. Brown, York University

PC Members

  • Codruta O. Ancuti, UPT
  • Cosmin Ancuti, UPT
  • Siavash Bigdeli, CSEM
  • Michael S. Brown, York University
  • Jianrui Cai, Hong Kong Polytechnic University
  • Chia-Ming Cheng, MediaTek
  • Cheng-Ming Chiang, MediaTek
  • Sunghyun Cho, POSTECH
  • Martin Danelljan, ETH Zurich
  • Tali Dekel, Weizmann Institute of Science
  • Chao Dong, SIAT
  • Majed El Helou, EPFL
  • Michael Elad, Technion
  • Raanan Fattal, Hebrew University of Jerusalem
  • Paolo Favaro, University of Bern
  • Graham Finlayson, University of East Anglia
  • Corneliu Florea, University Politehnica of Bucharest
  • Dario Fuoli, ETH Zurich
  • Felix Heide, Princeton University / Algolux
  • Hiroto Honda, Mobility Technologies Co.
  • Zhe Hu, Hikvision Research
  • Andrey Ignatov, ETH Zurich
  • Sing Bing Kang, Zillow Group
  • Samuli Laine, NVIDIA
  • Christian Ledig, VideaHealth
  • Jaerin Lee, Seoul National University
  • Kyoung Mu Lee, Seoul National University
  • Seungyong Lee, POSTECH
  • Victor Lempitsky, Skoltech & Samsung
  • Yawei Li, ETH Zurich
  • Stephen Lin, Microsoft Research
  • Ming-Yu Liu, NVIDIA Research
  • Vladimir Lukin, National Aerospace University
  • Vasile Manta, Technical University of Iasi
  • Zibo Meng, OPPO
  • Yusuke Monno, Tokyo Institute of Technology
  • Seungjun Nah, Seoul National University
  • Hajime Nagahara, Osaka University
  • Vinay P. Namboodiri, IIT Kanpur
  • Evangelos Ntavelis, CSEM & ETH Zurich
  • Federico Perazzi, Facebook
  • Wenqi Ren, Chinese Academy of Sciences
  • Antonio Robles-Kelly, Deakin University
  • Andres Romero, ETH Zurich
  • Aline Roumy, INRIA
  • Samuel Schulter, NEC Labs
  • Nicu Sebe, University of Trento
  • Eli Shechtman, Creative Intelligence Lab at Adobe Research
  • Gregory Slabaugh, Queen Mary University of London
  • Sabine Süsstrunk, EPFL
  • Yu-Wing Tai, HKUST
  • Hugues Talbot, Université Paris Est
  • Masayuki Tanaka, Tokyo Institute of Technology
  • Hao Tang, ETH Zurich
  • Jean-Philippe Tarel, IFSTTAR
  • Ayush Tewari, Max Planck Institute for Informatics
  • Qi Tian, Huawei Cloud & AI
  • Radu Timofte, ETH Zurich and University of Wurzburg
  • George Toderici, Google
  • Luc Van Gool, ETH Zurich & KU Leuven
  • Ting-Chun Wang, NVIDIA
  • Xintao Wang, The Chinese University of Hong Kong
  • Pengxu Wei, Sun Yat-Sen University
  • Ming-Hsuan Yang, University of California at Merced & Google
  • Ren Yang, ETH Zurich
  • Wenjun Zeng, Microsoft Research
  • Kai Zhang, ETH Zurich
  • Richard Zhang, UC Berkeley & Adobe Research
  • Yulun Zhang, Northeastern University
  • Jun-Yan Zhu, Adobe Research & CMU
  • Wangmeng Zuo, Harbin Institute of Technology

Invited Talks

Phillip Isola

MIT

Title: Generative data as a substrate for visual analysis

Abstract: Data sampled from generative models is not quite like regular data -- you can navigate, manipulate, and optimize it through latent space controls. This ability has led to countless impressive graphics applications, but the utility of generative data for vision is less explored. In this talk, I will present two projects on using generative data for vision problems, both of which involve hallucinating counterfactual data to answer the question "what would it have looked like if ...?" In the first project, we explore how these hallucinations can act as data augmentation: alternative imagined views of an image can be ensembled to boost recognition performance. In the second project, we train a GAN to explain a classifier. Samples from the GAN can be manipulated to yield counterfactual explanations of the form "here is how the image would have to change in order for the class prediction to change." This system can surface attributes and biases that underlie deep net decisions, and may have use for scientific and medical discovery as well.

Bio: Phillip Isola is an assistant professor in EECS at MIT studying computer vision & graphics, machine learning, and AI. He completed his Ph.D. in Brain & Cognitive Sciences at MIT, followed by a postdoc at UC Berkeley and a year at OpenAI. His current research focuses on generative modeling, representation learning, embodied AI, and multiagent intelligence.

Chao Dong

Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

Title: Interpreting Super Resolution Networks

Abstract: Although super-resolution (SR) networks have achieved remarkable success in performance, their working mechanisms are still mysterious. Few attempts have been made at interpretability for low-level vision tasks. In this talk, we will try to interpret SR networks in three aspects – pixel, feature, and filter. At the pixel level, we propose a new attribution approach called the local attribution map (LAM), which can detect which input pixels contribute most to an output region. At the feature level, we have successfully discovered the “semantics” in SR networks, called deep degradation representations (DDR). We also reveal the differences in representation semantics between classification and SR networks. At the filter level, we develop a new diagnostic tool – Filter Attribution method based on Integral Gradient (FAIG) – to find the most discriminative filters for degradation removal in blind SR networks. Our findings can not only help us better understand network behaviors, but also provide guidance on designing more efficient architectures.

Bio: Chao Dong is currently an associate professor at the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences. He received his Ph.D. degree from The Chinese University of Hong Kong in 2016. In 2014, he first introduced a deep learning method, SRCNN, into the super-resolution field. This seminal work was chosen as one of the top ten “Most Popular Articles” of TPAMI in 2016. His team has won several championships in international challenges – NTIRE 2018, PIRM 2018, NTIRE 2019, NTIRE 2020, and AIM 2020. He worked at SenseTime from 2016 to 2018 as the team leader of the Super-Resolution Group. His current research focuses on low-level vision problems, such as image/video super-resolution, denoising, and enhancement.

Matthias Nießner

Technical University of Munich

Title: Neural Rendering and Beyond


Bio: Dr. Matthias Nießner is a Professor at the Technical University of Munich, where he leads the Visual Computing Lab. Before that, he was a Visiting Assistant Professor at Stanford University. Prof. Nießner’s research lies at the intersection of computer vision, graphics, and machine learning, where he is particularly interested in cutting-edge techniques for 3D reconstruction, semantic 3D scene understanding, video editing, and AI-driven video synthesis. In total, he has published over 90 academic publications, including 22 papers at the prestigious ACM Transactions on Graphics (SIGGRAPH / SIGGRAPH Asia) journal and over 30 works at the leading vision conferences (CVPR, ECCV, ICCV); several of these works won best paper awards, including at SIGCHI’14, HPG’15, SPG’18, and the SIGGRAPH’16 Emerging Technologies Award for the best Live Demo. Prof. Nießner’s work enjoys wide media coverage, with many articles featured in mainstream media including the New York Times, Wall Street Journal, Spiegel, MIT Technology Review, and many more, and his work has led to several TV appearances, such as on Jimmy Kimmel Live, where Prof. Nießner demonstrated the popular Face2Face technique; Prof. Nießner’s academic YouTube channel currently has over 5 million views. For his work, Prof. Nießner received several awards: he is a TUM-IAS Rudolph Moessbauer Fellow (2017 – ongoing), he won the Google Faculty Award for Machine Perception (2017), the Nvidia Professor Partnership Award (2018), as well as the prestigious ERC Starting Grant 2018, which comes with 1,500,000 Euro in research funding; in 2019, he received the Eurographics Young Researcher Award honoring the best upcoming graphics researcher in Europe. In addition to his academic impact, Prof. Nießner is a co-founder and director of Synthesia Inc., a startup empowering storytellers with cutting-edge AI-driven video synthesis.

Robby T. Tan

Yale-NUS College and ECE, National University of Singapore

Title: Image Decomposition in Image Restoration: Bad Weather, Nighttime, and Shadows

Abstract: To date, many computer vision algorithms are designed under the assumption that the input image or video is clean and not degraded. This assumption mostly does not hold, since significant degradation can frequently occur, particularly in outdoor scenes. Nighttime and low light can cause significant noise, blur, low contrast, low color saturation, glare/floodlight, non-uniform light distributions, multiple light colors, etc. Bad weather such as rain, fog, haze, and snow can impair background scene information significantly. Shadows, both soft and hard, can degrade visual information and thus affect the performance of computer vision algorithms. In this talk, we will focus on the roles of image decomposition in dealing with all these types of degradation. We will show that the underlying problem of many image restoration tasks is the ability to decompose physical components robustly.

Bio: Robby Tan is an associate professor in the Department of Electrical and Computer Engineering, National University of Singapore (NUS). His main research is in computer vision and deep learning, particularly in the domains of low-level vision (bad weather/nighttime, color analysis, physics-based vision, optical flow, etc.), human pose/motion analysis, and applications of deep learning in healthcare. He received his PhD from The University of Tokyo.

Andrey Ignatov

AI Benchmark Project Lead, ETH Zurich

Title: Deep Learning on Smartphones, an In-Depth Dive: Frameworks and SDKs, Hardware Acceleration with NPUs and GPUs, Model Deployment, Performance and Power Consumption Analysis

Abstract: In this tutorial, we will cover all basic concepts, steps and optimizations required for efficient AI inference on mobile devices. First, you will get to know how to run any NN model on your own smartphone in less than 5 minutes. Next, we will review all components needed to convert and run TensorFlow or PyTorch neural networks on Android and iOS smartphones, as well as discuss the key optimizations required for fast ML inference on the edge devices. In the second part of the talk, you will get an overview of the performance of all mobile processors with AI accelerators released in the past five years. We will additionally discuss the power consumption of the latest high-end smartphone SoCs, answering the question of why NPUs and DSPs are so critical for on-device inference.
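As a companion to the deployment workflow the tutorial abstract describes, the sketch below shows the standard public TensorFlow Lite conversion path (Keras model → .tflite flatbuffer → interpreter sanity check). The toy model is a placeholder assumption, not a model from the talk, and this is only one minimal route among those the tutorial covers.

```python
# Minimal sketch: convert a (toy) Keras model to TensorFlow Lite, the usual
# first step before bundling a model into an Android/iOS app.
import numpy as np
import tensorflow as tf

# A tiny stand-in network; in practice this would be any trained model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Convert to a .tflite flatbuffer; Optimize.DEFAULT enables dynamic-range
# weight quantization, one of the key optimizations for edge inference.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Sanity-check the converted model with the TFLite interpreter on the host
# before shipping it to a device.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=np.float32))
interpreter.invoke()
print(interpreter.get_tensor(out["index"]).shape)  # (1, 10)
```

On a phone, the same flatbuffer would be loaded with the TFLite runtime for Android or iOS; delegates can then offload execution to the GPU or NPU, which is the focus of the talk's benchmark results.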

Bio: Andrey Ignatov is the project lead of the AI Benchmark initiative at ETH Zurich, Switzerland, targeted at performance evaluation of mobile, IoT, and desktop hardware. His PhD research was aimed at designing efficient image processing models for smartphone NPUs / DSPs and developing a next-generation deep learning based mobile camera ISP solution. He lectures the Deep Learning for Smartphone Apps course at ETH Zurich and is the main author of the AI Benchmark papers describing the current state of deep learning and AI hardware acceleration on mobile devices. He is a co-founder of AI Witchlabs and a co-organizer of the Mobile AI, NTIRE, and AIM events. His main line of research focuses on image restoration and automatic image quality enhancement, adaptation of AI applications for mobile devices, and benchmarking machine learning hardware.

Jun-Yan Zhu

Carnegie Mellon University

Title: Human-in-the-loop Model Creation

Abstract: The power and promise of deep generative models lie in their ability to synthesize endless realistic, diverse, and novel content with user controls. Unfortunately, the creation of these models demands high-performance computing platforms, large-scale annotated datasets, and sophisticated knowledge of deep learning. These requirements make the process infeasible for many visual artists, content creators, small business entrepreneurs, and everyday users. In this talk, I describe our ongoing efforts in building user interfaces and optimization algorithms for humans to customize and create generative models with minimal user effort. Our methods can quickly change the model weights of GANs and NeRF according to simple user inputs. The resulting models not only match the user instruction but also preserve the original model's sampling diversity and visual quality.

Bio: Jun-Yan Zhu is an Assistant Professor with the School of Computer Science of Carnegie Mellon University. Prior to joining CMU, he was a Research Scientist at Adobe Research and a postdoctoral researcher at MIT CSAIL. He obtained his Ph.D. from UC Berkeley and his B.E. from Tsinghua University. He studies computer vision, computer graphics, computational photography, and machine learning. He is the recipient of the Facebook Fellowship, ACM SIGGRAPH Outstanding Doctoral Dissertation Award, and UC Berkeley EECS David J. Sakrison Memorial Prize for outstanding doctoral research. His co-authored work has received the NVIDIA Pioneer Research Award, SIGGRAPH 2019 Real-time Live Best of Show Award and Audience Choice Award, and The 100 Greatest Innovations of 2019 by Popular Science.

Arun Mallya

NVIDIA Research

Title: GANcraft - an unsupervised 3D neural method for world-to-world translation

Abstract: Advances in 2D image-to-image translation methods, such as SPADE/GauGAN, have enabled users to paint photorealistic images by drawing simple sketches similar to those created in Microsoft Paint. Despite these innovations, creating a realistic 3D scene remains a painstaking task, out of the reach of most people. It requires years of expertise, professional software, a library of digital assets, and a lot of development time. Wouldn’t it be great if we could build a simple 3D world made of blocks representing various materials, feed it to an algorithm, and receive a realistic looking 3D world featuring tall green trees, ice-capped mountains, and the blue sea? This talk will provide an overview of GANcraft, an unsupervised neural rendering framework for generating photorealistic images of large 3D block worlds. GANcraft builds upon prior work in 2D image synthesis and 3D neural rendering to overcome the lack of paired training data between user-created block worlds and the real world, and allow for user control over scene semantics, camera trajectory, and output style.

Bio: Arun Mallya is a Senior Research Scientist at NVIDIA Research. He obtained his Ph.D. from the University of Illinois at Urbana-Champaign in 2018, with a focus on performing multiple tasks efficiently with a single deep network. He holds a B.Tech. in Computer Science and Engineering from the Indian Institute of Technology - Kharagpur (2012), and an MS in Computer Science from the University of Illinois at Urbana-Champaign (2014). His interests are in generative modeling and enabling new applications of deep neural networks.

Stefanos Zafeiriou

Imperial College London

Title: Generating and Reconstructing Digital Humans

Abstract: In the past few years, with the advent of Deep Convolutional Neural Networks (DCNNs) and the availability of visual data, it was shown that it is possible to produce excellent results in very challenging tasks, such as visual object recognition, detection, and tracking. Nevertheless, in certain tasks such as fine-grained object recognition (e.g., face recognition) it is very difficult to collect the amount of data that is needed. In this talk, I will show how, using a special category of Generative Adversarial Networks (GANs), we can generate highly realistic faces and heads and use them for training algorithms such as face and facial expression recognition. Next, I will reverse the problem and demonstrate how a very powerful trained face recognition network can be used to perform very accurate 3D shape and texture reconstruction of faces from a single image. I will further show how to create production-ready human heads, as well as single-shot head-to-head translations using translation networks. Finally, I will touch upon how the generation of 3D human fittings can aid in performing detailed 3D face flow estimation, as well as other tasks such as dense 3D human body/hand and pose estimation, by capitalizing upon intrinsic mesh convolutions.

Bio: Stefanos Zafeiriou is currently a Professor in Machine Learning and Computer Vision with the Department of Computing, Imperial College London, London, U.K., and an EPSRC Early Career Research Fellow. Between 2016 and 2020 he was also a Distinguished Research Fellow with the University of Oulu under the Finnish Distinguished Professor Programme. He was a recipient of the prestigious Junior Research Fellowship from Imperial College London in 2011 and of the President’s Medal for Excellence in Research Supervision in 2016. He has served as Associate Editor and Guest Editor for various journals, including IEEE Transactions on Pattern Analysis and Machine Intelligence, the International Journal of Computer Vision, IEEE Transactions on Affective Computing, Computer Vision and Image Understanding, IEEE Transactions on Cybernetics, and the Image and Vision Computing Journal. He has been a Guest Editor of 8+ journal special issues and has co-organised over 16 workshops/special sessions on specialised computer vision topics in top venues such as CVPR/FG/ICCV/ECCV (including three very successful challenges run at ICCV’13, ICCV’15, and CVPR’17 on facial landmark localisation/tracking). He has co-authored 70+ journal papers, mainly on novel statistical machine learning methodologies applied to computer vision problems such as 2-D/3-D face analysis, deformable object fitting and tracking, shape from shading, and human behaviour analysis, published in the most prestigious journals in his field, such as IEEE T-PAMI and the International Journal of Computer Vision, and many papers in top conferences such as CVPR, ICCV, ECCV, and ICML. His students are frequent recipients of very prestigious and highly competitive fellowships, such as the Google Fellowship (x2), the Intel Fellowship, and the Qualcomm Fellowship (x4). He has more than 12K citations to his work and an h-index of 54. He was the General Chair of BMVC 2017.
He was a co-founder of two startups, Facesoft and Ariel AI, both of which had successful exits.

Oliver Wang

Adobe Research

Title: Correspondences For Video Processing

Abstract: In this talk I will discuss the role of correspondences in video processing. Specifically, I'll cover a few projects related to using 3D reconstruction as a way to find correspondences, targeting restoration applications such as denoising, deblurring, inpainting, view synthesis, and editing. In addition, I will discuss some recent trends of methods that optimize scene reconstructions from video using coordinate-based MLPs.

Bio: Oliver Wang is a Principal Scientist at Adobe Research. He received his PhD in Computer Science in 2010 from the University of California, Santa Cruz, and has worked in a number of research institutions, including HP Labs, Industrial Light and Magic, the Max Planck Institute for Informatics, and 6 years at Disney Research in Zurich. During this time, his work has been published in conferences, integrated into the production process of several mainstream movies, and has appeared in production software. His research broadly spans the areas of image and video processing, computer vision, and machine learning.

Join the AIM 2021 Zoom meeting for live talks, Q&A, and interaction.
No registration required.

All accepted AIM workshop papers also receive an oral presentation.
All accepted AIM workshop papers are published under the book title "2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)" by

Computer Vision Foundation Open Access and IEEE Xplore Digital Library

List of AIM 2021 papers

Papers (PDF, supplementary material) are available at https://openaccess.thecvf.com/ICCV2021_workshops/AIM

High Perceptual Quality Image Denoising with a Posterior Sampling CGAN
Guy Ohayon (Technion)*; Theo J Adrai (Technion); Gregory Vaksman (Technion); Michael Elad (Google); Peyman Milanfar (Google)
[poster, slides][video]
Unsupervised Generative Adversarial Networks with Cross-model Weight Transfer Mechanism for Image-to-image Translation
Xuguang Lai (Xi'an Jiaotong University); Xiuxiu Bai (Xi'an Jiaotong University)*; Yongqiang Hao (Xi'an Jiaotong University)
Rethinking Content and Style: Exploring Bias for Unsupervised Disentanglement
Xuanchi Ren (HKUST); Tao Yang (Xi'an JiaoTong University); Yuwang Wang (Microsoft Research)*; Wenjun Zeng (Microsoft Research)
SwinIR: Image Restoration Using Swin Transformer
Jingyun Liang (ETH Zurich)*; Jiezhang Cao (ETH Zürich); Guolei Sun (ETH Zurich); Kai Zhang (ETH Zurich); Luc Van Gool (ETH Zurich); Radu Timofte (ETH Zurich)
Test-Time Adaptation for Super-Resolution: You Only Need to Overfit on a Few More Images
Mohammad Saeed Rad (École Polytechnique Fédérale de Lausanne)*; Thomas Yu (École Polytechnique Fédérale de Lausanne); Behzad Bozorgtabar (EPFL); Jean-Philippe Thiran (École Polytechnique Fédérale de Lausanne)
Generalized Real-World Super-Resolution through Adversarial Robustness
Angela Castillo (Universidad de los Andes)*; Maria C Escobar (Universidad de los Andes); Juan C Perez (Universidad de los Andes; King Abdullah University of Science and Technology); Andres Romero (ETH Zürich); Radu Timofte (ETH Zurich); Luc Van Gool (ETH Zurich); Pablo Arbelaez (Universidad de los Andes)
Stochastic Image Denoising by Sampling from the Posterior Distribution
Bahjat Kawar (Technion)*; Gregory Vaksman (Technion); Michael Elad (Technion)
Reducing Noise Pixels and Metric Bias in Semantic Inpainting on Segmentation Map
Jianfeng He (Virginia Tech)*; Bei Xiao (American University); Xuchao Zhang (Virginia Tech); Shuo Lei (Virginia Tech); Shuhui Wang (VIPL, ICT, Chinese Academy of Sciences); Chang-Tien Lu (Virginia Tech, USA)
Distilling Reflection Dynamics for Single-Image Reflection Removal
Quanlong Zheng (City University of Hong Kong)*; Xiaotian Qiao (City University of Hong Kong); Ying Cao (City University of Hong Kong); Shi Guo (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China); Rynson W.H. Lau (City University of Hong Kong)
[project][poster, slides, video]
SDWNet: A Straight Dilated Network with Wavelet Transformation for Image Deblurring
Wenbin Zou (Fujian Normal University)*; MingChao Jiang (JOYY.INC); Yunchen Zhang (China Design Group Ltd.Co); Liang Chen (Fujian Normal University); Zhiyong Lu (JOYY.INC); Yi Wu (Fujian Normal University)
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data
Xintao Wang (Tencent)*; Liangbin Xie (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China); Chao Dong (SIAT); Ying Shan (Tencent)
[project][poster, slides, video]
Manipulating Image Style Transformation via Latent-Space SVM
Qiudan Wang (ViaX online)*
SMILE: Semantically-guided Multi-attribute Image and Layout Editing
Andres Romero (ETH Zürich)*; Luc Van Gool (ETH Zurich); Radu Timofte (ETH Zurich)
[project][poster, slides, video]
Contrastive Feature Loss for Image Prediction
Alex Andonian (MIT)*; Taesung Park (UC Berkeley); Bryan Russell (Adobe Research); Phillip Isola (MIT); Jun-Yan Zhu (Carnegie Mellon University); Richard Zhang (Adobe)
Efficient Wavelet Boost Learning-Based Multi-stage Progressive Refinement Network for Underwater Image Enhancement
Fushuo Huo (Chongqing University)*; Bingheng Li (Xidian University); Xuegui Zhu (Chongqing University)
Saliency-Guided Transformer Network combined with Local Embedding for No-Reference Image Quality Assessment
Mengmeng Zhu (University of Electronic Science and Technology of China); Guanqun Hou (Hikvision Research Institute); Xinjia Chen (Hikvision Research Institute); Jiaxing Xie (Hikvision); Haixian Lu (Hikvision Research Institute); Jun Che (Hikvision Research Institute)*
[poster] [video]
Improving Key Human Features for Pose Transfer
Victor-Andrei Ivan (Arnia Software)*; Ionut Cosmin Mistreanu (Arnia); Andrei S Leica (Arnia); Sung-Jun Yoon (LG Electronics); Manri Cheon (LG Electronics); Junwoo Lee (LG electronics); Jinsoo Oh (LG Electronics)
DeepFake MNIST+: A DeepFake Facial Animation Dataset
Jiajun Huang (The University of Sydney)*; Xueyu Wang (The University of Sydney); Bo Du (Wuhan University); Pei Du (Ant Group); Chang Xu (University of Sydney)
[slides, video][project]
Simple and Efficient Unpaired Real-world Super-Resolution using Image Statistics
Kwangjin Yoon (SI Analytics)*
Sparse to Dense Motion Transfer for Face Image Animation
Ruiqi Zhao (Baidu)*; Tianyi Wu (Baidu); Guodong Guo (Baidu)
Graph2Pix: A Graph-Based Image to Image Translation Framework
Dilara Gokay (Technical University of Munich); Enis Simsar (Technical University of Munich); Efehan Atici (Bogazici University); Alper Ahmetoglu (Bogazici University); Atif Emre Yuksel (Bogazici University); Pinar Yanardag (Bogazici University)*
Underwater Image Color Correction Using Ensemble Colorization Network
Arpit Pipara (DA-IICT)*; Urvi Arunbhai Oza (DA-IICT); Srimanta Mandal (Dhirubhai Ambani Institute of Information and Communication Technology)
A System for Fusing Color and Near-Infrared Images in Radiance Domain
Kim C Ng (OPPO US Research Center)*; Jinglin Shen (OPPO Research US); Chiu Man Ho (OPPO)