Looking at People Workshop
International Conference on Computer Vision (ICCV), 2015
∗ Winner of LAP challenge on apparent age estimation
∗ NVIDIA ChaLearn LAP 2015 Best Paper Award
In this paper we tackle the estimation of apparent age in still face images with deep learning. Our convolutional neural networks (CNNs) use the VGG-16 architecture and
are pretrained on ImageNet for image classification. In addition, due to the limited number of apparent age annotated images, we explore the benefit of finetuning over crawled Internet face images with available age. We crawled 0.5 million images of celebrities from IMDb and Wikipedia that we make public on this website. This is the largest public dataset for age prediction to date. We pose the age regression problem as a deep classification problem followed by a softmax expected value refinement and show improvements over direct regression training of CNNs. Our proposed method, Deep EXpectation (DEX) of apparent age, first detects the face in the test image and then extracts the CNN predictions from an ensemble of 20 networks on the cropped face. The CNNs of DEX were finetuned on the crawled images and then on the provided images with apparent age annotations. DEX does not use explicit facial landmarks. Our DEX is the winner (1st place) of the ChaLearn LAP 2015 challenge on apparent age estimation with more than 115 registered teams, significantly outperforming the human reference.
Our models for age estimation are used on our website howhot.io, which went viral and was covered extensively in social media and the press (TechCrunch, Hacker News, Reddit #1, Evening Standard, Spiegel).
International Journal of Computer Vision (IJCV), 2016
In this paper we propose a deep learning solution to age estimation from a single face image without the use of facial landmarks and introduce the IMDB-WIKI dataset, the largest public dataset of face images with age and gender labels. If the real age estimation research spans over decades, the study of apparent age estimation or the age as perceived by other humans from a face image is a recent endeavor. We tackle both tasks with our convolutional neural networks (CNNs) of VGG-16 architecture which are pre-trained on ImageNet for image classification. We pose the age estimation problem as a deep classification problem followed by a softmax expected value refinement. The key factors of our solution are: deep learned models from large data, robust face alignment, and expected value formulation for age regression. We validate our methods on standard benchmarks and achieve state-of-the-art results for both real and apparent age estimation.
To the best of our knowledge this is the largest publicly available dataset of face images with gender and age labels for training. We provide pretrained models for both age and gender prediction.
Since publicly available face image datasets are often of small to medium size, rarely exceed tens of thousands of images, and often lack age information, we decided to collect a large dataset of celebrities. For this purpose, we took the list of the 100,000 most popular actors on the IMDb website and automatically crawled the date of birth, name, gender and all images related to each person from their profiles.
Additionally we crawled all profile images from pages of people from Wikipedia with the same meta information.
We removed the images without timestamp (the date when the photo was taken).
Assuming that the images with single faces are likely to show the actor and that the timestamp and date of birth are correct, we were able to assign to each such image the biological (real) age. Of course, we cannot vouch for the accuracy of the assigned age information. Besides wrong timestamps, many images are stills from movies, which can have extended production times. In total we obtained 460,723 face images from 20,284 celebrities from IMDb and 62,328 from Wikipedia, thus 523,051 in total.
As some of the images (especially from IMDb) contain several people, we only use the photos where the second strongest face detection is below a threshold. For the network to be equally discriminative for all ages, we equalize the age distribution for training. For more details please see the paper.
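The two filtering steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record keys `age` and `second_face_score`, the threshold value, and the per-age cap are all hypothetical, and capping each age at a fixed number of samples is just one simple way to flatten an age distribution.

```python
import random
from collections import defaultdict


def keep_single_face(records, max_second_score=1.0):
    """Keep records whose second-strongest detection is missing or below the threshold.

    `second_face_score` and the default threshold are assumptions for illustration.
    """
    kept = []
    for rec in records:
        second = rec.get("second_face_score")
        if second is None or second < max_second_score:
            kept.append(rec)
    return kept


def equalize_by_age(records, cap_per_age, seed=0):
    """Randomly keep at most `cap_per_age` records for each age value."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    by_age = defaultdict(list)
    for rec in records:
        by_age[rec["age"]].append(rec)
    balanced = []
    for age in sorted(by_age):
        group = by_age[age]
        if len(group) > cap_per_age:
            group = rng.sample(group, cap_per_age)
        balanced.extend(group)
    return balanced
```

With a cap of 2, for example, an age that has five usable images contributes two of them, while an age with a single image contributes that one image unchanged.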
For both the IMDb and Wikipedia images we provide a separate .mat file containing all the meta information, which can be loaded with Matlab. The format is as follows:
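To crop the face in Matlab, run: img(face_location(2):face_location(4),face_location(1):face_location(3),:)
The age can be computed from the date of birth and the year the photo was taken, assuming the photo was taken in the middle of the year: [age,~]=datevec(datenum(wiki.photo_taken,7,1)-wiki.dob);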
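For readers working in Python rather than Matlab, the age computation above can be approximated as sketched below. It assumes that `dob` is a Matlab serial date number and `photo_taken` is a year, as the Matlab one-liner suggests, and it places the photo on 1 July of that year. The function names are hypothetical, and because this version uses calendar arithmetic instead of `datevec` on a day difference, results may differ by one year for birthdays very close to 1 July.

```python
from datetime import date

# Matlab's datenum counts days from year 0000, Python's ordinal from 0001-01-01,
# so the two differ by 366 days (datenum(1,1,1) == 367, date(1,1,1).toordinal() == 1).
MATLAB_TO_PYTHON_ORDINAL = 366


def matlab_datenum_to_date(datenum):
    """Convert a Matlab serial date number to a Python date, or None if out of range."""
    ordinal = int(datenum) - MATLAB_TO_PYTHON_ORDINAL
    if ordinal < 1:
        return None  # not representable as a Python date; treat as invalid
    return date.fromordinal(ordinal)


def age_at_photo(dob_datenum, photo_taken_year):
    """Age in whole years on 1 July of the year the photo was taken."""
    birth = matlab_datenum_to_date(dob_datenum)
    if birth is None:
        return None
    age = photo_taken_year - birth.year
    if (birth.month, birth.day) > (7, 1):
        age -= 1  # birthday falls after the assumed photo date
    return age
```

Entries for which the function returns `None` or an implausible age are best dropped, since the assigned ages are not guaranteed to be accurate.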
Here you can download the raw images and the metadata. We also provide a version with the cropped faces (with 40% margin). This version is much smaller.
We noticed that some of the images from Wikipedia are broken. We plan to fix this issue in the future. For now please just ignore those images.
We were informed that some standard tools such as WinZip corrupt the files in the .tar archives. We invite you to use 7-zip to obtain uncorrupted files (especially for the wiki.mat file in wiki_crop.tar).
This code allows the user to extract the face with a margin. For our pretrained models we added a margin of 40% of the face's width and height on all four sides (the default setting). At the top of the script there is sample code for extracting all face images with a margin.
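The margin computation can be illustrated with a short Python sketch. This is not the provided script: the function name is hypothetical, coordinates are assumed to be 0-based and inclusive, and the enlarged box is simply clamped to the image bounds, whereas the original code may handle borders differently (for example by padding).

```python
def expand_box(x1, y1, x2, y2, img_w, img_h, margin=0.4):
    """Enlarge a face box by `margin` times its width/height on every side.

    Returns the enlarged box clamped to the image (clamping is an assumption).
    """
    w = x2 - x1
    h = y2 - y1
    dx = margin * w
    dy = margin * h
    nx1 = max(0, int(round(x1 - dx)))
    ny1 = max(0, int(round(y1 - dy)))
    nx2 = min(img_w - 1, int(round(x2 + dx)))
    ny2 = min(img_h - 1, int(round(y2 + dy)))
    return nx1, ny1, nx2, ny2
```

For a 100-pixel-wide face, a 40% margin adds 40 pixels on the left and 40 on the right, so the crop is 1.8 times as wide as the detection wherever the image is large enough.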
In this section we provide pretrained models for Caffe. For all models we used a 40% margin around the face obtained from the Mathias et al. face detector. For age estimation the output layer has 101 neurons (0-100 years, one for each year). To obtain the predicted age, you need to take the expected value over the softmax-normalized output probabilities. For gender prediction the output layer has 2 neurons (0 for female, 1 for male).
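As a minimal sketch of the expected-value step, assuming the network returns 101 raw scores where index i corresponds to age i, the predicted age could be computed as below. The function names are hypothetical; if the final layer of the deployed network already outputs probabilities, the softmax step should be skipped.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_age(scores):
    """Expected value of the age under the softmax distribution (index = age in years)."""
    probs = softmax(scores)
    return sum(age * p for age, p in enumerate(probs))
```

Unlike taking the most probable class, the expected value can fall between classes: if the network is equally confident in ages 20 and 40, the prediction is 30.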
Note: we used the ImageNet mean when training the models.
This model was trained on the IMDB-WIKI dataset. The age distribution is equalized, and the model was used as pretraining for the ChaLearn apparent age estimation challenge.
This model is a fine-tuned version of the previous model. The model was fine-tuned on the dataset of the ChaLearn apparent age estimation challenge. An ensemble of these models led to 1st place at the challenge (115 teams).
This model predicts the gender of a person.
Please add a reference if you are using the dataset or the pretrained models.
@article{Rothe-IJCV-2018,
  author = {Rasmus Rothe and Radu Timofte and Luc Van Gool},
  title = {Deep expectation of real and apparent age from a single image without facial landmarks},
  journal = {International Journal of Computer Vision},
  volume = {126},
  number = {2-4},
  pages = {144--157},
  year = {2018},
  publisher = {Springer}
}
@InProceedings{Rothe-ICCVW-2015,
  author = {Rasmus Rothe and Radu Timofte and Luc Van Gool},
  title = {DEX: Deep EXpectation of apparent age from a single image},
  booktitle = {IEEE International Conference on Computer Vision Workshops (ICCVW)},
  year = {2015},
  month = {December}
}
Please note that this dataset is made available for academic research purposes only. All the images are collected from the Internet, and the copyright belongs to the original owners. If any of the images belongs to you and you would like it removed, please inform us and we will remove it from our dataset immediately.