IMDB-WIKI – 500k+ face images with age and gender labels

Rasmus Rothe, Radu Timofte, Luc Van Gool


DEX: Deep EXpectation of apparent age from a single image

Looking at People Workshop
International Conference on Computer Vision (ICCV), 2015
∗ Winner of LAP challenge on apparent age estimation
∗ NVIDIA ChaLearn LAP 2015 Best Paper Award

In this paper we tackle the estimation of apparent age in still face images with deep learning. Our convolutional neural networks (CNNs) use the VGG-16 architecture and are pretrained on ImageNet for image classification. In addition, due to the limited number of apparent age annotated images, we explore the benefit of finetuning over crawled Internet face images with available age. We crawled 0.5 million images of celebrities from IMDb and Wikipedia that we make public on this website. This is the largest public dataset for age prediction to date. We pose the age regression problem as a deep classification problem followed by a softmax expected value refinement and show improvements over direct regression training of CNNs. Our proposed method, Deep EXpectation (DEX) of apparent age, first detects the face in the test image and then extracts the CNN predictions from an ensemble of 20 networks on the cropped face. The CNNs of DEX were finetuned on the crawled images and then on the provided images with apparent age annotations. DEX does not use explicit facial landmarks. Our DEX is the winner (1st place) of the ChaLearn LAP 2015 challenge on apparent age estimation with more than 115 registered teams, significantly outperforming the human reference.
Our models for age estimation are in use on our website howhot.io, which went viral and was covered extensively in social media and the press (TechCrunch, Hacker News, Reddit #1, Evening Standard, Spiegel).

PDF

Deep expectation of real and apparent age from a single image without facial landmarks

International Journal of Computer Vision (IJCV), 2016

In this paper we propose a deep learning solution to age estimation from a single face image without the use of facial landmarks and introduce the IMDB-WIKI dataset, the largest public dataset of face images with age and gender labels. While research on real age estimation spans decades, the study of apparent age estimation, i.e. the age as perceived by other humans from a face image, is a recent endeavor. We tackle both tasks with our convolutional neural networks (CNNs) of VGG-16 architecture which are pre-trained on ImageNet for image classification. We pose the age estimation problem as a deep classification problem followed by a softmax expected value refinement. The key factors of our solution are: deep learned models from large data, robust face alignment, and expected value formulation for age regression. We validate our methods on standard benchmarks and achieve state-of-the-art results for both real and apparent age estimation.

PDF

The IMDB-WIKI dataset

To the best of our knowledge this is the largest publicly available dataset of face images with gender and age labels for training. We provide pretrained models for both age and gender prediction.

Description

Since publicly available face image datasets are often of small to medium size, rarely exceeding tens of thousands of images, and often lack age information, we decided to collect a large dataset of celebrities. For this purpose, we took the list of the 100,000 most popular actors as listed on the IMDb website and (automatically) crawled from their profiles the date of birth, name, gender and all images related to that person. Additionally, we crawled all profile images from pages of people on Wikipedia with the same meta information. We removed the images without a timestamp (the date when the photo was taken). Assuming that the images with single faces are likely to show the actor and that the timestamp and date of birth are correct, we were able to assign to each such image the biological (real) age. Of course, we cannot vouch for the accuracy of the assigned age information. Besides wrong timestamps, many images are stills from movies, and movies can have extended production times. We obtained 460,723 face images from 20,284 celebrities from IMDb and 62,328 from Wikipedia, thus 523,051 in total.

As some of the images (especially from IMDb) contain several people, we only use the photos where the second strongest face detection is below a threshold. For the network to be equally discriminative for all ages, we equalize the age distribution for training. For more details please see the paper.
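
The filtering and equalization described above could look roughly like the following Python sketch. It uses the face_score and second_face_score fields documented under Usage. The threshold values, the function names and the equalization strategy (capping the number of samples per age) are assumptions made for illustration; the settings actually used are described in the paper.

```python
import math

# Hypothetical thresholds; the values used by the authors are not stated on this page.
MIN_FACE_SCORE = 1.0
MAX_SECOND_FACE_SCORE = 0.5


def is_single_face(face_score, second_face_score,
                   min_score=MIN_FACE_SCORE,
                   max_second=MAX_SECOND_FACE_SCORE):
    """Return True if an image likely shows exactly one clearly detected face."""
    # An infinite (or NaN) face_score marks images where no face was found.
    if math.isinf(face_score) or math.isnan(face_score):
        return False
    if face_score < min_score:
        return False
    # NaN means that no second face was detected at all.
    if math.isnan(second_face_score):
        return True
    return second_face_score < max_second


def equalize_by_age(ages, max_per_age):
    """Return the indices of at most max_per_age samples per age, in input order."""
    counts = {}
    kept = []
    for idx, age in enumerate(ages):
        if counts.get(age, 0) < max_per_age:
            counts[age] = counts.get(age, 0) + 1
            kept.append(idx)
    return kept
```

For example, equalize_by_age([30, 30, 30, 41], 2) keeps the indices 0, 1 and 3, so over-represented ages are thinned out while rare ages are kept in full.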

Usage

For both the IMDb and Wikipedia images we provide a separate .mat file containing all the meta information, which can be loaded with Matlab. The format is as follows:

  • dob: date of birth (Matlab serial date number)
  • photo_taken: year when the photo was taken
  • full_path: path to file
  • gender: 0 for female and 1 for male, NaN if unknown
  • name: name of the celebrity
  • face_location: location of the face. To crop the face in Matlab run
    img(face_location(2):face_location(4),face_location(1):face_location(3),:)
  • face_score: detector score (the higher the better). Inf implies that no face was found in the image and the face_location then just returns the entire image
  • second_face_score: detector score of the face with the second highest score. This is useful to ignore images with more than one face. second_face_score is NaN if no second face was detected.
  • celeb_names (IMDB only): list of all celebrity names
  • celeb_id (IMDB only): index of celebrity name
The age of a person can be calculated based on the date of birth and the time when the photo was taken (note that we assume that the photo was taken in the middle of the year):
[age,~]=datevec(datenum(wiki.photo_taken,7,1)-wiki.dob); 
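
For readers who process the metadata outside Matlab, the same idea can be sketched in Python. The function names are hypothetical. The offset of 366 days converts a Matlab serial date number into Python's date ordinal. The result is the age in completed years on 1 July of the year the photo was taken, which may differ by one year from the Matlab expression above in edge cases.

```python
from datetime import date

# Offset between a Matlab serial date number and Python's proleptic
# Gregorian ordinal (Matlab counts from year 0, Python from year 1).
MATLAB_ORDINAL_OFFSET = 366


def matlab_datenum_to_date(datenum):
    """Convert a Matlab serial date number to a datetime.date (time of day dropped)."""
    # Clamp to the earliest valid ordinal so that corrupt entries do not raise.
    return date.fromordinal(max(int(datenum) - MATLAB_ORDINAL_OFFSET, 1))


def age_at_photo(dob_datenum, photo_taken_year):
    """Age in whole years, assuming the photo was taken on 1 July."""
    birth = matlab_datenum_to_date(dob_datenum)
    # Has the birthday already passed by 1 July of the photo year?
    had_birthday = (birth.month, birth.day) <= (7, 1)
    return photo_taken_year - birth.year - (0 if had_birthday else 1)
```

As a check, the serial date number 730486 corresponds to 1 January 2000, so age_at_photo(730486, 2010) gives 10.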
Our code for training was used at the hackathon at the ChaLearn Looking At People Workshop at ICCV 2015. If you are interested click here to get the code.

Download images and metadata

Here you can download the raw images and the metadata. We also provide a version with the cropped faces (with 40% margin). This version is much smaller.

We noticed that some of the images from Wikipedia are broken. We plan to fix this issue in the future. For now please just ignore those images.

IMDB

Download images part 0 (27 GB) md5sum
Download images part 1 (26 GB) md5sum
Download images part 2 (28 GB) md5sum
Download images part 3 (29 GB) md5sum
Download images part 4 (26 GB) md5sum
Download images part 5 (29 GB) md5sum
Download images part 6 (27 GB) md5sum
Download images part 7 (27 GB) md5sum
Download images part 8 (26 GB) md5sum
Download images part 9 (24 GB) md5sum
Download images meta data md5sum
Download faces only (7 GB) md5sum

WIKI

Download images (3 GB) md5sum
Download faces only (1 GB) md5sum

Code for extracting face with margin

This code allows the user to extract the face with a margin. For our pretrained models we used a margin of 40% of the face's width and height on all four sides (the default setting). At the top of the script there is sample code for extracting all face images with margin.

Download extractSubImage.m
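
As a rough illustration of what a 40% margin means, the following Python sketch expands a face box by a fraction of its width and height on every side. It is not a port of extractSubImage.m: the function name is hypothetical, coordinates are treated as 0-based pixel indices, and the box is simply clipped at the image border, which may differ from how the Matlab script handles faces near the edge.

```python
def box_with_margin(face_location, img_width, img_height, margin=0.4):
    """Expand a face box by `margin` times its width/height on all four sides.

    face_location is (x1, y1, x2, y2) in pixel coordinates. The result is
    clipped to the image, which is an assumption of this sketch.
    """
    x1, y1, x2, y2 = face_location
    dx = margin * (x2 - x1)  # horizontal margin in pixels
    dy = margin * (y2 - y1)  # vertical margin in pixels
    left = max(int(round(x1 - dx)), 0)
    top = max(int(round(y1 - dy)), 0)
    right = min(int(round(x2 + dx)), img_width - 1)
    bottom = min(int(round(y2 + dy)), img_height - 1)
    return left, top, right, bottom
```

A face at (100, 100, 200, 200) in a 1000 x 1000 image becomes (60, 60, 240, 240), i.e. 40 pixels are added on each side of a 100 pixel wide box.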

Download Caffe models

In this section we provide pretrained models for Caffe. For all models we used a 40% margin around the face obtained from the Mathias et al. face detector. For age estimation the output layer has 101 neurons (0-100 years, one for each year). To obtain the predicted age, you need to take the expected value over the softmax-normalized output probabilities. For gender prediction the output layer has 2 neurons (0 for female, 1 for male).
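
The expected value computation can be sketched in plain Python as follows. The function names are hypothetical, and the input is assumed to be the list of raw (pre-softmax) outputs of the final layer. If the network definition you use already ends in a softmax layer, its outputs are probabilities and only the weighted sum in expected_age is needed.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw network outputs."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_age(logits):
    """Expected value over the softmax probabilities of the age classes 0..N-1."""
    probs = softmax(logits)
    return sum(age * p for age, p in enumerate(probs))


def predicted_gender(logits):
    """Index of the larger of the two outputs: 0 for female, 1 for male."""
    return 0 if logits[0] >= logits[1] else 1
```

Unlike taking the single most likely class, the expected value can fall between two integer ages, which is the refinement referred to in the abstracts above.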

Real age estimation trained on IMDB-WIKI

This model was trained on the IMDB-WIKI dataset. The age distribution is equalized, and the model was used as pretraining for the ChaLearn apparent age estimation challenge.

Download .caffemodel (0.5 GB)

Download age.prototxt (Testing)
Download age_train.prototxt (Training)

Apparent age estimation trained on LAP dataset

∗ Winner of LAP challenge on apparent age estimation

This model is a fine-tuned version of the previous model. The model was fine-tuned on the dataset of the ChaLearn apparent age estimation challenge. An ensemble of these models led to 1st place at the challenge (115 teams).

Download .caffemodel (0.5 GB)

Download age.prototxt (Testing)
Download age_train.prototxt (Training)

Gender prediction

This model predicts the gender of a person.

Download .caffemodel (0.5 GB)

Download gender.prototxt (Testing)
Download gender_train.prototxt (Training)

Citation

Please add a reference if you are using the dataset or the pretrained models.

@article{Rothe-IJCV-2016,
  author = {Rasmus Rothe and Radu Timofte and Luc Van Gool},
  title = {Deep expectation of real and apparent age from a single image without facial landmarks},
  journal = {International Journal of Computer Vision (IJCV)},
  year = {2016},
  month = {July},
}
@InProceedings{Rothe-ICCVW-2015,
  author = {Rasmus Rothe and Radu Timofte and Luc Van Gool},
  title = {DEX: Deep EXpectation of apparent age from a single image},
  booktitle = {IEEE International Conference on Computer Vision Workshops (ICCVW)},
  year = {2015},
  month = {December},
}

License

Please note that this dataset is made available for academic research purposes only. All the images are collected from the Internet, and the copyright belongs to the original owners. If any of the images belongs to you and you would like it removed, please kindly inform us and we will remove it from our dataset immediately.
If you are interested in commercial applications for age and gender estimation please contact us.