Have a look at this video [movie (.mov)] [movie (.wmv)] to see a demonstration of our algorithm.
For real-time depth acquisition, we used the system of Weise et al.
Data set:
Each zip file (typically about 400 MB) contains the recorded raw data and annotation files of one person. The files are named mX.zip, fX.zip, or mXg.zip (indicating a male person, a female person, or a male person wearing glasses, X=1:20; see the description below).
Individual files "frame_XXXXX.input" contain the raw input data (range image), and "frame_XXXXX.groundTruth" files contain the ground truth for the corresponding frame. For access to the dataset, please contact the authors directly.
Description:
10'545 range images of 20 persons (3 of them female; 6 of them were additionally recorded while wearing glasses).
Each person turned their head freely while the scanner captured range images at 28 fps. At the beginning of each sequence, the person looks straight into the camera before moving the head.
The resulting range images have a resolution of 640x480 pixels, and a face typically covers about 150x200 depth values. The head pose range spans about ±90 degrees of yaw and ±45 degrees of pitch rotation; roll rotation is not included in this data set. The image above shows a few example input frames (depth information is shown as green color values), and the image below visualizes a typical camera trajectory.
Ground truth:
Contains the nose position and the forward direction (the vector through the nose, i.e. the direction the face is pointing) in a left-handed coordinate system (x right, y up, z into screen).
Files:
Reference implementations for reading the input data and for reading and writing the ground truth. No support is provided; use at your own risk.
Readme: description of data and ground truth format, code snippets: [README.txt]
GroundTruth: read / write ground truth: [GroundTruth.h]
[GroundTruth.cpp]
InputCloud: read input data: [InputCloud.h]
[InputCloud.cpp]
[PeUtils.h]
[PeUtils.cpp]
License, Copyright, Disclaimer:
[license]