Depth and Appearance for Mobile Scene Analysis

This page hosts the datasets used in our ICCV 2007 publication: Andreas Ess, Bastian Leibe, and Luc Van Gool, "Depth and Appearance for Mobile Scene Analysis" (pdf).

Have a look at this video to see a demonstration of our system.

We provide the three datasets used to test our system for our ICCV 2007 publication, including annotations. The data was recorded with a pair of AVT Marlins mounted on a chariot, at a resolution of 640 x 480 (bayered) and a frame rate of 13-14 FPS. For each dataset, we provide the unbayered images for both cameras, the camera calibration, and the set of annotations. Depth maps were created from this data using the publicly available belief-propagation-based stereo algorithm of Huttenlocher and Felzenszwalb. (Note: this algorithm has no built-in occlusion handling. If you know of a better, publicly available stereo algorithm, please contact me.)
The annotation files available here contain a total of 12'298 annotated pedestrians. Please note that in our testing environment we used only the subset of annotations with a height greater than 50 pixels (10'958 annotations). We deeply appreciate the help of Martin Vogt in annotating this large amount of data.
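The height criterion above can be sketched in Python as follows. This is only an illustration, not the code used for the paper: the function names are hypothetical, each box is assumed to be a tuple (x1, y1, x2, y2) as described in the IDL files section, and the height is assumed to be the absolute difference of the two y coordinates.

```python
def box_height(box):
    """Height of a box in pixels; corners may be stored in either order."""
    x1, y1, x2, y2 = box
    return abs(y2 - y1)


def filter_by_height(boxes, min_height=50):
    """Keep only boxes strictly taller than min_height pixels (assumed criterion)."""
    return [b for b in boxes if box_height(b) > min_height]
```

Whether this exact definition of height reproduces the count stated above has not been checked here.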

Please reference our paper when using the data:

author = {A. Ess and B. Leibe and L. Van Gool},
title = {Depth and Appearance for Mobile Scene Analysis},
booktitle = {International Conference on Computer Vision (ICCV'07)},
year = {2007},
month = {October},
keywords = {}



New: Most of the sequences can be found on the dataset page.

Sequence #0 (450 frames) - Images (left, 250 MB) Images (right, 250 MB) Annotations Calibration (Training) (images NOT undistorted)
Sequence #3 (354 frames) - Images (left, 170 MB) Images (right, 170 MB) Annotations Calibration Result (images already undistorted)


Calibration files contain the calibration for both the left and the right camera (K [3x3], rad [1x2], tan [1x2], R [3x3], t [1x3]), with K the internal calibration, rad/tan the radial/tangential distortion coefficients, and R/t the external calibration, mapping world to camera coordinates (i.e. X_cam = R X_world + t).
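To illustrate the convention X_cam = R X_world + t, here is a minimal Python sketch that maps a world point into the camera frame, projects it with K, and recovers the camera center as C = -R^T t. The function names are hypothetical, the matrices are assumed to be already loaded as nested lists, and lens distortion (rad/tan) is ignored, so the projection only applies to images that are already undistorted.

```python
def mat_vec(M, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def world_to_camera(R, t, X_world):
    """Apply X_cam = R X_world + t."""
    RX = mat_vec(R, X_world)
    return [RX[i] + t[i] for i in range(3)]


def camera_center(R, t):
    """Camera center in world coordinates, C = -R^T t."""
    return [-sum(R[j][i] * t[j] for j in range(3)) for i in range(3)]


def project(K, R, t, X_world):
    """Project a world point to pixel coordinates (no lens distortion)."""
    x = mat_vec(K, world_to_camera(R, t, X_world))
    return x[0] / x[2], x[1] / x[2]
```

A quick sanity check is that world_to_camera(R, t, camera_center(R, t)) gives the zero vector, up to rounding.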

The cameras are installed about 950 mm above ground.

IDL files

An IDL file stores the annotations of a sequence. For each image, it lists a set of bounding boxes, separated by commas. Each box is given by two opposite corners, nominally the upper-left and lower-right, but the corners are not necessarily stored in that order. A semicolon ends the list of bounding boxes for a single file; a period ends the file.
"filename": (x1, y1, x2, y2), (x1, y1, x2, y2), ...;

A simple MATLAB reader is available: readIDL.m
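For users without MATLAB, the format described above can be read with a short Python sketch like the one below. It is not a port of readIDL.m. The function name parse_idl is hypothetical, coordinates are assumed to be integers, and file names are assumed to contain no double quotes. Because the corners are not guaranteed to be ordered, each box is normalized to (left, top, right, bottom).

```python
import re

# A quoted file name starts each entry.
FILE_RE = re.compile(r'"([^"]*)"')
# One bounding box: (x1, y1, x2, y2) with integer coordinates (assumption).
BOX_RE = re.compile(
    r'\(\s*(-?\d+)\s*,\s*(-?\d+)\s*,\s*(-?\d+)\s*,\s*(-?\d+)\s*\)'
)


def parse_idl(text):
    """Return {filename: [(left, top, right, bottom), ...]} from IDL text."""
    annotations = {}
    names = list(FILE_RE.finditer(text))
    for i, m in enumerate(names):
        # The entry runs from this file name to the next one (or end of text).
        end = names[i + 1].start() if i + 1 < len(names) else len(text)
        boxes = []
        for b in BOX_RE.finditer(text[m.end():end]):
            x1, y1, x2, y2 = map(int, b.groups())
            # Normalize so the first corner is upper-left, the second lower-right.
            boxes.append((min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)))
        annotations[m.group(1)] = boxes
    return annotations
```

Typical use would be parse_idl(open(path).read()), where path points to one of the annotation files. An image listed without any boxes ends up with an empty list.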