Scene Labeling


The Daimler Urban Segmentation Dataset consists of video sequences recorded in urban traffic. It contains 5000 rectified stereo image pairs with a resolution of 1024x440. 500 frames (every 10th frame of the sequence) come with pixel-level semantic class annotations into 5 classes: ground, building, vehicle, pedestrian, sky. Dense disparity maps are provided as a reference; these are not manually annotated but computed using semi-global matching (SGM).

Dataset Preview:


Stixmantics scene labeling results with intermediate steps:


ANNOUNCEMENT: Please stay tuned for our upcoming Cityscapes Dataset with more than 5000 annotated frames.


Daimler Urban Segmentation Dataset 2014 (NEW!!)

- ground truth labels: annotated
- images (left camera): annotated / all
- images (right camera): annotated / all
- disparity maps: annotated / all
- camera files: annotated / all
- vehicle data: annotated / all
- development kit: download

IMPORTANT NOTICE: As of April 2015, we changed the evaluation protocol! Numbers should be reported using the PASCAL VOC intersection-over-union measure, with unlabeled background ignored. Our previous publications reported these numbers by taking unlabeled pixels into account. As this evaluation procedure led to several misunderstandings, we now follow the exact PASCAL definition.
Note that to obtain exactly comparable numbers, the cyclist (12) and bicycle (5) labels in our dataset should both be mapped to the pedestrian (2) label during evaluation.
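
The protocol above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the development kit: the function names, the `ignore_id` parameter, and the values 255 (unlabeled) and 1 (a second class) used in the toy example are assumptions made here. Only the label IDs 2, 5, and 12 are taken from this page. To follow the PASCAL VOC definition, the counts should be accumulated over all evaluated frames, for example by passing stacked label maps.

```python
import numpy as np

# Label IDs stated on this page.
PEDESTRIAN, BICYCLE, CYCLIST = 2, 5, 12


def remap_labels(labels):
    """Map cyclist (12) and bicycle (5) pixels to pedestrian (2)."""
    out = np.asarray(labels).copy()
    out[np.isin(out, (BICYCLE, CYCLIST))] = PEDESTRIAN
    return out


def class_iou(gt, pred, class_ids, ignore_id):
    """Per-class intersection-over-union, TP / (TP + FP + FN).

    Pixels whose ground truth equals `ignore_id` (assumed to mark
    unlabeled background) are excluded from all counts.
    """
    gt = remap_labels(gt)
    pred = remap_labels(pred)
    valid = gt != ignore_id
    scores = {}
    for c in class_ids:
        gt_c = (gt == c) & valid
        pred_c = (pred == c) & valid
        inter = np.count_nonzero(gt_c & pred_c)
        union = np.count_nonzero(gt_c | pred_c)
        scores[c] = inter / union if union else float("nan")
    return scores


# Toy example: 255 and 1 are hypothetical IDs chosen for illustration.
gt = np.array([[2, 2, 255],
               [5, 12, 1]])
pred = np.array([[2, 1, 2],
                 [2, 2, 1]])
scores = class_iou(gt, pred, class_ids=(1, 2), ignore_id=255)
print(scores)
```

In the toy example the prediction at the unlabeled pixel does not count as a false positive, which is the difference from the earlier protocol that took unlabeled pixels into account.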

The following table contains the new scores when following our new evaluation protocol. We will keep this website updated with methods that report numbers on our dataset. If you are using our dataset and wish to have your method listed here as well, please send me an email with your inferred label results. I will verify the numbers and update the website accordingly.

Method                       Ground  Vehicle  Pedestrian  Sky    Building  Average  Avg. Dyn. (*)  Runtime (**)
Stixmantics [2]              93.8    78.8     66.0        75.4   89.2      80.6     72.4           0.05 s
ALE [3]                      94.9    76.0     73.1        95.5   90.6      86.0     74.5           111 s
Darwin pairwise [4]          95.7    68.7     21.2        94.2   87.6      73.5     44.9           N/A
PN-RCPN [5]                  96.7    79.4     68.4        91.4   86.3      84.5     73.8           2.8 s
Layered Interpretation [6]   96.4    83.3     71.1        89.5   91.2      86.3     77.2           0.11 s

(*) Avg. Dyn. denotes the average of Vehicle and Pedestrian performance
(**) Runtime is reported per image
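
As a small worked check of the two summary columns, the sketch below recomputes them for the Stixmantics [2] row. It assumes that Average is the unweighted mean of the five class scores, which the page does not state explicitly but which matches the reported value; the helper names are illustrative.

```python
# Per-class scores (percent) from the Stixmantics [2] row of the table.
stixmantics = {
    "Ground": 93.8,
    "Vehicle": 78.8,
    "Pedestrian": 66.0,
    "Sky": 75.4,
    "Building": 89.2,
}


def average(scores):
    """Unweighted mean over all classes (assumed definition of Average)."""
    return sum(scores.values()) / len(scores)


def average_dynamic(scores):
    """Avg. Dyn.: mean of the Vehicle and Pedestrian scores."""
    return (scores["Vehicle"] + scores["Pedestrian"]) / 2


print(round(average(stixmantics), 1))
print(round(average_dynamic(stixmantics), 1))
```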


[1] T. Scharwächter, M. Enzweiler, S. Roth, and U. Franke. "Efficient Multi-Cue Scene Segmentation", In Proc. of the German Conference on Pattern Recognition (GCPR), 2013. (GCPR Main Prize) [ Publisher Link - Download Preprint PDF ]

[2] T. Scharwächter, M. Enzweiler, S. Roth, and U. Franke. "Stixmantics: A Medium-Level Model for Real-Time Semantic Scene Understanding", European Conference on Computer Vision (ECCV), 2014 [ Publisher Link - Download Preprint PDF ]

[3] L. Ladický, P. Sturgess, C. Russell, S. Sengupta, Y. Bastanlar, W. Clocksin, and P. H. S. Torr. "Joint Optimisation for Object Class Segmentation and Dense Stereo Reconstruction", British Machine Vision Conference (BMVC) 2010

[4] S. Gould, "DARWIN: A Framework for Machine Learning and Computer Vision Research and Development", JMLR 2012

[5] A. Sharma, O. Tuzel, D. W. Jacobs, "Deep Hierarchical Parsing for Semantic Segmentation", Computer Vision and Pattern Recognition (CVPR) 2015

[6] M. Liu, S. Lin, S. Ramalingam, O. Tuzel, "Layered Interpretation of Street View Images", Robotics: Science and Systems (RSS) 2015

If you use this dataset in your work, please cite [1] or [2].



Daimler Urban Segmentation Dataset 2013 (OBSOLETE)


dataset link: download here (left input image, labels, disparity map)

Supplementary data:

- corresponding right input images: download here (note that the left and right images are already rectified w.r.t. the extrinsic calibration of the stereo setup, to allow direct horizontal stereo matching)
- tilt-corrected camera files for each frame: download here
- intermediate frames: see new 2014 dataset above



If you have any questions about the dataset, please contact Timo Scharwächter.

License agreement:

This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use, copy, and distribute the data given that you agree:

1. That the dataset comes "AS IS", without express or implied warranty. Although every effort has been made to ensure accuracy, Daimler does not accept any responsibility for errors or omissions.
2. That you include a reference to the above publication in any published work that makes use of the dataset.
3. That if you have altered the content of the dataset or created derivative work, prominent notices are made so that any recipients know that they are not receiving the original data.
4. That you may not use or distribute the dataset or any derivative work for commercial purposes as, for example, licensing or selling the data, or using the data with a purpose to procure a commercial gain.
5. That this original license notice is retained with all copies or derivatives of the dataset.
6. That all rights not expressly granted to you are reserved by Daimler.

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.


Attachments:

- 2014_devkit.zip (5k), Timo Scharwächter, Aug 3, 2015, 7:29 AM
- stf_reference_code.zip (388k), Timo Scharwächter, Apr 28, 2016, 10:32 PM