The Middle Child Problem

Revisiting Parametric Min-cut and Seeds for Object Proposals

Summary

Object proposals have recently fueled progress in detection performance. These proposals aim to provide category-agnostic localizations for all objects in an image. One way to generate proposals is to perform parametric min-cuts over seed locations. This paper demonstrates that standard parametric-cut models are ineffective at obtaining medium-sized objects, which we refer to as the middle child problem. We propose a new energy minimization framework that incorporates geodesic distances between segments and solves this problem. In addition, we introduce a new superpixel merging algorithm that generates a small set of seeds which reliably cover a large number of objects of all sizes. We call our method POISE: “Proposals for Objects from Improved Seeds and Energies.” POISE enables parametric min-cuts to reach their full potential. On PASCAL VOC it generates ~2,640 segments with an average overlap of 0.81, whereas the closest competing methods require more than 4,200 proposals to reach the same accuracy [LPO: Krähenbühl and Koltun, 2015; MCG: Pont-Tuset et al., 2015]. We show detailed quantitative comparisons against 5 state-of-the-art methods on the PASCAL VOC and Microsoft COCO segmentation challenges.

Detailed comparisons by Jordi Pont-Tuset and Luc Van Gool.

Publication

Ahmad Humayun, Fuxin Li, and James M. Rehg

The Middle Child Problem: Revisiting Parametric Min-cut and Seeds for Object Proposals

IEEE International Conference on Computer Vision (ICCV), Dec. 2015.

PDF  |  Presentation  |  Poster  |  BibTeX

Code

Download (ver 1.05)   |   Instructions

(Versions before 1.04 are for RIGOR)

Please contact the first author for any issues with the code.

Overview Video

Results

These graphs compare object proposal methods by recall against the number of proposals at three IoU thresholds. For each ground-truth segment, we select the proposal with the highest segmentation IoU. Recall is then the fraction of ground-truth segments whose best proposal has an IoU higher than the threshold. The last column shows average recall over the IoU range [0.5, 1]. Average recall is reported to correlate best with downstream object detection results [Hosang et al. 2015]. Note that [Hosang et al. 2015] gives a similar comparison between methods for bounding-box IoU. We use a linear rather than a log scale to highlight that POISE needs far fewer proposals to reach high recall. POISE consistently outperforms the competitors when the number of proposals exceeds 700, which is the range of settings most likely to be chosen by users of proposal algorithms. Note that the y-scale of each graph is different.
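
The recall computation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the evaluation code behind the graphs: the function names (mask_iou, best_ious, recall_at, average_recall) are hypothetical, masks are assumed to be boolean NumPy arrays of equal shape, and the average recall is approximated here by averaging over an evenly spaced grid of thresholds in [0.5, 1], which is an assumption about the discretization.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two binary masks of the same shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(a, b).sum() / union)


def best_ious(gt_masks, proposal_masks):
    """For each ground-truth segment, the highest IoU over all proposals."""
    return np.array([
        max((mask_iou(gt, p) for p in proposal_masks), default=0.0)
        for gt in gt_masks
    ])


def recall_at(best, threshold):
    """Fraction of ground-truth segments whose best IoU is higher than threshold."""
    best = np.asarray(best, dtype=float)
    if best.size == 0:
        return 0.0
    return float(np.mean(best > threshold))


def average_recall(best, num_thresholds=11):
    """Mean recall over evenly spaced IoU thresholds in [0.5, 1].

    The grid resolution is an assumption made for this sketch.
    """
    thresholds = np.linspace(0.5, 1.0, num_thresholds)
    return float(np.mean([recall_at(best, t) for t in thresholds]))
```

To produce a curve like the ones shown, one would call best_ious on the first k proposals of a method for increasing k and plot recall_at (or average_recall) against k.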

Acknowledgements

This work was supported in part by NSF grant IIS-1320348.

Copyright

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without explicit permission of the copyright holder.