Unknown Sniffer for Object Detection: Don't Turn a Blind Eye to Unknown Objects

Wenteng Liang*, Feng Xue*, Yihao Liu, Guofeng Zhong, Anlong Ming (* equal contribution)

Beijing University of Posts and Telecommunications

IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023

Main contributions of UnSniffer

Recently proposed open-world object detection and open-set detection methods have achieved a breakthrough in finding never-seen-before objects and distinguishing them from known ones. However, they do not study knowledge transfer from known classes to unknown ones in enough depth, which leaves them with little ability to detect unknown objects hidden in the background. In this paper, we propose the unknown sniffer (UnSniffer) to find both unknown and known objects. First, we introduce the generalized object confidence (GOC) score, which uses only known samples for supervision and avoids improperly suppressing unknown objects in the background. Significantly, this confidence score, learned from known objects, generalizes to unknown ones. Additionally, we propose a negative energy suppression loss to further suppress non-object samples in the background. Next, the best box for each unknown object is hard to obtain during inference, because no semantic information about unknown objects is available in training. To solve this issue, we introduce a graph-based determination scheme to replace hand-designed non-maximum suppression (NMS) post-processing.
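To make the idea of choosing boxes from a graph rather than by greedy NMS more concrete, here is a minimal sketch. It is not the determination scheme used in UnSniffer, whose details are not given on this page. As a stand-in, it links candidate boxes whose IoU exceeds a threshold, treats each connected component as one object, and keeps the highest-scoring box per component. The function names and the 0.5 threshold are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pick_boxes_by_graph(boxes: List[Box], scores: List[float],
                        iou_thresh: float = 0.5) -> List[int]:
    """Illustrative stand-in, NOT the paper's method.

    Builds a graph with an edge between boxes whose IoU >= iou_thresh,
    finds connected components with union-find, and returns the index
    of the best-scoring box in each component.
    """
    n = len(boxes)
    parent = list(range(n))

    def find(i: int) -> int:
        # Path-halving union-find lookup.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if iou(boxes[i], boxes[j]) >= iou_thresh:
                parent[find(i)] = find(j)

    best: Dict[int, int] = {}  # component root -> index of best box
    for i in range(n):
        root = find(i)
        if root not in best or scores[i] > scores[best[root]]:
            best[root] = i
    return sorted(best.values())
```

Unlike greedy NMS, this grouping does not depend on the order in which boxes are visited, although a chain of pairwise-overlapping boxes collapses into a single component, which a real scheme would need to handle.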

Unknown Object Detection Benchmark

In the proposed UOD-Benchmark, we use the Pascal VOC dataset, which contains annotations of 20 object categories, as the training data. For testing, since the MS-COCO dataset extends the Pascal VOC categories to 80 object categories, we naturally employ MS-COCO to evaluate unknown objects. However, MS-COCO does not thoroughly label potential unknown objects in images. To address this issue, we propose two datasets, COCO-OOD and COCO-Mixed, which fully label the unknown objects. First, following the definition of objects in COCO, i.e., "objects are individual instances that can be easily labelled (person, chair, car)", we hand-pick more than a thousand images that contain no regions ambiguous under this definition. Second, several master's students are asked to mark the object regions they notice at first glance by drawing polygons, referring to the object definition above. As shown in the figure, we label almost every object in the selected images with fine-grained annotation. This yields two datasets, both used for testing:

The COCO-OOD dataset contains only unknown categories, consisting of 504 images with fine-grained annotations of 1655 unknown objects. The annotations combine the original COCO annotations with additional annotations made according to the COCO object definition.

The COCO-Mixed dataset includes 897 images with annotations of both known and unknown categories. It contains 2533 unknown objects and 2658 known objects, with original COCO annotations used as labels for known objects. Unambiguous unlabeled objects are also annotated. This dataset is more challenging because its images contain more object instances, with more complex categories and more densely clustered locations.
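As a quick sanity check on the numbers above, the following sketch computes the average number of annotated objects per image in each dataset and shows one way to tag a COCO category name as known or unknown. The counts are taken from the descriptions above. The list of COCO names for the 20 Pascal VOC classes is an assumed correspondence (for example, VOC "sofa" is taken to match COCO "couch"), and the helper names are hypothetical.

```python
# COCO names assumed to correspond to the 20 Pascal VOC classes
# (an assumption of this sketch, e.g. VOC "tvmonitor" -> COCO "tv").
VOC_KNOWN_IN_COCO_NAMES = frozenset({
    "airplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "dining table", "dog", "horse", "motorcycle", "person",
    "potted plant", "sheep", "couch", "train", "tv",
})


def is_known(category_name: str) -> bool:
    """True if the category is one of the 20 classes seen in training."""
    return category_name in VOC_KNOWN_IN_COCO_NAMES


# Image and object counts quoted in the dataset descriptions.
DATASETS = {
    "COCO-OOD": {"images": 504, "known": 0, "unknown": 1655},
    "COCO-Mixed": {"images": 897, "known": 2658, "unknown": 2533},
}


def objects_per_image(name: str) -> float:
    """Average number of annotated objects (known + unknown) per image."""
    d = DATASETS[name]
    return (d["known"] + d["unknown"]) / d["images"]


if __name__ == "__main__":
    for name in DATASETS:
        print(f"{name}: {objects_per_image(name):.2f} objects per image")
```

Under these counts, COCO-OOD averages about 3.28 annotated objects per image and COCO-Mixed about 5.79, which is consistent with COCO-Mixed being described as the more crowded of the two.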

Visualization results

More visualization results

@inproceedings{liang2023unknown,
    title={Unknown Sniffer for Object Detection: Don't Turn a Blind Eye to Unknown Objects},
    author={Liang, Wenteng and Xue, Feng and Liu, Yihao and Zhong, Guofeng and Ming, Anlong},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023}
}
