We provide unified benchmarks and code for new evaluation metrics for Open-set Object Detection (OSOD), as presented in the paper Open-set object detection: towards unified problem formulation and benchmarking (ECCV Workshop 2024).
For this benchmark, we use the Pascal-VOC and MS-COCO datasets. The lists of the different splits of the proposed VOC-COCO benchmarks are provided in Google Drive.
Images are extracted from the OpenImages dataset. As demonstrated by BigDetection, the original OpenImages dataset presents several issues, such as overlapping annotations and redundant class representations. Hence, we use the BigDetection annotations to refine the original OpenImages annotations.
Moreover, to create a smaller, more specific benchmark closer to real-life applications, we introduce the OpenImagesRoad dataset, a subset of OpenImages containing only road images. To this end, we select every image that contains at least one vehicle or street sign object and does not contain any indoor object (under the super-classes home appliance, plumbing fixture, office supplies, kitchenware, furniture, bathroom accessory, drink, food, cosmetics, personal care, medical equipment, musical instrument and computer electronics).
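Below is a minimal sketch of this image selection rule. The annotation format and the class-to-super-class lookup (`is_road_image`, `super_class_of`) are hypothetical placeholders for illustration only, not part of the released code.

```python
# Hypothetical illustration of the road-image selection rule described above.
INDOOR_SUPER_CLASSES = {
    "home appliance", "plumbing fixture", "office supplies", "kitchenware",
    "furniture", "bathroom accessory", "drink", "food", "cosmetics",
    "personal care", "medical equipment", "musical instrument",
    "computer electronics",
}

def is_road_image(image_classes, super_class_of):
    """image_classes: set of class names annotated in one image.
    super_class_of: dict mapping each class name to its super-class (assumed)."""
    # Keep the image only if it contains at least one vehicle or street sign object...
    has_road_object = any(
        super_class_of.get(c) in {"vehicle", "street sign"} for c in image_classes
    )
    # ...and no object belonging to an indoor super-class.
    has_indoor_object = any(
        super_class_of.get(c) in INDOOR_SUPER_CLASSES for c in image_classes
    )
    return has_road_object and not has_indoor_object
```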
We split each super-class into two distinct sets: the 50% most frequent classes having at least 60 instances are considered as known classes, and the rest (the less frequent 50%, or classes with fewer than 60 instances) are unknown. This results in 50 known classes and 113 unknown classes, all grouped under the label unknown, as presented in the table below (a minimal sketch of this split rule is given after the table). Note that we removed all classes corresponding to object parts, such as clothing items, vehicle parts (e.g. wheel, license_plate) or building parts (e.g. door, window).
Super-class | Known classes | Unknown classes
---|---|---
land_vehicle | car_(automobile), bicycle, motorcycle, train_(railroad_vehicle), truck, bus_(vehicle), minivan, taxi | army_tank, wheelchair, golfcart, segway, ambulance, limousine, cart, snowmobile, unicycle
boat | boat, canoe, gondola_(boat) | barge, water_scooter, submarine
airplane | airplane | helicopter, space_shuttle
person | person |
tree | tree, palm_tree, potted_plant | christmas_tree, maple, willow
flower_arrangement | flower_arrangement, rose | sunflower, skullcap, lavender, lily
street_sign | streetlight, traffic_light, billboard | street_sign, stop_sign, parking_meter
building | building, house, tower | lighthouse, castle, tree_house
sports_equipment | paddle, surfboard, parachute, skateboard, ball | ski, stationary_bicycle, bowling_ball, scoreboard, baseball_bat, snowboard, soccer_ball, tennis_ball, football_(american), table_tennis_racket, tennis_racket, racket
toy | toy, balloon | teddy_bear, doll, kite, frisbee
bird | bird, goose, duck | chicken_(animal), eagle, parrot, turkey, butterfly, penguin, canary, owl, crow, ostrich, dragonfly, sparrow
animal | dog, cat, horse, crow | tiger, lion, jaguar, raccoon, otter, fox, giant_panda, polar_bear, bear, mule, camel, elephant, sheep, goat, deer, monkey, gazelle, giraffe, alpaca, squirrel, hog, zebra, kangaroo, hamster, rhinoceros, rabbit, dinosaur, turtle, lizard, frog, snake, insect, spider, snail
fish | fish | harbor_seal, whale, dolphin, shark, seahorse, goldfish, starfish, crab_(animal), oyster
sculpture | sculpture | snowman
container | flowerpot, trash_can | barrel, can
tool | ladder, camera, snowplow | tripod, handsaw, drill, flashlight, measuring_stick
weapon | missile, cannon | gun, sword
clock | clock | alarm_clock, digital_clock
others | flag, tent | fountain, swimming_pool, fireplug
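As referenced above, the sketch below illustrates the known/unknown split rule for a single super-class. The per-class instance counts (`class_counts`) and the rounding of the 50% cut-off are assumptions for illustration, not the exact procedure of the released code.

```python
# Hypothetical sketch of the known/unknown split applied to one super-class.
import math

def split_super_class(class_counts, min_instances=60, known_ratio=0.5):
    """class_counts: dict mapping class name -> number of instances (assumed input).
    Returns (known, unknown) class lists for one super-class."""
    # Rank classes from most to least frequent.
    ranked = sorted(class_counts, key=class_counts.get, reverse=True)
    n_top = math.floor(len(ranked) * known_ratio)
    # Known: the most frequent 50% of classes that also have at least 60 instances.
    known = [c for c in ranked[:n_top] if class_counts[c] >= min_instances]
    # Unknown: everything else (less frequent half, or fewer than 60 instances).
    unknown = [c for c in ranked if c not in known]
    return known, unknown
```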
The different annotations used for this benchmark are provided in Google Drive.
The lists of the different splits of the proposed OpenImagesRoad benchmark are provided in Google Drive.
OW-DETR
We provide in Google Drive the different evaluation files that can be used within the OW-DETR code base to evaluate the OpenImagesRoad benchmark, as presented in the paper:
- `open_world_eval.py`: the original code for evaluating the usual metrics: AP_known, AP_unknown, WI, A-OSE and U-Recall. The only changes to take into account are the new OpenImagesRoad config and classes. This code can be used for D_test_ID, D_test_OOD and D_test_all, using the annotations in OpenImagesRoad/annotations/D_test_all.zip.
- `open_world_eval_all.py`: the code to compute AP_all. All predictions are changed to the single class "unknown", and we use the annotations in OpenImagesRoad/annotations/D_test_all_unknown.zip, where every box is considered as unknown (a sketch of this relabelling is given after this list). This code can be used for D_test_OOD and D_test_all.
- `open_world_eval_sc.py`: the code to compute AP_sc on D_test_OOD. The predicted classes are mapped to their super-classes, and we use the annotations in OpenImagesRoad/annotations/D_test_ood_sc.zip, where the classes of D_test_OOD are set to the higher hierarchy level (super-class).
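As mentioned above, the relabelling behind AP_all and AP_sc can be illustrated with the following minimal sketch. The prediction format (a dict with a "class" field) and the `SUPER_CLASS_OF` mapping are hypothetical placeholders, not the actual evaluation code.

```python
# Hypothetical sketch of the prediction relabelling used for AP_all and AP_sc.
SUPER_CLASS_OF = {
    "car_(automobile)": "land_vehicle",
    "canoe": "boat",
    # ... one entry per class of the table above
}

def to_unknown(predictions):
    """AP_all: every predicted box is scored as the single class 'unknown'."""
    return [{**p, "class": "unknown"} for p in predictions]

def to_super_class(predictions):
    """AP_sc: every predicted class is replaced by its super-class."""
    return [{**p, "class": SUPER_CLASS_OF.get(p["class"], p["class"])}
            for p in predictions]
```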
Evaluation code and data annotations are released under the Apache 2.0 license, as found in the LICENSE file.
@inproceedings{ammar2024opensetobjectdetectionunified,
  title={Open-set object detection: towards unified problem formulation and benchmarking},
  author={Hejer Ammar and Nikita Kiselov and Guillaume Lapouge and Romaric Audigier},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV) Workshops},
  year={2024},
  url={https://arxiv.org/abs/2411.05564},
}
Should you have any questions, please contact :e-mail: osod-benchmarks@cea.fr