H²O: Human-to-Human-or-Object Interaction Dataset

H²O is an image dataset annotated for Human-to-Human-or-Object interaction detection.

Dataset description

Images

H²O comprises the 10 301 images of the V-COCO dataset plus 3 635 additional images, most of which contain interactions between people.

Sources of the extra images:

MS-COCO dataset
Human-Interaction-Images dataset
BIT-Interaction dataset
SBU-Kinect-Interaction dataset
TV-Human-interaction dataset
Pascal-Voc dataset
ShakeFive2 dataset
UT-Interaction dataset
Web images

Annotations

All H²O images have been annotated with a new taxonomy of verbs including human-to-object and human-to-human interactions.

This taxonomy is composed of 51 verbs divided into 5 categories:

Verbs describing the general posture of the subject
Verbs related to the way the subject is moving
Verbs used for interactions with objects
Verbs describing human-to-human interactions
Verbs of interactions involving strength or violence

Data were annotated with the open-source tool Pixano.

Dataset download

Please download and unzip the H2O.zip file.

Images

To get the images, download the V-COCO images and split them into a trainval set and a test set as defined in the V-COCO split files. Then rename each image to HO[vcoco_id.zfill(10)].jpg (HO followed by the V-COCO image id zero-padded to 10 digits).
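
The renaming rule can be sketched as follows. This is a minimal illustration, assuming the V-COCO images still carry their original MS-COCO 2014 file names (for example COCO_train2014_000000000009.jpg) and that the brackets in the pattern only delimit the zero-padded id. The helper names `h2o_name_from_coco` and `rename_images` are hypothetical and are not part of H2O.zip.

```python
import re
from pathlib import Path


def h2o_name_from_coco(coco_filename: str) -> str:
    """Build the H2O file name for a V-COCO image.

    Assumption: the numeric image id is the last group of digits in the
    original file name, e.g. COCO_train2014_000000000009.jpg -> id 9.
    """
    match = re.search(r"(\d+)$", Path(coco_filename).stem)
    if match is None:
        raise ValueError(f"no numeric image id in {coco_filename!r}")
    vcoco_id = str(int(match.group(1)))  # drop the COCO zero padding
    return f"HO{vcoco_id.zfill(10)}.jpg"


def rename_images(image_dir: str) -> None:
    """Rename every .jpg in image_dir in place (hypothetical helper)."""
    for path in sorted(Path(image_dir).glob("*.jpg")):
        if path.name.startswith("HO"):
            continue  # looks already renamed
        path.rename(path.with_name(h2o_name_from_coco(path.name)))
```

For instance, `h2o_name_from_coco("COCO_train2014_000000000009.jpg")` gives `HO0000000009.jpg`. Run `rename_images` separately on the trainval and test directories once the split is done.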

Then run the download_HH_images.py script to get the additional images. The script first downloads the datasets from which the extra images are taken, then copies or downloads all the images into a new directory, H2O, next to the script.

Annotations

In H²O, each interacting object is annotated even if its category is not in the 80 COCO classes.

We provide 3 annotation folders:
- initial_annotations/ where objects outside the 80 COCO classes are labeled with their real label.
- other_annotations/ where objects outside the 80 COCO classes are grouped under the label "other".
- vcocolike_annotations/ where objects outside the 80 COCO classes are not annotated: the interactions with such objects are thus considered as if they had no target object.
Each of these folders contains 4 subfolders:
- trainval and test, whose annotation files are structured as follows:

{
	"entities":[
		{
			"sourceId":		# Image name
			"category":		# Object category
			"geometry":
			{
				"geometrytype": 1,
				"isNormalized": true,
				"vertices": [xmin, ymin, xmax, ymax]	# Normalized coordinates
			},
			"id":			# A unique object Id
			"actions":
			[
				{
					"value":		# Interaction verb
					"targetId":		# Target object Id / entity Id if the interaction has no target
					"instrumentId":	# Instrument object Id used to achieve the interaction
									# / entity Id if the interaction has no instrument
				}
			]
		}
	]
}

- `trainval_vcoco` and `test_vcoco`, which follow the V-COCO annotation file structure.
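
As an illustration of the trainval/test structure, the sketch below converts the normalized `vertices` of an entity to pixel coordinates and lists the interactions found in an annotation. It assumes that an action whose `targetId` equals the entity's own `id` has no target (one reading of the comment in the structure above) and that the image width and height are obtained separately. The function names, the sample values and the verbs used are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def to_pixel_box(vertices: List[float], width: int, height: int) -> Tuple[float, float, float, float]:
    """Scale normalized [xmin, ymin, xmax, ymax] to pixel coordinates."""
    xmin, ymin, xmax, ymax = vertices
    return (xmin * width, ymin * height, xmax * width, ymax * height)


def list_interactions(annotation: Dict) -> List[Tuple[str, str, Optional[str]]]:
    """Return (subject category, verb, target category or None) triplets."""
    by_id = {entity["id"]: entity for entity in annotation["entities"]}
    triplets = []
    for entity in annotation["entities"]:
        for action in entity.get("actions", []):
            target_id = action["targetId"]
            # Assumption: a targetId equal to the entity's own id means "no target".
            if target_id == entity["id"] or target_id not in by_id:
                target = None
            else:
                target = by_id[target_id]["category"]
            triplets.append((entity["category"], action["value"], target))
    return triplets


# Hand-written sample, not taken from the dataset.
sample = {
    "entities": [
        {
            "sourceId": "example.jpg",
            "category": "person",
            "geometry": {"geometrytype": 1, "isNormalized": True,
                         "vertices": [0.25, 0.25, 0.75, 1.0]},
            "id": "a",
            "actions": [
                {"value": "stand", "targetId": "a", "instrumentId": "a"},
                {"value": "hold", "targetId": "b", "instrumentId": "a"},
            ],
        },
        {
            "sourceId": "example.jpg",
            "category": "cup",
            "geometry": {"geometrytype": 1, "isNormalized": True,
                         "vertices": [0.5, 0.5, 0.75, 0.75]},
            "id": "b",
            "actions": [],
        },
    ]
}

print(list_interactions(sample))
print(to_pixel_box(sample["entities"][0]["geometry"]["vertices"], 640, 480))
```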

Evaluation

To run the evaluation and compute the agent AP and role AP, get the V-COCO evaluation code and replace its vsrl_eval.py file with the h2o_vsrl_eval.py file provided in the H2O.zip file.

This new version allows the evaluation of a list of target objects for a given interaction.

As for the V-COCO evaluation, store your predictions as a pickle file (detections.pkl) in the following format:

[
	{
		'image_id':			# The H²O image name
		'person_box':		# [xmin, ymin, xmax, ymax] the box prediction for the person
		'[action]_agent':  	# The score for action corresponding to the person prediction
							# [action] is a verb from the list provided in `H2O_verb_list.json` file
		'[action]_role':  	# [[x1, y1, x2, y2, s]], list of the predicted boxes for role and 
                      		# associated score for the action-role pair
                      		# [action] is a verb from the list provided in `H2O_verb_list.json` file						
	}
]
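
A detection file in this format could be assembled as sketched below. The helper names, the image name and the verb `hug` are placeholders for illustration (real verbs must come from `H2O_verb_list.json`), and the key names follow the format above literally. Plain Python lists are used for the boxes here, which is an assumption; check h2o_vsrl_eval.py for the exact types it expects.

```python
import pickle
from typing import Dict, List, Sequence


def make_detection(
    image_id: str,
    person_box: Sequence[float],
    agent_scores: Dict[str, float],
    role_boxes: Dict[str, List[Sequence[float]]],
) -> dict:
    """Build one prediction entry for a detected person.

    agent_scores maps a verb to the agent score of this person.
    role_boxes maps a verb to a list of [x1, y1, x2, y2, score] entries.
    """
    detection = {"image_id": image_id, "person_box": list(person_box)}
    for action, score in agent_scores.items():
        detection[f"{action}_agent"] = float(score)
    for action, boxes in role_boxes.items():
        for box in boxes:
            if len(box) != 5:
                raise ValueError("each role entry must be [x1, y1, x2, y2, score]")
        detection[f"{action}_role"] = [list(box) for box in boxes]
    return detection


def save_detections(detections: List[dict], path: str = "detections.pkl") -> None:
    """Write the list of predictions as a pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(detections, handle)


# Hypothetical prediction for one person in one image.
example = make_detection(
    "HO0000000009.jpg",
    [10, 20, 110, 220],
    {"hug": 0.9},
    {"hug": [[120, 30, 200, 210, 0.8]]},
)
```

Calling `save_detections([example])` then writes detections.pkl in the current directory.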

To launch the evaluation, run:

from h2o_vsrl_eval import VCOCOeval
vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)

	# For the original scenario
	# vsrl_annot_file:	vcocolike_annotations/test_vcoco/interactions_test.json
	# coco_file: 		vcocolike_annotations/test_vcoco/instances_test.json
	# split_file: 		images_test.ids

	# For the objectness scenario
	# vsrl_annot_file:	other_annotations/test_vcoco/interactions_test.json
	# coco_file: 		other_annotations/test_vcoco/instances_test.json
	# split_file: 		images_trainval.ids

vcocoeval._do_eval('detections.pkl', ovr_thresh=0.5)
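
The `ovr_thresh` argument is an overlap threshold between predicted and ground-truth boxes. Assuming, as in the V-COCO evaluation, that overlap is measured as intersection over union (IoU), the matching test can be sketched as below. `box_iou` and `is_match` are hypothetical helpers; the actual evaluation code may use a slightly different pixel convention or a strict comparison.

```python
from typing import Sequence


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Sequence[float], gt: Sequence[float], ovr_thresh: float = 0.5) -> bool:
    """True when the predicted box overlaps the ground truth enough."""
    return box_iou(pred, gt) >= ovr_thresh
```

With `ovr_thresh=0.5`, two boxes of equal size shifted by half their width (IoU of 1/3) would not be counted as a match.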

License

Data annotations are released under the Creative Commons Attribution-NonCommercial 4.0 license (see the LICENSE file in the H2O.zip file).

The evaluation code is released under the MIT license.

Citation

A. Orcesi, R. Audigier, F. Poka Toukam and B. Luvison, "Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO," 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), 2021, pp. 1-8, doi: 10.1109/FG52635.2021.9667005

https://arxiv.org/abs/2201.02396

@INPROCEEDINGS{h2o_dataset_2021,
	author={Orcesi, Astrid and Audigier, Romaric and Poka Toukam, Fritz and Luvison, Bertrand},
	booktitle={2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)},
	title={Detecting Human-to-Human-or-Object (H$^2$O) Interactions with DIABOLO},
	year={2021},
	month={dec},
	pages={1-8},
	doi={10.1109/FG52635.2021.9667005},
	url={https://doi.org/10.1109%2Ffg52635.2021.9667005}}

Contact

If you have any questions about this dataset, you can contact us by email at: h2o@cea.fr