Our dataset is designed to allow simulation of robotic motion through an environment for object detection. We collect data in many scenes; each scene consists of one or more rooms in a home or office.

Collection Procedure:

For most scenes we conduct a full scan, then move object instances around the scene and collect either a full or a partial second scan. A partial scan has far fewer images and less coverage of the scene than a full scan. A number of object instances from the BigBIRD dataset are placed in every scene, and the second scan usually contains a different subset of BigBIRD instances than the first. At various locations in each scene we sample 12 images, 30 degrees apart. We recommend using our visualization code with the example scene to understand the dataset. More detailed information can be found in our paper and on the Add Data page.


Objects:

Currently we label 33 unique instances in our scenes. More info about these instances can be found in the instances tab above.

Check out our github for code to visualize and load our data.


When you download our dataset you will receive:


One directory for each scan of each scene holding:
  • RGB images: Lossy JPG-compressed format, 1920x1080 resolution.
  • Depth images: 16-bit PNG format, registered to the same resolution as the RGB images, then losslessly compressed using the optipng tool (see the read sketch after this list).
  • annotations.json: See our format in the next tab. Annotations include 2D bounding boxes and pointers to allow movement through each scene.
  • image_structs.mat: Holds data including the reconstructed camera position for each image; used in some of our visualizations.
  • present_instance_names.txt: List of all the instances in the scene, using their names from BigBIRD.
  • special_note.txt: In some scenes an instance may appear twice, though never twice in the same image. These scenes include this file to indicate which instances appear twice. As of this writing, the scenes are Home_006_1 and Office_001_1.
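
The depth PNGs must be read as 16-bit images; many default image readers silently convert them to 8 bits. A minimal Python sketch using OpenCV (the file path here is illustrative, not a fixed layout):

import cv2

# IMREAD_UNCHANGED preserves the 16-bit depth values.
depth = cv2.imread("Home_001_1/000110000010103.png", cv2.IMREAD_UNCHANGED)
print(depth.dtype, depth.shape)  # expect uint16, (1080, 1920) after registration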

Upon request, we may also provide the following:
  • Sparse/dense reconstructions of each scene
  • 3D point cloud labels of each instance in the dense reconstruction
  • Original-resolution (512x424) depth images
  • Original uncompressed PNG-format RGB images


Here we define our formats for bounding boxes, annotations, and image names.


Bounding Box Format


Each bounding box has 6 numbers: 
[xmin ymin xmax ymax instance_id difficulty]

1. xmin - minimum x value of bounding box
2. ymin - minimum y value of bounding box
3. xmax - maximum x value of bounding box
4. ymax - maximum y value of bounding box
5. instance_id - numeric id of the instance that is labeled
6. difficulty - a measure of how difficult the box 
                may be for a detection algorithm to predict

*Currently, difficulty is just a measure of the box size, defined below.
 We hope to improve this to account for occlusion in the future. 


if box > (300x100)
    difficulty = 1;
elseif box > (200x75)
    difficulty = 2;
elseif box > (100x50)
    difficulty = 3;
elseif box > (50x30)
    difficulty = 4;
else
    difficulty = 5;
end 

**A box's larger dimension must be greater than the first number, 
  and its smaller dimension greater than the second number.

Ex)  A box that is 250x80 has difficulty 2.
     A box that is 80x250 has difficulty 2.
     A box that is 250x60 has difficulty 3.
     A box that is 60x250 has difficulty 3.
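
The same rule as a small Python sketch (the function name and coordinate arguments are our own, following the bounding box format above):

def box_difficulty(xmin, ymin, xmax, ymax):
    # Larger dimension is checked against the first threshold number,
    # smaller dimension against the second, per the rule above.
    big = max(xmax - xmin, ymax - ymin)
    small = min(xmax - xmin, ymax - ymin)
    for difficulty, (b, s) in enumerate(
            [(300, 100), (200, 75), (100, 50), (50, 30)], start=1):
        if big > b and small > s:
            return difficulty
    return 5

assert box_difficulty(0, 0, 250, 80) == 2   # matches the examples above
assert box_difficulty(0, 0, 250, 60) == 3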


**We do not guarantee the presence of boxes smaller than 50x30.
 That is, if an object's true box is smaller than 50x30,
 we may not have labeled it. Many of these boxes
 are labeled, however, and any box that is labeled contains 
 the object.
        

Annotation Format

Each scene has one JSON file containing all annotations, with the following format:

{
    "image_name":{
      "bounding_boxes":[
          [xmin ymin xmax ymax instance_id difficulty],
          [xmin ymin xmax ymax instance_id difficulty],
          ...,
      ],
      "rotate_ccw":"another_image_name",
      "rotate_cw":"another_image_name",
      "forward":"another_image_name",
      "backward":"another_image_name",
      "left":"another_image_name",
      "right":"another_image_name"
  },
  "next_image_name":{ ...

}
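
As a usage sketch, the following Python snippet loads a scan's annotations and follows the rotation pointers around one viewpoint (the path and variable names are illustrative):

import json

with open("Home_001_1/annotations.json") as f:
    annotations = json.load(f)

image_name = sorted(annotations)[0]          # start from any image in the scan
for _ in range(12):                          # 12 images, 30 degrees apart
    for xmin, ymin, xmax, ymax, instance_id, difficulty in \
            annotations[image_name]["bounding_boxes"]:
        print(image_name, instance_id, difficulty)
    next_name = annotations[image_name]["rotate_ccw"]
    if not next_name:                        # defensive: assumes a missing move may be empty
        break
    image_name = next_name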
        

Image Name Format

***Each image has a unique name.***
Our image names have 15 digits, followed by the file extension (.jpg or .png).

Example: 000120001620101.jpg 


(Digit 1) = Scene Type
        0 - Home
        1 - Office 


(Digits 2-4) = Scene Number 
        Ex. 001 = scene 1 of this type


(Digit 5) = Scan Number 
        Ex. 2 = scan 2 of this scene

(Digits 6-11) = Image Index 
       Ex. 000162 = the 162nd image captured in this scan 

(Digits 12-13) = Camera Index 
       Ex. 01 = this image was taken with Camera 1

(Digits 14-15) = Image Type 
        01 - RGB 
        02 - raw_depth (512x424)
        03 - high_res_depth (1920x1080)
        05 - improved_depths (see paper, not currently available for download)




So for image 000120001620101.jpg:

Home, Scene 1, Scan 2, Image 162, Camera 1, RGB image
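
Putting the fields together, a small Python helper can decode any image name (the function and field names below are our own, not part of the dataset code):

def parse_image_name(name):
    stem = name.split(".")[0]              # drop the .jpg / .png extension
    assert len(stem) == 15
    return {
        "scene_type": {"0": "Home", "1": "Office"}[stem[0]],   # digit 1
        "scene_number": int(stem[1:4]),                        # digits 2-4
        "scan_number": int(stem[4]),                           # digit 5
        "image_index": int(stem[5:11]),                        # digits 6-11
        "camera_index": int(stem[11:13]),                      # digits 12-13
        "image_type": {"01": "RGB", "02": "raw_depth",         # digits 14-15
                       "03": "high_res_depth",
                       "05": "improved_depths"}[stem[13:15]],
    }

parse_image_name("000120001620101.jpg")
# -> Home, scene 1, scan 2, image 162, camera 1, RGB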


        

Download Our Data

Here you can download our entire dataset, excluding the few scans we have held out for testing purposes. If you don't want to download everything, check out our example scene below. An evaluation server will be online in the future. To reduce the download size, we have broken the dataset into a few .tar files. Check out the instances tab to see examples of our instances and download some images of them.

Example Scene (1.2GB)

Instead of downloading our entire dataset, you can get an idea of what our images look like, how our labels are formatted, and how our data is organized by downloading this single example scan. Check out our github for code to visualize and load our data.
Note: This scan is also included in our full dataset.

Example scan

Description of common instances

We place a subset of our 33 common instances in each scene. When we can, we ask the owner of the home to place the objects in natural places, to avoid any bias in object placement.
We chose our instances based on instances in the BigBIRD dataset, but not all of our instances are exact matches.
Below we provide:

  • Instance names/ids
  • Instance images

advil_liqui_gels
aunt_jemima_original_syrup
bumblebee_albacore
cholula_chipotle_hot_sauce
coca_cola_glass_bottle
crest_complete_minty_fresh
crystal_hot_sauce
expo_marker_red
hersheys_bar
honey_bunches_of_oats_honey_roasted
honey_bunches_of_oats_with_almonds
hunts_sauce
listerine_green
mahatma_rice
nature_valley_granola_thins_dark_chocolate
nature_valley_sweet_and_salty_nut_almond
nature_valley_sweet_and_salty_nut_cashew
nature_valley_sweet_and_salty_nut_peanut
nature_valley_sweet_and_salty_nut_roasted_mix_nut
nutrigrain_harvest_blueberry_bliss
paper_plate
pepto_bismol
pringles_bbq
progresso_new_england_clam_chowder
quaker_chewy_low_fat_chocolate_chunk
red_bull
red_cup
softsoap_clear
softsoap_gold
softsoap_white
spongebob_squarepants_fruit_snaks
tapatio_hot_sauce
vo5_tea_therapy_healthful_green_tea_smoothing_shampoo

Experiments

Here we provide access to code to replicate the experiments in our original paper.




Instance Detection

Github



Active Vision

Github ||| Extra data

Here are the train/test splits we use in our experiments:

Train 1: Home_002_1, Home_003_1, Home_003_2, Home_004_1, Home_004_2, Home_005_1, Home_005_2, Home_006_1, Home_014_1, Home_014_2, Office_001_1
Test 1: Home_001_1, Home_001_2, Home_008_1

Train 2: Home_001_1, Home_001_2, Home_002_1, Home_004_1, Home_004_2, Home_005_1, Home_005_2, Home_006_1, Home_008_1, Home_014_1, Home_014_2
Test 2: Home_003_1, Home_003_2, Office_001_1

Train 3: Home_001_1, Home_001_2, Home_003_1, Home_003_2, Home_004_1, Home_004_2, Home_005_1, Home_005_2, Home_006_1, Home_008_1, Office_001_1
Test 3: Home_014_1, Home_014_2, Home_002_1
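
For programmatic use, the same splits as a Python structure (scan names copied from the table above; the constant name is our own):

# Train/test scan splits from the table above.
SPLITS = {
    1: {"train": ["Home_002_1", "Home_003_1", "Home_003_2", "Home_004_1",
                  "Home_004_2", "Home_005_1", "Home_005_2", "Home_006_1",
                  "Home_014_1", "Home_014_2", "Office_001_1"],
        "test": ["Home_001_1", "Home_001_2", "Home_008_1"]},
    2: {"train": ["Home_001_1", "Home_001_2", "Home_002_1", "Home_004_1",
                  "Home_004_2", "Home_005_1", "Home_005_2", "Home_006_1",
                  "Home_008_1", "Home_014_1", "Home_014_2"],
        "test": ["Home_003_1", "Home_003_2", "Office_001_1"]},
    3: {"train": ["Home_001_1", "Home_001_2", "Home_003_1", "Home_003_2",
                  "Home_004_1", "Home_004_2", "Home_005_1", "Home_005_2",
                  "Home_006_1", "Home_008_1", "Office_001_1"],
        "test": ["Home_014_1", "Home_014_2", "Home_002_1"]},
}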