Dataset Structure
All ROS bags in the GOOSE DB are organized in the hierarchical structure of setup, scenario and sequence. This hierarchy allows users to filter the recorded ROS bags according to their needs. The hierarchy levels are defined in the following way:
- A setup describes a robotic platform with its sensor suite. The sensor setup remains consistent among all scenarios within the same setup. When the position and combination of sensors are changed on the same robotic platform, the recordings made from then on should be grouped in a new setup.
- A scenario contains all data recorded in the same measurement run. Global conditions such as the weather and the number of dynamic objects in the scene remain consistent among all sequences in the scenario.
- A sequence corresponds to one ROS bag and includes a YAML file with metadata. The recording length and the number of annotated frames vary from sequence to sequence. Each sequence is assigned to one scenario and one setup.
The directory tree of the GOOSE dataset could look as follows:
├── setups
│   ├── mucar3
│   │   ├── metadata.yml
│   │   ├── scenario01
│   │   │   ├── metadata.yml
│   │   │   ├── sequence01
│   │   │   │   ├── 2adccef9-e281-4a47-9ade-16e49efa4007.bag
│   │   │   │   └── metadata.yml
│   │   │   ├── sequence02
│   │   │   │   ├── 3e1edb48-5f18-48e8-a86d-11351fdb9d0c.bag
│   │   │   │   └── metadata.yml
│   │   │   ├── sequence03
│   │   │   │   ├── b142123d-9c9b-46ce-8c02-5cd407dc9712.bag
│   │   │   │   └── metadata.yml
│   │   ├── scenario02
│   │   │   ├── metadata.yml
│   │   │   ├── sequence03
│   │   │   │   ├── 901b5784-8a23-42d7-8dcd-e0fc9c28bd53.bag
│   │   │   │   └── metadata.yml
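Because the hierarchy maps directly onto directory depth, the bags can be enumerated with a glob pattern. The following is a minimal sketch, assuming the layout shown above (setups/<setup>/<scenario>/<sequence>/ with a .bag file and a metadata.yml per sequence). The names SequenceEntry and iter_sequences are illustrative and not part of any GOOSE tooling.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class SequenceEntry(NamedTuple):
    """One ROS bag together with its position in the hierarchy (hypothetical helper type)."""
    setup: str
    scenario: str
    sequence: str
    bag: Path
    metadata: Path


def iter_sequences(root: Path) -> Iterator[SequenceEntry]:
    """Yield one entry per .bag file under <root>/setups/<setup>/<scenario>/<sequence>/."""
    for bag in sorted(root.glob("setups/*/*/*/*.bag")):
        seq_dir = bag.parent
        yield SequenceEntry(
            setup=seq_dir.parent.parent.name,
            scenario=seq_dir.parent.name,
            sequence=seq_dir.name,
            bag=bag,
            # Assumption: the sequence metadata sits next to the bag.
            metadata=seq_dir / "metadata.yml",
        )
```

Filtering then reduces to a comprehension, for example `[e.bag for e in iter_sequences(root) if e.setup == "mucar3"]`, where root is the directory that contains setups.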
In addition, we provide ready-to-use data for typical deep learning pipelines, as described in the following sections.
2D Image Annotations
The directory tree for the image dataset is as follows:
├── goose_label_mapping.csv
├── images
│   ├── test
│   │   ├── 2022-07-07_campus
│   │   ├── ...
│   │   └── 2023-07-03_campus
│   ├── train
│   │   ├── 2022-07-22_touareg_flight
│   │   ├── ...
│   │   └── 2023-05-24_touareg_neubiberg_cloudy
│   └── val
│       ├── 2022-07-22_flight
│       ├── ...
│       └── 2023-05-17_neubiberg_sunny
└── labels
    ├── test
    │   ├── 2022-07-07_campus
    │   ├── ...
    │   └── 2023-07-03_campus
    ├── train
    │   ├── 2022-07-22_touareg_flight
    │   ├── ...
    │   └── 2023-05-24_touareg_neubiberg_cloudy
    └── val
        ├── 2022-07-22_flight
        ├── ...
        └── 2023-05-17_neubiberg_sunny
The images directory contains one image per scene:
<date>_<title>_<framenumber>_<timestamp>_windshield_vis.png (RGB input Image)
In each of the folders in the labels directory, there are 3 files for each frame, named as follows:
<date>_<title>__<framenumber>_<timestamp>_color.png (RGB Image)
<date>_<title>__<framenumber>_<timestamp>_labelids.png (Class Labels)
<date>_<title>__<framenumber>_<timestamp>_instanceids.png (Instance Labels)
See our experiment section for an example of how to read the data.
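To match an image with its three label files, the file names can be split into their components and compared by date, title, frame number, and timestamp. The sketch below is one possible way to do this with a regular expression. It assumes that the date has the form YYYY-MM-DD, that frame number and timestamp are purely numeric, and that either one or two underscores may precede the frame number. The helper names and the example file names in the usage note are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: <date>_<title>_<framenumber>_<timestamp>_<suffix>.<ext>,
# with one or two underscores in front of the frame number.
_NAME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})_(?P<title>.+?)_{1,2}"
    r"(?P<frame>\d+)_(?P<timestamp>\d+)_(?P<suffix>[A-Za-z0-9_]+)\.(?P<ext>\w+)$"
)


class FrameName(NamedTuple):
    """Components of a frame file name (hypothetical helper type)."""
    date: str
    title: str
    frame: int
    timestamp: int
    suffix: str  # e.g. "windshield_vis", "color", "labelids", "instanceids"


def parse_frame_name(filename: str) -> Optional[FrameName]:
    """Split a file name into its components, or return None if it does not match."""
    m = _NAME_RE.match(filename)
    if m is None:
        return None
    return FrameName(m["date"], m["title"], int(m["frame"]), int(m["timestamp"]), m["suffix"])


def same_frame(a: str, b: str) -> bool:
    """True if both file names parse and refer to the same date, title, frame, and timestamp."""
    pa, pb = parse_frame_name(a), parse_frame_name(b)
    return pa is not None and pb is not None and pa[:4] == pb[:4]
```

With made-up names, `same_frame("2022-07-22_flight_0001_1234567890_windshield_vis.png", "2022-07-22_flight__0001_1234567890_labelids.png")` evaluates to True, since the two names differ only in the suffix and the number of underscores before the frame number.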
3D Point Cloud Annotations
The directory tree for the point cloud dataset is as follows:
├── goose_label_mapping.csv
├── velodyne
│   ├── test
│   │   ├── 2022-07-07_campus
│   │   ├── ...
│   │   └── 2023-07-03_campus
│   ├── train
│   │   ├── 2022-07-22_touareg_flight
│   │   ├── ...
│   │   └── 2023-05-24_touareg_neubiberg_cloudy
│   └── val
│       ├── 2022-07-22_flight
│       ├── ...
│       └── 2023-05-17_neubiberg_sunny
└── labels
    ├── test
    │   ├── 2022-07-07_campus
    │   ├── ...
    │   └── 2023-07-03_campus
    ├── train
    │   ├── 2022-07-22_touareg_flight
    │   ├── ...
    │   └── 2023-05-24_touareg_neubiberg_cloudy
    └── val
        ├── 2022-07-22_flight
        ├── ...
        └── 2023-05-17_neubiberg_sunny
The velodyne directory contains one point cloud per scene:
<date>_<title>__<framenumber>_<timestamp>_vls128.bin (one LiDAR revolution)
In each of the folders in the labels directory, there is one label file:
<date>_<title>__<framenumber>_<timestamp>_goose.label (semantic and instance annotations)
The 3D point cloud dataset uses the SemanticKITTI format. The point cloud data can be accessed using the following numpy snippet:
import numpy as np
# read a .bin file; `filename` is the path to a *_vls128.bin scan
scan = np.fromfile(filename, dtype=np.float32)
# each point is stored as four float32 values: x, y, z, remission
scan = scan.reshape((-1, 4))
points = scan[:, 0:3] # xyz coordinates
remissions = scan[:, 3] # remission values
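Since the velodyne and labels directories mirror each other, the label file for a scan can be located by swapping the top-level directory and the file name suffix. The following sketch assumes that the files sit directly inside the per-recording folders and that scan and label share the same name prefix, as the naming patterns above suggest. The function name iter_scan_label_pairs is illustrative only.

```python
from pathlib import Path
from typing import Iterator, Tuple

SCAN_SUFFIX = "_vls128.bin"
LABEL_SUFFIX = "_goose.label"


def iter_scan_label_pairs(root: Path, split: str) -> Iterator[Tuple[Path, Path]]:
    """Yield (scan, label) paths for one split ("train", "val" or "test").

    Scans without a matching label file are skipped.
    """
    for scan in sorted((root / "velodyne" / split).glob(f"*/*{SCAN_SUFFIX}")):
        # Replace the scan suffix with the label suffix, keeping the shared prefix.
        label_name = scan.name[: -len(SCAN_SUFFIX)] + LABEL_SUFFIX
        label = root / "labels" / split / scan.parent.name / label_name
        if label.is_file():
            yield scan, label
```

Here root is the directory that contains velodyne and labels. Skipping unlabeled scans rather than raising an error is a design choice of this sketch; a stricter loader could raise instead.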
The data of the .label files can be read in Python using the following numpy code:
import numpy as np
# read a .label file; `filename` is the path to a *_goose.label file
label = np.fromfile(filename, dtype=np.uint32)
label = label.reshape((-1))
# each uint32 entry packs the semantic label and the instance ID
sem_label = label & 0xFFFF # semantic label in the lower 16 bits
inst_label = label >> 16 # instance ID in the upper 16 bits
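The two snippets can be combined into a single loader that also checks that the scan and the label file describe the same number of points. This is a minimal sketch built on the reading code above; the function names load_scan_with_labels and points_per_class are hypothetical and not part of an official GOOSE API.

```python
import numpy as np


def load_scan_with_labels(scan_path, label_path):
    """Load one scan and its labels; returns (points, remissions, sem_label, inst_label)."""
    scan = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)
    label = np.fromfile(label_path, dtype=np.uint32)
    # Every point needs exactly one label entry.
    if scan.shape[0] != label.shape[0]:
        raise ValueError(
            f"point/label count mismatch: {scan.shape[0]} points, {label.shape[0]} labels"
        )
    sem_label = label & 0xFFFF  # lower 16 bits: semantic label
    inst_label = label >> 16    # upper 16 bits: instance ID
    return scan[:, :3], scan[:, 3], sem_label, inst_label


def points_per_class(sem_label):
    """Count the points per semantic label ID."""
    ids, counts = np.unique(sem_label, return_counts=True)
    return dict(zip(ids.tolist(), counts.tolist()))
```

The count check catches mismatched file pairs early, before a training pipeline silently indexes labels that belong to a different scan.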