Dataset Structure
All ROS bags in the GOOSE DB are organized in a three-level hierarchy of setup, scenario, and sequence. This hierarchy allows users to filter the recorded ROS bags according to their needs. The hierarchy levels are defined as follows:
- A setup describes a robotic platform with its sensor suite. The sensor setup remains consistent among all scenarios within the same setup. When the position or combination of sensors is changed on the same robotic platform, all subsequent recordings should be grouped in a new setup.
- A scenario contains all data recorded in the same measurement run. Global conditions such as the weather and the number of dynamic objects in the scene remain consistent among all sequences in the scenario.
- A sequence corresponds to one ROS bag and includes a YAML file with metadata. The recording length and the number of annotated frames vary between sequences. Each sequence belongs to exactly one scenario and one setup.
The directory tree of the GOOSE dataset could look as follows:
├── setups
│   ├── mucar3
│   │   ├── metadata.yml
│   │   ├── scenario01
│   │   │   ├── metadata.yml
│   │   │   ├── sequence01
│   │   │   │   ├── 2adccef9-e281-4a47-9ade-16e49efa4007.bag
│   │   │   │   └── metadata.yml
│   │   │   ├── sequence02
│   │   │   │   ├── 3e1edb48-5f18-48e8-a86d-11351fdb9d0c.bag
│   │   │   │   └── metadata.yml
│   │   │   ├── sequence03
│   │   │   │   ├── b142123d-9c9b-46ce-8c02-5cd407dc9712.bag
│   │   │   │   └── metadata.yml
│   │   ├── scenario02
│   │   │   ├── metadata.yml
│   │   │   ├── sequence03
│   │   │   │   ├── 901b5784-8a23-42d7-8dcd-e0fc9c28bd53.bag
│   │   │   │   └── metadata.yml
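Because the hierarchy maps directly onto directories, filtering by setup or scenario can be done with a plain directory walk. The following is a minimal sketch, assuming the layout shown above (setups/<setup>/<scenario>/<sequence>/*.bag). The helper name iter_sequences and its return values are illustrative and not part of any GOOSE tooling.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple


def iter_sequences(
    root: Path,
    setup: Optional[str] = None,
    scenario: Optional[str] = None,
) -> Iterator[Tuple[str, str, str, Path]]:
    """Yield (setup, scenario, sequence, bag_path) for every ROS bag found.

    Hypothetical helper: assumes <root>/setups/<setup>/<scenario>/<sequence>/*.bag.
    """
    for bag in sorted((root / "setups").glob("*/*/*/*.bag")):
        sequence_dir = bag.parent
        scenario_dir = sequence_dir.parent
        setup_dir = scenario_dir.parent
        # Apply the optional filters on the directory names.
        if setup is not None and setup_dir.name != setup:
            continue
        if scenario is not None and scenario_dir.name != scenario:
            continue
        yield setup_dir.name, scenario_dir.name, sequence_dir.name, bag
```

Calling iter_sequences(root, setup="mucar3", scenario="scenario01") would then yield only the bags recorded in that measurement run.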
In addition, we provide ready-to-use data for typical deep learning pipelines, as described in the following sections.
2D Image Annotations
The directory tree for the image dataset is as follows:
├── goose_label_mapping.csv
├── images
│   ├── test
│   │   ├── 2022-07-07_campus
│   │   ├── ...
│   │   └── 2023-07-03_campus
│   ├── train
│   │   ├── 2022-07-22_touareg_flight
│   │   ├── ...
│   │   └── 2023-05-24_touareg_neubiberg_cloudy
│   └── val
│       ├── 2022-07-22_flight
│       ├── ...
│       └── 2023-05-17_neubiberg_sunny
└── labels
    ├── test
    │   ├── 2022-07-07_campus
    │   ├── ...
    │   └── 2023-07-03_campus
    ├── train
    │   ├── 2022-07-22_touareg_flight
    │   ├── ...
    │   └── 2023-05-24_touareg_neubiberg_cloudy
    └── val
        ├── 2022-07-22_flight
        ├── ...
        └── 2023-05-17_neubiberg_sunny
The images directory contains one image per scene:
<date>_<title>__<framenumber>_<timestamp>_windshield_vis.png (RGB input image)
In each of the folders in the labels directory, there are three files for each frame, named as follows:
<date>_<title>__<framenumber>_<timestamp>_color.png (RGB Image)
<date>_<title>__<framenumber>_<timestamp>_labelids.png (Class Labels)
<date>_<title>__<framenumber>_<timestamp>_instanceids.png (Instance Labels)
See our experiment section for an example on how to read the data.
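The naming scheme makes it possible to derive the label files for a given image. The sketch below assumes that an image and its label files share the same <date>_<title>__<framenumber>_<timestamp> prefix and that the labels directory mirrors the split and sequence folders of the images directory. The function name and the dictionary keys are illustrative.

```python
from pathlib import Path
from typing import Dict

IMAGE_SUFFIX = "_windshield_vis.png"
LABEL_SUFFIXES = {
    "color": "_color.png",
    "labelids": "_labelids.png",
    "instanceids": "_instanceids.png",
}


def label_paths_for_image(image_path: Path) -> Dict[str, Path]:
    """Map <root>/images/<split>/<sequence>/<prefix>_windshield_vis.png to its label files.

    Hypothetical helper: assumes a shared file name prefix and mirrored folders.
    """
    name = image_path.name
    if not name.endswith(IMAGE_SUFFIX):
        raise ValueError(f"unexpected image file name: {name}")
    prefix = name[: -len(IMAGE_SUFFIX)]
    sequence_dir = image_path.parent          # e.g. 2022-07-22_flight
    split_dir = sequence_dir.parent           # train, val or test
    dataset_root = split_dir.parent.parent    # directory holding images/ and labels/
    label_dir = dataset_root / "labels" / split_dir.name / sequence_dir.name
    return {kind: label_dir / (prefix + suffix) for kind, suffix in LABEL_SUFFIXES.items()}
```

The function only builds paths; it does not check that the files exist, so callers may want to verify that before loading.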
3D Point Cloud Annotations
The directory tree for the point cloud dataset is as follows:
├── goose_label_mapping.csv
├── velodyne
│   ├── test
│   │   ├── 2022-07-07_campus
│   │   ├── ...
│   │   └── 2023-07-03_campus
│   ├── train
│   │   ├── 2022-07-22_touareg_flight
│   │   ├── ...
│   │   └── 2023-05-24_touareg_neubiberg_cloudy
│   └── val
│       ├── 2022-07-22_flight
│       ├── ...
│       └── 2023-05-17_neubiberg_sunny
└── labels
    ├── test
    │   ├── 2022-07-07_campus
    │   ├── ...
    │   └── 2023-07-03_campus
    ├── train
    │   ├── 2022-07-22_touareg_flight
    │   ├── ...
    │   └── 2023-05-24_touareg_neubiberg_cloudy
    └── val
        ├── 2022-07-22_flight
        ├── ...
        └── 2023-05-17_neubiberg_sunny
The velodyne directory contains one point cloud per scene:
<date>_<title>__<framenumber>_<timestamp>_vls128.bin (one LiDAR revolution)
In each of the folders in the labels directory, there is one label file per point cloud:
<date>_<title>__<framenumber>_<timestamp>_goose.label (semantic and instance annotations)
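For training pipelines it is usually convenient to collect each scan together with its label file. The following sketch does this for one split, assuming the directory tree shown above and that a scan and its label file differ only in the _vls128.bin and _goose.label suffixes. The function name scan_label_pairs is illustrative.

```python
from pathlib import Path
from typing import List, Tuple

SCAN_SUFFIX = "_vls128.bin"
LABEL_SUFFIX = "_goose.label"


def scan_label_pairs(root: Path, split: str) -> List[Tuple[Path, Path]]:
    """Return (scan_path, label_path) pairs for one split (train, val or test).

    Hypothetical helper: scans without a matching label file are skipped.
    """
    pairs = []
    for scan in sorted((root / "velodyne" / split).glob("*/*" + SCAN_SUFFIX)):
        prefix = scan.name[: -len(SCAN_SUFFIX)]
        label = root / "labels" / split / scan.parent.name / (prefix + LABEL_SUFFIX)
        if label.is_file():
            pairs.append((scan, label))
    return pairs
```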
The 3D point cloud dataset uses the SemanticKITTI format. The point cloud data can be accessed using the following numpy snippet:
import numpy as np
# reading a .bin file
scan = np.fromfile(filename, dtype=np.float32)
scan = scan.reshape((-1, 4))
# split the array into coordinates and remission values
points = scan[:, 0:3] # get xyz
remissions = scan[:, 3] # get remission
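Since the snippet reads four float32 values per point (x, y, z, remission), each point occupies 16 bytes, and the number of points in a scan follows from the file size alone. The sketch below uses this as a quick sanity check before loading; the helper name count_points is illustrative.

```python
import os

BYTES_PER_POINT = 4 * 4  # four float32 values: x, y, z, remission


def count_points(bin_path: str) -> int:
    """Return the number of points in a .bin scan, derived from its file size."""
    size = os.path.getsize(bin_path)
    if size % BYTES_PER_POINT != 0:
        # A remainder means the file is truncated or not in the expected layout.
        raise ValueError(f"{bin_path}: size {size} is not a multiple of {BYTES_PER_POINT}")
    return size // BYTES_PER_POINT
```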
The data of the .label files can be read in Python using the following numpy code:
import numpy as np
# reading a .label file
label = np.fromfile(filename, dtype=np.uint32)
label = label.reshape((-1))
# extract the semantic and instance label IDs
sem_label = label & 0xFFFF # semantic label in lower half
inst_label = label >> 16 # instance id in upper half
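To make the bit layout explicit, the following sketch applies the same masking and shifting to a single 32-bit label word using plain Python integers, together with the inverse operation. The function names split_label and join_label are illustrative.

```python
from typing import Tuple


def split_label(value: int) -> Tuple[int, int]:
    """Return (semantic_id, instance_id) from one 32-bit label word."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("label word must fit into 32 bits")
    return value & 0xFFFF, value >> 16


def join_label(semantic_id: int, instance_id: int) -> int:
    """Pack a semantic id (lower 16 bits) and an instance id (upper 16 bits)."""
    if not (0 <= semantic_id <= 0xFFFF and 0 <= instance_id <= 0xFFFF):
        raise ValueError("both ids must fit into 16 bits")
    return (instance_id << 16) | semantic_id


word = join_label(23, 5)   # 5 * 65536 + 23 = 327703
print(split_label(word))   # prints (23, 5)
```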