Unverified commit 17f33361 authored by Tai-Wang, committed by GitHub

[Doc] Add the Lyft dataset tutorial (#849)

* Update nuscenes_det.md

* Create lyft_det.md

* Update index.rst

* Fix typos

* Create lyft_det.md

* Update index.rst

* Update nuscenes_det.md
parent a8b744b5
@@ -2,6 +2,7 @@
:maxdepth: 2
nuscenes_det.md
+lyft_det.md
waymo_det.md
sunrgbd_det.md
scannet_det.md
......
# Lyft Dataset for 3D Object Detection
This page provides specific tutorials on the usage of MMDetection3D for the Lyft dataset.
## Before Preparation
You can download the Lyft 3D detection data [HERE](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/data) and unzip all zip files.
As with other datasets, it is recommended to symlink the dataset root to `$MMDETECTION3D/data`.
The folder structure should be organized as follows before our processing.
```
mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│   ├── lyft
│   │   ├── v1.01-train
│   │   │   ├── v1.01-train (train_data)
│   │   │   ├── lidar (train_lidar)
│   │   │   ├── images (train_images)
│   │   │   ├── maps (train_maps)
│   │   ├── v1.01-test
│   │   │   ├── v1.01-test (test_data)
│   │   │   ├── lidar (test_lidar)
│   │   │   ├── images (test_images)
│   │   │   ├── maps (test_maps)
│   │   ├── train.txt
│   │   ├── val.txt
│   │   ├── test.txt
│   │   ├── sample_submission.csv
```
Here `v1.01-train` and `v1.01-test` contain the metafiles, which are similar to those of nuScenes, and the `.txt` files contain the data split information.
Lyft does not provide an official split for the training and validation sets, so we provide one that takes into account the number of objects from different categories in different scenes.
`sample_submission.csv` is the base file for submissions to the Kaggle evaluation server.
Note that the names in parentheses are the original names of the raw downloaded folders; please rename them as shown above for clear organization.
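If you want to make sure the renaming went through correctly, the snippet below is a small optional sanity check, not part of MMDetection3D, that verifies the expected layout with `pathlib`; the `data/lyft` path assumes the symlink described above.
```python
# Optional sanity check of the raw Lyft layout before running the converters
# (a sketch; adjust `data_root` if your dataset lives elsewhere).
from pathlib import Path

data_root = Path('data/lyft')
expected = [
    'v1.01-train/v1.01-train', 'v1.01-train/lidar',
    'v1.01-train/images', 'v1.01-train/maps',
    'v1.01-test/v1.01-test', 'v1.01-test/lidar',
    'v1.01-test/images', 'v1.01-test/maps',
    'train.txt', 'val.txt', 'test.txt', 'sample_submission.csv',
]

for rel in expected:
    status = 'ok' if (data_root / rel).exists() else 'MISSING'
    print(f'{status:7s} {data_root / rel}')
```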
## Dataset Preparation
The way to organize the Lyft dataset is similar to that of nuScenes. We also generate the `.pkl` and `.json` files, which share almost the same structure.
Next, we will mainly focus on the differences between these two datasets. For a more detailed explanation of the info structure, please refer to the [nuScenes tutorial](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/datasets/nuscenes_det.md).
To prepare info files for Lyft, run the following commands:
```bash
python tools/create_data.py lyft --root-path ./data/lyft --out-dir ./data/lyft --extra-tag lyft --version v1.01
python tools/data_converter/lyft_data_fixer.py --version v1.01 --root-folder ./data/lyft
```
Note that the second command fixes a corrupted lidar data file. Please refer to the discussion [here](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/discussion/110000) for more details.
The folder structure after processing should be as below.
```
mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│   ├── lyft
│   │   ├── v1.01-train
│   │   │   ├── v1.01-train (train_data)
│   │   │   ├── lidar (train_lidar)
│   │   │   ├── images (train_images)
│   │   │   ├── maps (train_maps)
│   │   ├── v1.01-test
│   │   │   ├── v1.01-test (test_data)
│   │   │   ├── lidar (test_lidar)
│   │   │   ├── images (test_images)
│   │   │   ├── maps (test_maps)
│   │   ├── train.txt
│   │   ├── val.txt
│   │   ├── test.txt
│   │   ├── sample_submission.csv
│   │   ├── lyft_infos_train.pkl
│   │   ├── lyft_infos_val.pkl
│   │   ├── lyft_infos_test.pkl
│   │   ├── lyft_infos_train_mono3d.coco.json
│   │   ├── lyft_infos_val_mono3d.coco.json
│   │   ├── lyft_infos_test_mono3d.coco.json
```
Here, the `.pkl` files are generally used for methods involving point clouds, while the coco-style `.json` files are more suitable for image-based methods, such as image-based 2D and 3D detection.
Different from nuScenes, we only support using the `.json` files for 2D detection experiments on Lyft. Image-based 3D detection may be supported in the future.
Next, we will elaborate on the differences from nuScenes in terms of the details recorded in these info files.
- without `lyft_database/xxxxx.bin`: This folder and the `.bin` files are not extracted for the Lyft dataset, since ground-truth sampling has a negligible effect in our experiments.
- `lyft_infos_train.pkl`: training dataset infos, each frame info has two keys: `metadata` and `infos`.
`metadata` contains the basic information of the dataset itself, such as `{'version': 'v1.01-train'}`, while `infos` contains the same detailed information as nuScenes except for the following details:
    - info['sweeps']: Sweeps information.
        - info['sweeps'][i]['type']: The sweep data type, e.g., `'lidar'`.
        Lyft has different LiDAR settings for some samples, but we always take only the points collected by the top LiDAR for the consistency of data distribution.
    - info['gt_names']: There are 9 categories on the Lyft dataset, and the imbalance of annotations across categories is even more significant than on nuScenes.
    - without info['gt_velocity']: There is no velocity measurement on Lyft.
    - info['num_lidar_pts']: Set to -1 by default.
    - info['num_radar_pts']: Set to 0 by default.
    - without info['valid_flag']: This flag is not recorded because the `num_lidar_pts` and `num_radar_pts` above are invalid.
- `lyft_infos_train_mono3d.coco.json`: training dataset coco-style info. This file only contains 2D information, without the information required by 3D detection, such as camera intrinsics.
    - info['images']: A list containing all the image info.
        - only containing `'file_name'`, `'id'`, `'width'`, `'height'`.
    - info['annotations']: A list containing all the annotation info.
        - only containing `'file_name'`, `'image_id'`, `'area'`, `'category_name'`, `'category_id'`, `'bbox'`, `'is_crowd'`, `'segmentation'`, `'id'`, where `'is_crowd'` and `'segmentation'` are set to `0` and `[]` by default, respectively.
        There is no attribute annotation on Lyft.
Here we only explain the data recorded in the training info files; the same applies to the validation and testing sets.
The core function to get `lyft_infos_xxx.pkl` is [\_fill_trainval_infos](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/data_converter/lyft_converter.py#L91).
Please refer to [lyft_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/data_converter/lyft_converter.py) for more details.
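If you want to take a quick look at what is actually stored in these files, the following is a minimal sketch, not part of MMDetection3D, that loads the generated info files with the standard `pickle` and `json` modules and prints a few of the fields described above; the paths assume the folder structure shown earlier, and the exact set of keys may differ across versions.
```python
# Minimal inspection of the generated Lyft info files (a sketch; field names
# follow the description above and may differ across MMDetection3D versions).
import json
import pickle

with open('data/lyft/lyft_infos_train.pkl', 'rb') as f:
    data = pickle.load(f)

print(data['metadata'])        # e.g. {'version': 'v1.01-train'}
print(len(data['infos']))      # number of training samples

info = data['infos'][0]
print(sorted(info.keys()))     # the per-frame keys discussed above
print(info['gt_names'])        # category names of the boxes in this frame

with open('data/lyft/lyft_infos_train_mono3d.coco.json') as f:
    coco = json.load(f)
print(len(coco['images']), len(coco['annotations']))
```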
## Training pipeline
### LiDAR-Based Methods
A typical training pipeline of LiDAR-based 3D detection (including multi-modality methods) on Lyft is almost the same as that of nuScenes, as shown below.
```python
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadPointsFromMultiSweeps',
        sweeps_num=10,
        file_client_args=file_client_args),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.3925, 0.3925],
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0, 0, 0]),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
```
Similar to nuScenes, models on Lyft also need the `'LoadPointsFromMultiSweeps'` pipeline to load point clouds from consecutive frames.
In addition, since the intensity recorded for the LiDAR points collected by Lyft is invalid, the `use_dim` of `'LoadPointsFromMultiSweeps'` is set to `[0, 1, 2, 4]` by default,
where the first 3 dimensions refer to the point coordinates and the last one refers to the timestamp difference.
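For reference, this is how the two point-loading steps would look with `use_dim` written out explicitly. It is only a sketch of the default behavior described above; `file_client_args` here is simply the common local-disk setting rather than a value taken from a specific config.
```python
# Point loading on Lyft with `use_dim` spelled out (a sketch of the defaults
# described above; `file_client_args` stands in for your storage backend).
file_client_args = dict(backend='disk')

lyft_point_loading = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadPointsFromMultiSweeps',
        sweeps_num=10,
        # keep x, y, z and the timestamp difference; drop the invalid intensity (dim 3)
        use_dim=[0, 1, 2, 4],
        file_client_args=file_client_args),
]
```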
## Evaluation
An example of evaluating PointPillars with 8 GPUs using the Lyft metric is as follows.
```shell
bash ./tools/dist_test.sh configs/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d.py checkpoints/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210517_202818-fc6904c3.pth 8 --eval bbox
```
## Metrics
Lyft proposes a stricter metric for evaluating the predicted 3D bounding boxes.
The basic criterion to judge whether a predicted box is positive or not is the same as in KITTI, i.e., the 3D Intersection over Union (IoU).
However, it adopts a way similar to COCO to compute the mean average precision (mAP): the average precision is computed under different 3D IoU thresholds from 0.5 to 0.95.
In practice, a 3D IoU above 0.7 is already a quite strict criterion for 3D detection methods, so the overall performance appears relatively low.
The imbalance of annotations across categories is another important reason why the final results are lower than on other datasets.
Please refer to the [official website](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/overview/evaluation) for more details about the definition of this metric.
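To make the averaging concrete, here is a simplified numeric sketch of how the final score is assembled: per-class average precision is computed at the 3D IoU thresholds 0.5, 0.55, ..., 0.95, averaged over thresholds, and then averaged over the 9 classes. The AP values below are random dummies, and this is not the official Kaggle evaluation code.
```python
# Simplified illustration of the Lyft mAP@0.5:0.95 aggregation
# (dummy AP values, not the official Kaggle implementation).
import numpy as np

iou_thresholds = np.linspace(0.5, 0.95, 10)   # 0.5, 0.55, ..., 0.95
num_classes = 9                               # animal, bicycle, ..., truck

# dummy per-class AP at each threshold, shape (num_classes, num_thresholds)
ap = np.random.rand(num_classes, len(iou_thresholds))

per_class_map = ap.mean(axis=1)   # mAP@0.5:0.95 per class (one row of the table below)
overall = per_class_map.mean()    # the "Overall" entry
print(per_class_map.round(3), round(float(overall), 3))
```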
We employ this official method for evaluation on Lyft. An example of printed evaluation results is as follows:
```
+mAPs@0.5:0.95------+--------------+
| class             | mAP@0.5:0.95 |
+-------------------+--------------+
| animal            | 0.0          |
| bicycle           | 0.099        |
| bus               | 0.177        |
| car               | 0.422        |
| emergency_vehicle | 0.0          |
| motorcycle        | 0.049        |
| other_vehicle     | 0.359        |
| pedestrian        | 0.066        |
| truck             | 0.176        |
| Overall           | 0.15         |
+-------------------+--------------+
```
## Testing and Making a Submission
An example of testing PointPillars on Lyft with 8 GPUs and generating a submission to the leaderboard is as follows.
```shell
./tools/dist_test.sh configs/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d.py work_dirs/pp-lyft/latest.pth 8 --out work_dirs/pp-lyft/results_challenge.pkl --format-only --eval-options 'jsonfile_prefix=work_dirs/pp-lyft/results_challenge' 'csv_savepath=results/pp-lyft/results_challenge.csv'
```
After generating `work_dirs/pp-lyft/results_challenge.csv`, you can submit it to the Kaggle evaluation server. Please refer to the [official website](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles) for more information.
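Before uploading, you may want to run a quick sanity check on the generated file. The following sketch is not part of MMDetection3D and assumes `pandas` is installed; it only compares the column layout against the `sample_submission.csv` shipped with the dataset and prints a preview.
```python
# Optional sanity check of the generated submission file (a sketch; the paths
# assume the commands above were used).
import pandas as pd

sample = pd.read_csv('data/lyft/sample_submission.csv')
result = pd.read_csv('work_dirs/pp-lyft/results_challenge.csv')

assert list(result.columns) == list(sample.columns), 'unexpected column layout'
print(result.shape)
print(result.head())
```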
We can also visualize the prediction results with our developed visualization tools. Please refer to the [visualization doc](https://mmdetection3d.readthedocs.io/en/latest/useful_tools.html#visualization) for more details.
@@ -33,7 +33,7 @@ To prepare these files for nuScenes, run the following command:
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
```
-The folder structure after processing should be as below
+The folder structure after processing should be as below.
```
mmdetection3d
@@ -49,12 +49,10 @@ mmdetection3d
| | ├── v1.0-trainval
│   │   ├── nuscenes_database
│   │   ├── nuscenes_infos_train.pkl
-│   │   ├── nuscenes_infos_trainval.pkl
│   │   ├── nuscenes_infos_val.pkl
│   │   ├── nuscenes_infos_test.pkl
│   │   ├── nuscenes_dbinfos_train.pkl
│   │   ├── nuscenes_infos_train_mono3d.coco.json
-│   │   ├── nuscenes_infos_trainval_mono3d.coco.json
│   │   ├── nuscenes_infos_val_mono3d.coco.json
│   │   ├── nuscenes_infos_test_mono3d.coco.json
```
@@ -63,7 +61,7 @@ Here, .pkl files are generally used for methods involving point clouds and coco-
Next, we will elaborate on the details recorded in these info files.
- `nuscenes_database/xxxxx.bin`: point cloud data included in each 3D bounding box of the training dataset
-- `nuscenes_infos_train.pkl`: training dataset infos, each frame info has two keys: `metadata` and `infos`.
+- `nuscenes_infos_train.pkl`: training dataset info, each frame info has two keys: `metadata` and `infos`.
`metadata` contains the basic information for the dataset itself, such as `{'version': 'v1.0-trainval'}`, while `infos` contains the detailed information as follows:
- info['lidar_path']: The file path of the lidar point cloud data.
- info['token']: Sample data token.
@@ -91,9 +89,9 @@ Next, we will elaborate on the details recorded in these info files.
- info['num_lidar_pts']: Number of lidar points included in each 3D bounding box.
- info['num_radar_pts']: Number of radar points included in each 3D bounding box.
- info['valid_flag']: Whether each bounding box is valid. In general, we only take the 3D boxes that include at least one lidar or radar point as valid boxes.
-- `nuscenes_infos_train_mono3d.coco.json`: training dataset coco-style infos. This file organizes image-based data into three categories (keys): `'categories'`, `'images'`, `'annotations'`.
+- `nuscenes_infos_train_mono3d.coco.json`: training dataset coco-style info. This file organizes image-based data into three categories (keys): `'categories'`, `'images'`, `'annotations'`.
- info['categories']: A list containing all the category names. Each element follows the dictionary format and consists of two keys: `'id'` and `'name'`.
-- info['images']: A list containing all the image infos.
+- info['images']: A list containing all the image info.
- info['images'][i]['file_name']: The file name of the i-th image.
- info['images'][i]['id']: Sample data token of the i-th image.
- info['images'][i]['token']: Sample token corresponding to this frame.
@@ -104,7 +102,7 @@ Next, we will elaborate on the details recorded in these info files.
- info['images'][i]['cam_intrinsic']: Camera intrinsic matrix. (3x3 list)
- info['images'][i]['width']: Image width, 1600 by default in nuScenes.
- info['images'][i]['height']: Image height, 900 by default in nuScenes.
-- info['annotations']: A list containing all the annotation infos.
+- info['annotations']: A list containing all the annotation info.
- info['annotations'][i]['file_name']: The file name of the corresponding image.
- info['annotations'][i]['image_id']: The image id (token) of the corresponding image.
- info['annotations'][i]['area']: Area of the 2D bounding box.
@@ -203,7 +201,7 @@ Currently we do not support more augmentation methods, because how to transfer a
## Evaluation
-An example to evaluate PointPillars with 8 GPUs with nuScenes metrics is as follows
+An example to evaluate PointPillars with 8 GPUs with nuScenes metrics is as follows.
```shell
bash ./tools/dist_test.sh configs/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py checkpoints/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405-2fa62f3d.pth 8 --eval bbox
@@ -243,10 +241,10 @@ barrier 0.466 0.581 0.269 0.169 nan nan
## Testing and make a submission
-An example to test PointPillars on kitti with 8 GPUs and generate a submission to the leaderboard is as follows
+An example to test PointPillars on nuScenes with 8 GPUs and generate a submission to the leaderboard is as follows.
```shell
-./tools/dist_test.sh configs/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py work_dirs/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class/latest.pth 8 --out work_dirs/pp-nus/results_eval.pkl --format-only --eval-options 'jsonfile_prefix=work_dirs/pp-nus/results_eval'
+./tools/dist_test.sh configs/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py work_dirs/pp-nus/latest.pth 8 --out work_dirs/pp-nus/results_eval.pkl --format-only --eval-options 'jsonfile_prefix=work_dirs/pp-nus/results_eval'
```
Note that the testing info should be changed to that for testing set instead of validation set [here](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/_base_/datasets/nus-3d.py#L132).
......
@@ -2,6 +2,7 @@
:maxdepth: 2
nuscenes_det.md
+lyft_det.md
waymo_det.md
sunrgbd_det.md
scannet_det.md
......
# Lyft Dataset for 3D Object Detection