    FreeAnchor for 3D Object Detection

    Introduction

    We implement FreeAnchor for 3D detection and provide its first results with PointPillars on the nuScenes dataset. With the implemented FreeAnchor3DHead, a PointPillars detector with a big backbone (e.g., RegNet-3.2GF) achieves top performance on the nuScenes benchmark.

    @inproceedings{zhang2019freeanchor,
      title   =  {{FreeAnchor}: Learning to Match Anchors for Visual Object Detection},
      author  =  {Zhang, Xiaosong and Wan, Fang and Liu, Chang and Ji, Rongrong and Ye, Qixiang},
      booktitle =  {Neural Information Processing Systems},
      year    =  {2019}
    }

    Usage

    Modify config

    As in the baseline config, only the head of an existing one-stage detector needs to be replaced to use the FreeAnchor head. Because the config inherits from a common detector head, _delete_=True is necessary so that the inherited head settings are dropped rather than merged, which avoids conflicts. The hyperparameters are tuned according to the original paper.

    _base_ = [
        '../_base_/models/hv_pointpillars_fpn_nus.py',
        '../_base_/datasets/nus-3d.py', '../_base_/schedules/schedule_2x.py',
        '../_base_/default_runtime.py'
    ]
    
    model = dict(
        pts_bbox_head=dict(
            _delete_=True,
            type='FreeAnchor3DHead',
            num_classes=10,
            in_channels=256,
            feat_channels=256,
            use_direction_classifier=True,
            pre_anchor_topk=25,
            bbox_thr=0.5,
            gamma=2.0,
            alpha=0.5,
            anchor_generator=dict(
                type='AlignedAnchor3DRangeGenerator',
                ranges=[[-50, -50, -1.8, 50, 50, -1.8]],
                scales=[1, 2, 4],
                sizes=[
                    [0.8660, 2.5981, 1.],  # 1.5/sqrt(3)
                    [0.5774, 1.7321, 1.],  # 1/sqrt(3)
                    [1., 1., 1.],
                    [0.4, 0.4, 1],
                ],
                custom_values=[0, 0],
                rotations=[0, 1.57],
                reshape_out=True),
            assigner_per_size=False,
            diff_rad_by_sin=True,
            dir_offset=0.7854,  # pi/4
            dir_limit_offset=0,
            bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=9),
            loss_cls=dict(
                type='FocalLoss',
                use_sigmoid=True,
                gamma=2.0,
                alpha=0.25,
                loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=0.8),
            loss_dir=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)))
    # model training and testing settings
    train_cfg = dict(
        pts=dict(code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.25, 0.25]))
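
The effect of _delete_=True can be illustrated with a small, self-contained sketch. The merge_cfg function below is a simplified stand-in written for this example, not the actual config loader, and the base head contents (including base_only_option) are hypothetical. It only aims to show why an inherited head would otherwise leak settings into FreeAnchor3DHead.

```python
import copy

DELETE_KEY = '_delete_'


def merge_cfg(base, child):
    """Recursively merge `child` into a copy of `base`.

    Simplified illustration of config inheritance: nested dicts are merged
    key by key unless the child dict sets `_delete_=True`, in which case the
    inherited dict is discarded and replaced by the child dict.
    """
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = copy.deepcopy(value)
            replace = value.pop(DELETE_KEY, False)
            if not replace and isinstance(merged.get(key), dict):
                merged[key] = merge_cfg(merged[key], value)
            else:
                # Start from an empty dict so nothing is inherited.
                merged[key] = merge_cfg({}, value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base head; `base_only_option` stands for any argument that the
# inherited head accepts but the new head does not.
base = dict(
    pts_bbox_head=dict(
        type='Anchor3DHead', num_classes=10, base_only_option=True))

child_plain = dict(
    pts_bbox_head=dict(type='FreeAnchor3DHead', pre_anchor_topk=25))
child_delete = dict(
    pts_bbox_head=dict(
        _delete_=True, type='FreeAnchor3DHead', pre_anchor_topk=25))

without_delete = merge_cfg(base, child_plain)
with_delete = merge_cfg(base, child_delete)

print(sorted(without_delete['pts_bbox_head']))
print(sorted(with_delete['pts_bbox_head']))
```

Without _delete_, the merged head still carries the base-only keys, which the new head's constructor would not expect. With _delete_=True, only the keys written in the child config remain.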

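The rounded constants in the config can be checked against the inline comments. The sketch below assumes that the comments 1.5/sqrt(3) and 1/sqrt(3) describe the first entry of each anchor size; the helper name footprint is introduced here for illustration only. It also shows an observation from the numbers themselves: the second entry is three times the first, so the footprint area equals the square of the nominal side.

```python
import math


def footprint(side):
    """Return the two footprint dimensions for a nominal side length.

    Assumed reading of the config comments: first = side / sqrt(3),
    second = 3 * first, so that first * second == side ** 2.
    """
    first = side / math.sqrt(3)
    return first, 3 * first


# Values copied from the config above.
config_sizes = {1.5: (0.8660, 2.5981), 1.0: (0.5774, 1.7321)}
dir_offset = 0.7854

for side, (cfg_first, cfg_second) in config_sizes.items():
    first, second = footprint(side)
    print(f"side={side}: computed ({first:.4f}, {second:.4f}), "
          f"config ({cfg_first}, {cfg_second}), area={first * second:.4f}")

print(f"pi/4 = {math.pi / 4:.4f}, config dir_offset = {dir_offset}")
```

The rotation value 1.57 in the config is likewise pi/2 rounded to two decimals.
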
    Results

    PointPillars

    Backbone            FreeAnchor  Lr schd  Mem (GB)  Inf time (fps)  mAP    NDS   Download
    FPN                             2x       17.1                      40.0   53.3  model | log
    FPN                             2x       16.2                      43.7   55.3  model | log
    RegNetX-400MF-FPN               2x       17.3                      44.8   56.4  model | log
    RegNetX-400MF-FPN               2x       17.7                      47.9   58.6  model | log
    RegNetX-1.6GF-FPN               2x       24.3                      51.2   60.8  model | log
    RegNetX-1.6GF-FPN*              3x       24.3                      53.0   62.2  model | log
    RegNetX-3.2GF-FPN               2x       29.5                      52.2   62.0  model | log
    RegNetX-3.2GF-FPN*              3x       29.5                      55.09  63.5  model | log

    Note: Models marked with * are trained with stronger augmentation: vertical flip in bird's-eye view, global translation, and a larger range of global rotation.