$$\Large \text{A Seasoned Beginner's Ramblings on PyTorch and Deep Learning: CenterPoint, a First Hands-On Network}$$

Author: Okamasa店老板, 2023-02-09 03:10:59

Let's dive straight into an open-source point cloud project.

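Introduction to Point Clouds

A point cloud is a set of 3D points. Each point generally carries three attributes, x, y and z, which give its coordinates in 3D space.
Depending on the lidar type, each point may also carry its beam index and horizontal angle.
When several lidars are used, each point also carries a lidar ID.
At its core, though, a point cloud is just the positions (x, y, z) of points in 3D space.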
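To make the layout concrete, here is a minimal sketch of a point cloud as a plain array. The column order and the helper name `make_point_cloud` are assumptions made up for illustration; real datasets each define their own layout.

```python
import numpy as np

# Hypothetical column layout for illustration; real datasets differ.
COLUMNS = ("x", "y", "z", "ring", "azimuth", "lidar_id")

def make_point_cloud(xyz, ring, lidar_id):
    """Pack per-point attributes into one (N, 6) float32 array."""
    xyz = np.asarray(xyz, dtype=np.float32)
    # Horizontal angle of each point around the sensor's z axis, in radians.
    azimuth = np.arctan2(xyz[:, 1], xyz[:, 0])
    return np.column_stack([xyz, ring, azimuth, lidar_id]).astype(np.float32)

cloud = make_point_cloud(
    xyz=[[1.0, 0.0, 0.2], [0.0, 2.0, -0.5], [-3.0, 0.0, 1.0]],
    ring=[0, 5, 12],
    lidar_id=[0, 0, 1],
)
positions = cloud[:, :3]  # the part every point cloud has
```

Whatever extra columns a sensor provides, slicing the first three gives the pure positions.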

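Introduction to CenterPoint

The main contribution of this model is a center-based representation of objects. That is, the network no longer outputs boxes directly: it predicts object center points, plus an offset that refines each center and the remaining box attributes regressed at that center. We can start learning from the open-source code. Project link 👉 CenterPoint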
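A rough sketch of what "center plus offset" means at decode time: take the strongest peak of a heatmap, refine it with the regressed offset, and convert the cell index back to metres. The function name `decode_center` and the value `out_size_factor=4` are assumptions for illustration, not taken from the repository.

```python
import numpy as np

def decode_center(heatmap, reg, voxel_size, pc_min, out_size_factor):
    """Turn the strongest heatmap peak into a metric (x, y) center.

    heatmap: (H, W) scores for one class; reg: (2, H, W) sub-cell offsets.
    """
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Refine the integer cell index with the regressed sub-cell offset.
    xs = col + reg[0, row, col]
    ys = row + reg[1, row, col]
    # One output cell covers out_size_factor input voxels (assumed value).
    x = xs * out_size_factor * voxel_size[0] + pc_min[0]
    y = ys * out_size_factor * voxel_size[1] + pc_min[1]
    return float(x), float(y), float(heatmap[row, col])

heatmap = np.zeros((128, 128), dtype=np.float32)
heatmap[64, 70] = 0.9                       # one confident peak
reg = np.full((2, 128, 128), 0.5, dtype=np.float32)
x, y, score = decode_center(heatmap, reg, (0.2, 0.2), (-51.2, -51.2), 4)
```

A real decoder would keep the top-K peaks per class rather than a single argmax; the single peak just keeps the sketch short.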

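Important tip for getting it to run!!!

The code is already several years old and tool versions change quickly, so I strongly recommend installing it inside Docker.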

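Model

Network Structure

The concrete type of each layer is defined in configs. This post breaks down the network structure of configs/nusc/pp/nusc_centerpoint_pp_02voxel_two_pfn_10sweep.py. The relevant config is: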

import logging

# `tasks` is the per-task class list shown in the CenterHead section.
model = dict(
    type="PointPillars",
    pretrained=None,
    reader=dict(
        type="PillarFeatureNet",
        num_filters=[64, 64],
        num_input_features=5,
        with_distance=False,
        voxel_size=(0.2, 0.2, 8),
        pc_range=(-51.2, -51.2, -5.0, 51.2, 51.2, 3.0),
    ),
    backbone=dict(type="PointPillarsScatter", ds_factor=1),
    neck=dict(
        type="RPN",
        layer_nums=[3, 5, 5],
        ds_layer_strides=[2, 2, 2],
        ds_num_filters=[64, 128, 256],
        us_layer_strides=[0.5, 1, 2],
        us_num_filters=[128, 128, 128],
        num_input_features=64,
        logger=logging.getLogger("RPN"),
    ),
    bbox_head=dict(
        # type='RPNHead',
        type="CenterHead",
        in_channels=sum([128, 128, 128]),
        tasks=tasks,
        dataset='nuscenes',
        weight=0.25,
        code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0],
        common_heads={'reg': (2, 2), 'height': (1, 2), 'dim':(3, 2), 'rot':(2, 2), 'vel': (2, 2)}, # (output_channel, num_conv)
    ),
)

The model is built in the order given by the config. The modules are connected as PillarFeatureNet -> PointPillarsScatter -> RPN -> CenterHead.
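The "type" strings in the config suggest a registry pattern: a lookup table from name to class, with the remaining keys passed as constructor arguments. Below is a toy sketch of that idea. `REGISTRY`, `register`, `build` and `Stage` are all made-up names for illustration, and the stage classes are empty stand-ins rather than the real modules.

```python
# Toy registry: maps a config "type" string to a class. All names are illustrative.
REGISTRY = {}

def register(cls):
    REGISTRY[cls.__name__] = cls
    return cls

def build(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    kwargs = dict(cfg)               # copy so the caller's config is untouched
    cls = REGISTRY[kwargs.pop("type")]
    return cls(**kwargs)

class Stage:
    """Stand-in module that only records that it was called."""
    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def __call__(self, trace):
        return trace + [type(self).__name__]

@register
class PillarFeatureNet(Stage):
    pass

@register
class PointPillarsScatter(Stage):
    pass

@register
class RPN(Stage):
    pass

@register
class CenterHead(Stage):
    pass

model_cfg = dict(
    reader=dict(type="PillarFeatureNet", num_filters=[64, 64]),
    backbone=dict(type="PointPillarsScatter", ds_factor=1),
    neck=dict(type="RPN", layer_nums=[3, 5, 5]),
    bbox_head=dict(type="CenterHead", weight=0.25),
)

# Build in config order, then run the stages in that same order.
stages = [build(model_cfg[k]) for k in ("reader", "backbone", "neck", "bbox_head")]
trace = []
for stage in stages:
    trace = stage(trace)
```

After the loop, `trace` lists the stage names in the order the data passed through them.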

PillarFeatureNet

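PillarFeatureNet produces the per-pillar features from which the pseudo-2D image is built.

For this network, the input is a 9 x PillarNum x N tensor, where N is the number of points kept per pillar. The 9 channels are (x, y, z, intensity, x_c, y_c, z_c, x_p, y_p):
x, y, z: the coordinates of the point
intensity: the reflection intensity of the point; this channel is not always present
x_c, y_c, z_c: offset of the point from the mean of the points in the same pillar
x_p, y_p: offset of the point from the center of the grid cell its pillar occupies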
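The feature decoration can be sketched as below, assuming zero-padded pillars and a (column, row) grid index per pillar. The function name `decorate` and its argument layout are my own. Note also that the config above sets num_input_features=5, so the repository's raw points likely carry one more channel than the four used here; the sketch sticks to the 9-channel layout just described.

```python
import numpy as np

def decorate(points, num_points, coords_xy, voxel_size, pc_min):
    """Build the 9-channel pillar input.

    points:     (P, N, 4) zero-padded x, y, z, intensity per pillar
    num_points: (P,) number of real points in each pillar
    coords_xy:  (P, 2) integer (column, row) grid index of each pillar
    returns:    (P, N, 9)
    """
    P, N, _ = points.shape
    # x_c, y_c, z_c: offset from the mean of the real points in the pillar.
    mean = points[:, :, :3].sum(axis=1, keepdims=True) / num_points.reshape(P, 1, 1)
    f_cluster = points[:, :, :3] - mean
    # x_p, y_p: offset from the geometric center of the pillar's grid cell.
    cell_center = (coords_xy + 0.5) * np.asarray(voxel_size[:2]) + np.asarray(pc_min[:2])
    f_center = points[:, :, :2] - cell_center[:, None, :]
    feats = np.concatenate([points, f_cluster, f_center], axis=-1)
    # Zero out the padded slots so they cannot influence the later max.
    mask = np.arange(N)[None, :] < num_points[:, None]
    return feats * mask[:, :, None]

pts = np.zeros((1, 3, 4), dtype=np.float32)
pts[0, 0] = [0.25, 0.05, 1.0, 0.5]
pts[0, 1] = [0.35, 0.15, 3.0, 0.7]       # third slot stays as padding
out = decorate(pts, np.array([2]), np.array([[1, 0]]), (0.2, 0.2, 8), (0.0, 0.0, -5.0))
```

The layout here is (P, N, C) rather than the (C, P, N) order used in the text; only the axis order differs.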

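After the Pillar Feature Net (PFN) layers, each pillar is reduced to a C-dimensional vector.
The official PFN layer has the following structure:

Non-last layer:

(in_channel, P, N) -> (out_channel / 2, P, N)
                        |   \-> (out_channel / 2, P, MaxOn(N))
                        |               |
                        \ concat on ch  /

Last layer: directly outputs the maximum over the N dimension

An intermediate layer concatenates its per-point features with their maximum over the N dimension, so the two out_channel / 2 halves add up to out_channel.
The last layer directly outputs the maximum of the (C, PillarNum, N) tensor along the N dimension, which finally gives a (C, PillarNum) tensor.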
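The two kinds of layer can be sketched in a few lines. This is a simplified NumPy version under stated assumptions: the normalization step is left out, the weights are random, and the layout is (P, N, C) instead of (C, P, N). `pfn_layer` is a name chosen for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def pfn_layer(x, weight, last_layer):
    """x: (P, N, C_in), weight: (C_in, C_lin) -> per-point or per-pillar features."""
    y = np.maximum(x @ weight, 0.0)           # linear + ReLU (normalization omitted)
    y_max = y.max(axis=1, keepdims=True)      # (P, 1, C_lin): max over the N points
    if last_layer:
        return y_max[:, 0, :]                 # (P, C_lin)
    # Broadcast the pillar-wise max back to every point and concatenate on channels.
    return np.concatenate([y, np.broadcast_to(y_max, y.shape)], axis=-1)

P, N, C_in = 4, 20, 9
x = rng.normal(size=(P, N, C_in))
w1 = rng.normal(size=(C_in, 32))     # non-last layer: out_channel / 2 = 32
w2 = rng.normal(size=(64, 64))       # last layer: full out_channel = 64
h = pfn_layer(x, w1, last_layer=False)             # shape (4, 20, 64)
pillar_feats = pfn_layer(h, w2, last_layer=True)   # shape (4, 64)
```

With num_filters=[64, 64] as in the config, the first layer yields 32 + 32 channels per point and the second collapses each pillar to a single 64-dimensional vector.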

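PointPillarsScatter

This module maps the pillars back onto an H x W image. What it actually performs is a coordinate transformation: each pillar's feature vector is placed at that pillar's grid coordinate on the H x W canvas.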
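The scatter step is essentially one indexed assignment. Below is a minimal sketch; the function name `scatter` and the (column, row) coordinate convention are assumptions, while the grid size is derived from pc_range and voxel_size in the config above.

```python
import numpy as np

def scatter(pillar_feats, coords_xy, grid_w, grid_h):
    """pillar_feats: (P, C), coords_xy: (P, 2) as (column, row) -> (C, H, W)."""
    P, C = pillar_feats.shape
    canvas = np.zeros((C, grid_h * grid_w), dtype=pillar_feats.dtype)
    flat_idx = coords_xy[:, 1] * grid_w + coords_xy[:, 0]   # row-major cell index
    canvas[:, flat_idx] = pillar_feats.T                    # empty cells stay zero
    return canvas.reshape(C, grid_h, grid_w)

# Grid size implied by pc_range and voxel_size in the config above.
pc_range = (-51.2, -51.2, -5.0, 51.2, 51.2, 3.0)
voxel_size = (0.2, 0.2, 8)
grid_w = round((pc_range[3] - pc_range[0]) / voxel_size[0])   # 512
grid_h = round((pc_range[4] - pc_range[1]) / voxel_size[1])   # 512

feats = np.arange(6, dtype=np.float32).reshape(3, 2)   # 3 pillars, C = 2
coords = np.array([[0, 0], [511, 0], [10, 20]])
bev = scatter(feats, coords, grid_w, grid_h)
```

Most cells of the canvas stay zero, since only non-empty pillars are written.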

RPN

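Once the data has been converted into an image, this stage is the conventional downsample -> upsample -> concatenate pipeline, which completes the feature extraction.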
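To see why the three branches can be concatenated, it helps to trace the spatial sizes using the strides from the config. The sketch below assumes that an upsample stride below 1 means "downsample by 1 / stride", which is how I read us_layer_strides=[0.5, 1, 2]; the helper name `rpn_output_shapes` is made up.

```python
def rpn_output_shapes(in_hw, ds_layer_strides, us_layer_strides, us_num_filters):
    """Return the (channels, H, W) of each branch just before concatenation."""
    h, w = in_hw
    shapes = []
    for ds, us, ch in zip(ds_layer_strides, us_layer_strides, us_num_filters):
        h, w = h // ds, w // ds                        # downsampling block
        if us >= 1:
            out_h, out_w = int(h * us), int(w * us)    # upsample by `us`
        else:
            step = round(1 / us)                       # us < 1: downsample again (assumed)
            out_h, out_w = h // step, w // step
        shapes.append((ch, out_h, out_w))
    return shapes

shapes = rpn_output_shapes((512, 512), [2, 2, 2], [0.5, 1, 2], [128, 128, 128])
concat_channels = sum(ch for ch, _, _ in shapes)
```

Under that assumption all three branches end up at the same resolution, and the concatenated channel count equals the `sum([128, 128, 128])` that the config passes to CenterHead as in_channels.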

CenterHead

The source code sets up a separate head for each task, which is then used for inference.

for num_cls in num_classes:
    heads = copy.deepcopy(common_heads)
    if not dcn_head:
        heads.update(dict(hm=(num_cls, num_hm_conv)))
        self.tasks.append(
            SepHead(share_conv_channel, heads, bn=True, init_bias=init_bias, final_kernel=3)
        )
    else:
        self.tasks.append(
            DCNSepHead(share_conv_channel, num_cls, heads, bn=True, init_bias=init_bias, final_kernel=3)
        )

The tasks themselves are defined in the config:

tasks = [
    dict(num_class=1, class_names=["car"]),
    dict(num_class=2, class_names=["truck", "construction_vehicle"]),
    dict(num_class=2, class_names=["bus", "trailer"]),
    dict(num_class=1, class_names=["barrier"]),
    dict(num_class=2, class_names=["motorcycle", "bicycle"]),
    dict(num_class=2, class_names=["pedestrian", "traffic_cone"]),
]
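Putting the task list and common_heads together shows what each per-task head has to predict. The sketch below mirrors the non-DCN branch of the loop without building any layers. `num_hm_conv = 2` is an assumed value, and deriving `num_classes` from the length of `class_names` is my reading of how the list is obtained.

```python
import copy

common_heads = {'reg': (2, 2), 'height': (1, 2), 'dim': (3, 2), 'rot': (2, 2), 'vel': (2, 2)}
tasks = [
    dict(num_class=1, class_names=["car"]),
    dict(num_class=2, class_names=["truck", "construction_vehicle"]),
    dict(num_class=2, class_names=["bus", "trailer"]),
    dict(num_class=1, class_names=["barrier"]),
    dict(num_class=2, class_names=["motorcycle", "bicycle"]),
    dict(num_class=2, class_names=["pedestrian", "traffic_cone"]),
]
num_hm_conv = 2   # assumed value for illustration

# One entry per task: its class count (assumed derivation).
num_classes = [len(t["class_names"]) for t in tasks]

task_heads = []
for num_cls in num_classes:
    heads = copy.deepcopy(common_heads)
    heads.update(dict(hm=(num_cls, num_hm_conv)))   # heatmap: one channel per class
    task_heads.append(heads)

# Output channels each task head produces at every BEV cell.
channels_per_task = [sum(out_ch for out_ch, _ in h.values()) for h in task_heads]
```

The five shared regression heads add up to 10 channels, which matches the 10 entries of code_weights in the config; each task then adds one heatmap channel per class on top.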
