Deep Learning Notes: VGG | DaneSun's Blog

Deep Learning 4

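Network Architecture

VGG replaces large convolution kernels with a stack of several small ones (multiple 3×3 convolution layers), which reduces the number of parameters in the network. A stack of small kernels has the same receptive field as a single large kernel: with stride 1, two stacked 3×3 convolutions cover a 5×5 region and three cover a 7×7 region.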
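
The parameter saving can be checked with a little arithmetic. The sketch below is only an illustration: it assumes every layer has the same number of input and output channels (`C = 64` is an arbitrary choice, not a value from these notes) and ignores biases.

```python
def conv_weights(kernel_size: int, channels: int, num_layers: int = 1) -> int:
    """Weight count (biases ignored) for a stack of square convolutions
    whose input and output channel counts are both `channels`."""
    return num_layers * kernel_size * kernel_size * channels * channels


C = 64  # illustrative channel count (an assumption for this example)

two_3x3 = conv_weights(3, C, num_layers=2)    # receptive field 5x5
one_5x5 = conv_weights(5, C)                  # receptive field 5x5
three_3x3 = conv_weights(3, C, num_layers=3)  # receptive field 7x7
one_7x7 = conv_weights(7, C)                  # receptive field 7x7

print(two_3x3, one_5x5)    # 18*C*C vs 25*C*C
print(three_3x3, one_7x7)  # 27*C*C vs 49*C*C
```

Under these assumptions the stacked 3×3 layers need 18C² and 27C² weights, against 25C² and 49C² for the single 5×5 and 7×7 kernels.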

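Receptive Field

The receptive field is the size of the input region that determines one element of a given layer's output. Put simply, it is the size of the area in the input that a single unit of the output feature map corresponds to.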

Computing the Receptive Field
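
A commonly used recurrence walks backwards from the output: F(i) = (F(i+1) - 1) × stride + ksize, where F(i) is the receptive field on layer i and the starting value is 1 for a single output unit. The sketch below applies this recurrence; the function name `receptive_field` and the `(kernel_size, stride)` tuple layout are choices made for this example, not something defined in the notes.

```python
def receptive_field(layers):
    """Return the side length of the input region seen by one output unit.

    `layers` is a list of (kernel_size, stride) tuples ordered from the
    input towards the output. Padding does not change the result.
    """
    rf = 1  # start from a single unit of the last feature map
    for kernel_size, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel_size
    return rf


conv3x3 = (3, 1)  # 3x3 convolution, stride 1

print(receptive_field([conv3x3, conv3x3]))           # two stacked 3x3 convs
print(receptive_field([conv3x3, conv3x3, conv3x3]))  # three stacked 3x3 convs
```

The two calls return 5 and 7, matching the claim that two or three stacked 3×3 convolutions see the same region as one 5×5 or 7×7 kernel.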

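Building the Network

model

Building VGG differs from the earlier LeNet and AlexNet examples. VGG comes in several variants and the networks are deep, so writing out every layer by hand as before would be tedious. A more compact, configuration-driven approach is used instead.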

import torch
import torch.nn as nn

# Network configurations: a number is the output channel count of a 3x3 conv layer, 'M' is a max-pooling layer
cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}

# Build the feature-extraction layers from a configuration list
def make_features(cfg: list):
    layers = []
    in_channels = 3  # RGB input
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v  # the next conv layer takes this layer's output channels
    return nn.Sequential(*layers)

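First, define a dictionary that maps each variant name to its configuration list. In a list, a number x stands for a convolution layer with x output channels, and the string 'M' stands for a max-pooling layer. Then write a function that generates the feature-extraction layers. Because all convolution layers in VGG share the same remaining parameters (3×3 kernel, padding 1), and all pooling layers do too (2×2 kernel, stride 2), the function only has to walk the configuration list and append a pooling layer or the matching convolution layer in order.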
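
As a quick sanity check on the configuration lists, the sketch below counts the convolution and pooling entries for each variant without needing PyTorch. The helper name `summarize` is made up for this example. Adding the three linear layers of the classifier to the convolution count gives the number in each variant's name.

```python
cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}


def summarize(cfg):
    """Return (number of conv layers, number of max-pooling layers)."""
    pools = sum(1 for v in cfg if v == 'M')
    convs = len(cfg) - pools
    return convs, pools


for name, cfg in cfgs.items():
    convs, pools = summarize(cfg)
    # 3 is the number of nn.Linear layers in the classifier
    print(name, "conv:", convs, "pool:", pools, "weight layers:", convs + 3)
```

Every variant has five pooling layers, and the weight-layer totals come out as 11, 13, 16 and 19.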

class VGG(nn.Module):
    def __init__(self,model,class_num,init_weight = False):
        super(VGG, self).__init__()
        # Fall back to VGG16 when the requested variant is not defined in cfgs.
        if model not in cfgs:
            print("model {} is not in cfgs, will create a vgg16 model".format(model))
            model = "vgg16"
        self.features = make_features(cfgs[model])
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            # 512*7*7 is the flattened feature size for a 224x224 input (five 2x2 poolings: 224 / 32 = 7)
            nn.Linear(512*7*7,2048),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(2048,2048),
            nn.ReLU(True),
            nn.Linear(2048,class_num)
        )
        if init_weight:
            self._initialize_weights()

    def forward(self,x):
        x = self.features(x)
        x = torch.flatten(x,start_dim=1)
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                # nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)

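Training and Prediction

Training and prediction follow largely the same steps as in the earlier notes, so they are not repeated here.

These notes are compiled from the video series 【深度学习-图像分类篇章】 (Deep Learning: Image Classification) by the Bilibili uploader 霹雳吧啦Wz.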