[Notes] Using the Official Keras Models

Date: 2022-11-07 22:06:38

Official link: https://keras.io/zh/applications/#applications

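Table of contents

- Available models
    - Models for image classification with weights pre-trained on ImageNet
- Usage examples for image classification models
    - Classify ImageNet classes with ResNet50
    - Extract features with VGG16
    - Extract features from an arbitrary intermediate layer with VGG19
    - Fine-tune InceptionV3 on a new set of classes
    - Build InceptionV3 over a custom input tensor
- Model overview
    - Xception
    - VGG16
    - VGG19
    - ResNet
    - InceptionV3
    - InceptionResNetV2
    - MobileNet
    - DenseNet
    - NASNet
    - MobileNetV2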

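The Keras applications module (keras.applications) provides deep learning models together with pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.

Weights are downloaded automatically when a pre-trained model is instantiated and are stored in ~/.keras/models/. (On Windows they are stored in C:\Users\xxx\.keras\models\.)

The automatic download can be very slow. As an alternative, download the weight file yourself and place it in that directory.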
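To make the manual-download workflow concrete, the sketch below checks whether a weight file is already present in the cache directory before a model is instantiated. The helper names `keras_models_dir` and `has_cached_weights` and the file name are hypothetical, and the sketch assumes the default cache location described above.

```python
from pathlib import Path


def keras_models_dir(home=None):
    """Return the default weight cache directory, <home>/.keras/models."""
    base = Path(home) if home is not None else Path.home()
    return base / ".keras" / "models"


def has_cached_weights(filename, home=None):
    """Report whether a weight file with this name is already in the cache."""
    return (keras_models_dir(home) / filename).is_file()


if __name__ == "__main__":
    # Hypothetical file name, used only for illustration.
    name = "example_weights.h5"
    print(keras_models_dir())
    print(has_cached_weights(name))
```

If the check returns False, copy the manually downloaded file into the printed directory, keeping the file name that Keras expects for that model.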

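Available models

Models for image classification with weights pre-trained on ImageNet:

- Xception
- VGG16
- VGG19
- ResNet, ResNetV2, ResNeXt
- InceptionV3
- InceptionResNetV2
- MobileNet
- MobileNetV2
- DenseNet
- NASNet

All of these architectures are compatible with all the backends (TensorFlow, Theano, and CNTK). On instantiation, each model is built according to the image data format set in the Keras configuration file ~/.keras/keras.json. For example, if you have set image_data_format=channels_last, any loaded model is built with the TensorFlow dimension ordering, "Height-Width-Depth".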
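The effect of image_data_format on tensor shapes can be sketched in plain Python. The function names below are hypothetical, and falling back to 'channels_last' when the file or the key is missing is an assumption of this sketch rather than a statement about how Keras itself reads the file.

```python
import json
from pathlib import Path


def read_image_data_format(config_path):
    """Read image_data_format from a keras.json-style file.

    Falls back to 'channels_last' if the file or key is missing
    (an assumption of this sketch).
    """
    path = Path(config_path)
    if not path.is_file():
        return "channels_last"
    with path.open() as f:
        config = json.load(f)
    return config.get("image_data_format", "channels_last")


def image_shape(height, width, channels, data_format):
    """Order the dimensions of one image according to the data format."""
    if data_format == "channels_last":
        return (height, width, channels)
    if data_format == "channels_first":
        return (channels, height, width)
    raise ValueError("unknown data format: %r" % (data_format,))
```

For a 224x224 RGB image, `image_shape(224, 224, 3, "channels_last")` gives `(224, 224, 3)` and `image_shape(224, 224, 3, "channels_first")` gives `(3, 224, 224)`.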

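Note:

- For Keras < 2.2.0, the Xception model is only available with TensorFlow, because it relies on SeparableConvolution layers.
- For Keras < 2.1.5, the MobileNet model is only available with TensorFlow, because it relies on DepthwiseConvolution layers.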

Usage examples for image classification models

Classify ImageNet classes with ResNet50

    from keras.applications.resnet50 import ResNet50
    from keras.preprocessing import image
    from keras.applications.resnet50 import preprocess_input, decode_predictions
    import numpy as np

    model = ResNet50(weights='imagenet')

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    preds = model.predict(x)
    # decode the results into a list of tuples (class, description, probability)
    # (one such list for each sample in the batch)
    print('Predicted:', decode_predictions(preds, top=3)[0])
    # Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]
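Conceptually, decoding with top=3 keeps the three highest-scoring classes for each sample. The sketch below shows that ranking step in plain Python; the function name `top_k`, the labels, and the scores are made up for illustration, and this is not the implementation of decode_predictions.

```python
def top_k(probabilities, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical scores for a four-class problem.
scores = [0.05, 0.70, 0.20, 0.05]
labels = ["cat", "elephant", "tusker", "dog"]
print(top_k(scores, labels, k=2))
# [('elephant', 0.7), ('tusker', 0.2)]
```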

Extract features with VGG16

    from keras.applications.vgg16 import VGG16
    from keras.preprocessing import image
    from keras.applications.vgg16 import preprocess_input
    import numpy as np

    model = VGG16(weights='imagenet', include_top=False)

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    features = model.predict(x)

Extract features from an arbitrary intermediate layer with VGG19

    from keras.applications.vgg19 import VGG19
    from keras.preprocessing import image
    from keras.applications.vgg19 import preprocess_input
    from keras.models import Model
    import numpy as np

    base_model = VGG19(weights='imagenet')
    model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    block4_pool_features = model.predict(x)
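It can help to know what shape to expect from the two VGG feature-extraction examples above. The sketch below assumes the standard VGG layout, in which every block ends with a 2x2 max-pooling stage that halves height and width, and blocks 4 and 5 have 512 channels. The function name is hypothetical and the result is a rough estimate, not a replacement for inspecting the model's output shape.

```python
def vgg_pool_output_shape(height, width, num_pools, channels=512):
    """Shape (channels_last, batch of 1) after `num_pools` 2x2 max-pooling stages.

    Each stage halves height and width using floor division.
    """
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return (1, height, width, channels)


# Final convolutional output (after all 5 blocks) for a 224x224 input.
print(vgg_pool_output_shape(224, 224, num_pools=5))  # (1, 7, 7, 512)
# Output of 'block4_pool' (after 4 blocks) for a 224x224 input.
print(vgg_pool_output_shape(224, 224, num_pools=4))  # (1, 14, 14, 512)
```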

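Fine-tune InceptionV3 on a new set of classes

    from keras.applications.inception_v3 import InceptionV3
    from keras.preprocessing import image
    from keras.models import Model
    from keras.layers import Dense, GlobalAveragePooling2D
    from keras import backend as K

    # create the base pre-trained model, without the classifier
    base_model = InceptionV3(weights='imagenet', include_top=False)

    # add a global average pooling layer
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    # add a fully-connected layer
    x = Dense(1024, activation='relu')(x)
    # add a classifier; let's say we have 200 classes
    predictions = Dense(200, activation='softmax')(x)

    # this is the full model we will train
    model = Model(inputs=base_model.input, outputs=predictions)

    # first, train only the top layers (which were randomly initialized),
    # i.e. freeze all convolutional InceptionV3 layers
    for layer in base_model.layers:
        layer.trainable = False

    # compile the model (this must be done *after* freezing the layers)
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

    # train the model on the new data for a few epochs
    model.fit_generator(...)

    # at this point the top layers are well trained and we can start
    # fine-tuning the convolutional layers of Inception V3.
    # We will freeze the bottom layers and train the remaining top layers.

    # print layer names and indices to decide how many layers to freeze:
    for i, layer in enumerate(base_model.layers):
        print(i, layer.name)

    # we chose to train the top two Inception blocks,
    # i.e. freeze the first 249 layers and unfreeze the rest:
    for layer in model.layers[:249]:
        layer.trainable = False
    for layer in model.layers[249:]:
        layer.trainable = True

    # we need to recompile the model for these modifications to take effect;
    # we use SGD with a low learning rate for fine-tuning
    from keras.optimizers import SGD
    model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

    # we train the model again, this time fine-tuning the top two
    # Inception blocks together with the two Dense layers
    model.fit_generator(...)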
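The freeze/unfreeze step in the second phase can be wrapped in a small helper. The sketch below uses stand-in objects instead of real Keras layers so that it runs without Keras; it relies only on each layer having `name` and `trainable` attributes. The helper name `freeze_up_to` is hypothetical.

```python
from types import SimpleNamespace


def freeze_up_to(layers, boundary):
    """Freeze layers[:boundary], unfreeze layers[boundary:].

    Returns the names of the layers left trainable.
    """
    for index, layer in enumerate(layers):
        layer.trainable = index >= boundary
    return [layer.name for layer in layers if layer.trainable]


# Stand-ins for model.layers; real Keras layers expose the same attributes.
layers = [SimpleNamespace(name="layer_%d" % i, trainable=True) for i in range(5)]
print(freeze_up_to(layers, 3))
# ['layer_3', 'layer_4']
```

With a real model, the equivalent call would be `freeze_up_to(model.layers, 249)`, followed by recompiling the model so that the change takes effect.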

Build InceptionV3 over a custom input tensor

    from keras.applications.inception_v3 import InceptionV3
    from keras.layers import Input

    # this could also be the output of a different Keras model or layer
    input_tensor = Input(shape=(224, 224, 3))  # assumes K.image_data_format() == 'channels_last'

    model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)

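Model overview

The top-1 and top-5 accuracy figures are results on the ImageNet validation set.

Depth refers to the topological depth of the network. This includes activation layers, batch normalization layers, and so on.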
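For readers unfamiliar with the two metrics: a prediction counts as a top-k hit when the true class is among the k highest-scoring classes. The sketch below computes both on a tiny made-up example; the function name and the data are illustrative only.

```python
def top_k_accuracy(score_rows, true_labels, k):
    """Fraction of samples whose true class index is among the k highest scores."""
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


scores = [
    [0.1, 0.6, 0.3],  # true class 1: top-1 hit
    [0.5, 0.2, 0.3],  # true class 2: top-1 miss, top-2 hit
    [0.7, 0.2, 0.1],  # true class 1: top-1 miss, top-2 hit
    [0.2, 0.3, 0.5],  # true class 0: miss for both
]
truth = [1, 2, 1, 0]
print(top_k_accuracy(scores, truth, k=1))  # 0.25
print(top_k_accuracy(scores, truth, k=2))  # 0.75
```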

Xception

keras.applications.xception.Xception(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

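Xception V1 model, with weights pre-trained on ImageNet.

On ImageNet, this model reaches a top-1 validation accuracy of 0.790 and a top-5 validation accuracy of 0.945.

Note that this model only supports the channels_last data format (height, width, channels).

The default input size for this model is 299x299.

Parameters

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- input_shape: optional shape tuple, only valid when include_top=False (otherwise the input shape must be (299, 299, 3), because the pre-trained model was trained at that size). It must have exactly 3 input channels, and width and height must be no smaller than 71. For example, (150, 150, 3) is a valid value.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.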
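The input_shape rules listed above can be read as a small validation routine. The sketch below is a simplified interpretation for the channels_last case only; `check_input_shape` is a hypothetical name, not Keras's own validation helper, and the real library may treat some edge cases differently.

```python
def check_input_shape(input_shape, default_size, min_size, include_top):
    """Apply the documented input_shape rules (channels_last only).

    Returns the shape the model would be built with, or raises ValueError.
    """
    default_shape = (default_size, default_size, 3)
    if input_shape is None:
        return default_shape
    input_shape = tuple(input_shape)
    if include_top:
        # With the classifier attached, only the training resolution is valid.
        if input_shape != default_shape:
            raise ValueError("with include_top=True, input_shape must be %r"
                             % (default_shape,))
        return default_shape
    height, width, channels = input_shape
    if channels != 3:
        raise ValueError("input_shape must have exactly 3 channels")
    if height < min_size or width < min_size:
        raise ValueError("height and width must be at least %d" % min_size)
    return input_shape


# Xception: default size 299, minimum size 71.
print(check_input_shape((150, 150, 3), 299, 71, include_top=False))  # (150, 150, 3)
```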

Returns

A Keras Model instance.

VGG16

keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

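VGG16 model, with weights pre-trained on ImageNet.

This model can be built with either the channels_first data format (channels, height, width) or the channels_last data format (height, width, channels).

The default input size for this model is 224x224.

Parameters

- include_top: whether to include the fully-connected layers at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (224, 224, 3) (with the channels_last data format) or (3, 224, 224) (with the channels_first data format). It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (200, 200, 3) is a valid value.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.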
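To illustrate what the pooling option does to the output rank, the sketch below applies global average and global max pooling to a tiny 4D nested list with layout (batch, height, width, channels) and returns a 2D result (batch, channels). It is a plain-Python illustration of the operation, not Keras code, and the function name is hypothetical.

```python
def global_pool(feature_maps, mode):
    """Pool a 4D nested list (batch, height, width, channels) to 2D (batch, channels)."""
    pooled = []
    for sample in feature_maps:
        # Flatten the height x width grid into a list of per-pixel channel vectors.
        pixels = [pixel for row in sample for pixel in row]
        # Transpose so that each tuple holds all values of one channel.
        columns = list(zip(*pixels))
        if mode == "avg":
            pooled.append([sum(values) / len(values) for values in columns])
        elif mode == "max":
            pooled.append([max(values) for values in columns])
        else:
            raise ValueError("mode must be 'avg' or 'max'")
    return pooled


# One sample, 2x2 spatial grid, 2 channels.
features = [[[[1.0, 10.0], [2.0, 20.0]],
             [[3.0, 30.0], [6.0, 40.0]]]]
print(global_pool(features, "avg"))  # [[3.0, 25.0]]
print(global_pool(features, "max"))  # [[6.0, 40.0]]
```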

Returns

A Keras Model instance.

VGG19

keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

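VGG19 model, with weights pre-trained on ImageNet.

This model can be built with either the channels_first data format (channels, height, width) or the channels_last data format (height, width, channels).

The default input size for this model is 224x224.

Parameters

- include_top: whether to include the fully-connected layers at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (224, 224, 3) (with the channels_last data format) or (3, 224, 224) (with the channels_first data format). It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (200, 200, 3) is a valid value.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.

Returns

A Keras Model instance.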

ResNet

keras.applications.resnet.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet.ResNet101(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet.ResNet152(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet_v2.ResNet50V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet_v2.ResNet101V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet_v2.ResNet152V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnext.ResNeXt50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnext.ResNeXt101(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

ResNet, ResNetV2, and ResNeXt models, with weights pre-trained on ImageNet.

These models can be built with either the channels_first data format (channels, height, width) or the channels_last data format (height, width, channels).

The default input size for these models is 224x224.

Parameters

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (224, 224, 3) (with the channels_last data format) or (3, 224, 224) (with the channels_first data format). It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (200, 200, 3) is a valid value.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.

Returns

A Keras Model instance.

InceptionV3

keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

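Inception V3 model, with weights pre-trained on ImageNet.

This model can be built with either the channels_first data format (channels, height, width) or the channels_last data format (height, width, channels).

The default input size for this model is 299x299.

Parameters

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (299, 299, 3) (with the channels_last data format) or (3, 299, 299) (with the channels_first data format). It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (150, 150, 3) is a valid value.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.

Returns

A Keras Model instance.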

InceptionResNetV2

keras.applications.inception_resnet_v2.InceptionResNetV2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

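Inception-ResNet V2 model, with weights pre-trained on ImageNet.

This model can be built with either the channels_first data format (channels, height, width) or the channels_last data format (height, width, channels).

The default input size for this model is 299x299.

Parameters

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (299, 299, 3) (with the channels_last data format) or (3, 299, 299) (with the channels_first data format). It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (150, 150, 3) is a valid value.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.

Returns

A Keras Model instance.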

MobileNet

keras.applications.mobilenet.MobileNet(input_shape=None, alpha=1.0, depth_multiplier=1, dropout=1e-3, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)

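MobileNet model, with weights pre-trained on ImageNet.

Note that this model currently only supports the channels_last data format (height, width, channels).

The default input size for this model is 224x224.

Parameters

- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (224, 224, 3) (with the channels_last data format) or (3, 224, 224) (with the channels_first data format). It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (200, 200, 3) is a valid value.
- alpha: controls the width of the network. If alpha < 1.0, the number of filters in each layer is decreased proportionally. If alpha > 1.0, the number of filters in each layer is increased proportionally. If alpha = 1, the default number of filters from the paper is used in each layer.
- depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier).
- dropout: dropout rate.
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.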
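The effect of alpha can be pictured as scaling each layer's filter count by the same factor. The sketch below uses made-up base filter counts, and truncating the scaled value to an integer is an assumption of this sketch; the library's exact rounding may differ.

```python
def scale_filters(base_filters, alpha):
    """Scale each layer's filter count by the width multiplier alpha."""
    return [int(filters * alpha) for filters in base_filters]


# Hypothetical per-layer filter counts, for illustration only.
base = [32, 64, 128, 256]
print(scale_filters(base, 0.5))  # [16, 32, 64, 128]
print(scale_filters(base, 1.0))  # [32, 64, 128, 256]
```

A smaller alpha gives a narrower network with fewer parameters; a larger alpha gives a wider one.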

Returns

A Keras Model instance.

DenseNet

keras.applications.densenet.DenseNet121(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.densenet.DenseNet169(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.densenet.DenseNet201(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

DenseNet models, with weights pre-trained on ImageNet.

Note that these models currently only support the channels_last data format (height, width, channels).

The default input size for these models is 224x224.

Parameters

- blocks: numbers of building blocks for the four dense layers.
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (224, 224, 3) (with the channels_last data format) or (3, 224, 224) (with the channels_first data format). It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (200, 200, 3) is a valid value.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.

Returns

A Keras Model instance.

NASNet

keras.applications.nasnet.NASNetLarge(input_shape=None, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)
keras.applications.nasnet.NASNetMobile(input_shape=None, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)

Neural Architecture Search Network (NASNet) models, with weights pre-trained on ImageNet.

The default input size is 331x331 for the NASNetLarge model and 224x224 for the NASNetMobile model.
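Because the default input size differs between models, it is easy to pass the wrong target_size when loading images. One possible way to avoid this is to collect the sizes quoted in this overview into a lookup table. The table and helper names below are hypothetical conveniences, not part of Keras.

```python
# Default input sizes as quoted in this overview (height == width).
DEFAULT_INPUT_SIZE = {
    "Xception": 299,
    "VGG16": 224,
    "VGG19": 224,
    "ResNet50": 224,
    "InceptionV3": 299,
    "InceptionResNetV2": 299,
    "MobileNet": 224,
    "DenseNet121": 224,
    "NASNetLarge": 331,
    "NASNetMobile": 224,
    "MobileNetV2": 224,
}


def target_size_for(model_name):
    """Return the (height, width) tuple to use when resizing images for a model."""
    size = DEFAULT_INPUT_SIZE[model_name]
    return (size, size)


print(target_size_for("NASNetLarge"))  # (331, 331)
```

The returned tuple can be passed as the target_size argument of image.load_img, as in the usage examples earlier in this document.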

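Parameters

- input_shape: optional shape tuple, only valid when include_top=False; otherwise the input shape must be (224, 224, 3) (with the channels_last data format) or (3, 224, 224) (with the channels_first data format) for NASNetMobile, and (331, 331, 3) (channels_last) or (3, 331, 331) (channels_first) for NASNetLarge. It must have exactly 3 input channels, and width and height must be no smaller than 32. For example, (200, 200, 3) is a valid value.
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.

Returns

A Keras Model instance.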

MobileNetV2

keras.applications.mobilenet_v2.MobileNetV2(input_shape=None, alpha=1.0, depth_multiplier=1, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)

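MobileNetV2 model, with weights pre-trained on ImageNet.

Note that this model only supports the 'channels_last' data format (height, width, channels).

The default input size for this model is 224x224.

Parameters

- input_shape: optional shape tuple, to be specified if you would like to use a model with an input image resolution other than (224, 224, 3). It must have exactly 3 input channels. You can omit this option if you would like input_shape to be inferred from input_tensor. If you provide both input_tensor and input_shape, input_shape is used when the two shapes match, and an error is raised when they do not. For example, (160, 160, 3) is a valid value.
- alpha: controls the width of the network. This is known as the width multiplier in the MobileNetV2 paper. If alpha < 1.0, the number of filters in each layer is decreased proportionally. If alpha > 1.0, the number of filters in each layer is increased proportionally. If alpha = 1, the default number of filters from the paper is used in each layer.
- depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier).
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor (i.e. the output of layers.Input()) to use as the input of the model.
- pooling: optional pooling mode for feature extraction when include_top is False. None means no pooling: the model outputs the 4D tensor produced by the last convolutional layer. 'avg' means global average pooling (GlobalAveragePooling2D) is applied after the last convolutional layer, so the output is a 2D tensor. 'max' means global max pooling is applied.
- classes: optional number of classes to classify images into, only valid when include_top is True and no pre-trained weights are loaded.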
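The interaction between input_shape and input_tensor described for MobileNetV2 can be summarized as a small decision procedure. The sketch below works on plain shape tuples; `resolve_input_shape` is a hypothetical name and the logic is a simplified reading of the description above, not the library's code.

```python
def resolve_input_shape(input_shape=None, tensor_shape=None, default=(224, 224, 3)):
    """Pick the effective input shape from input_shape and an input tensor's shape.

    - both given and equal: use that shape
    - both given and different: raise ValueError
    - only one given: use it
    - neither given: fall back to the default resolution
    """
    if input_shape is not None and tensor_shape is not None:
        if tuple(input_shape) != tuple(tensor_shape):
            raise ValueError("input_shape %r does not match the input tensor shape %r"
                             % (tuple(input_shape), tuple(tensor_shape)))
        return tuple(input_shape)
    if input_shape is not None:
        return tuple(input_shape)
    if tensor_shape is not None:
        return tuple(tensor_shape)
    return default


print(resolve_input_shape(input_shape=(160, 160, 3)))  # (160, 160, 3)
print(resolve_input_shape())                           # (224, 224, 3)
```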