Learning CNN Image Recognition the Hard Way (Part 1): Image Preprocessing

Date: 2021-01-06 12:09:35

Reading time: about 5 minutes

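In this article you will learn how to:

Increase the number of samples through data augmentation

Resize images so they are easier to feed to a network

Introduction

Before training an image recognition model, the sample images we have collected need to be preprocessed. Here that means two steps: data augmentation and resizing. Both prepare the data for training the network, and both must be done sensibly so that the results are still valid training samples.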

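Data augmentation

When we get a set of images for machine learning, the number of samples is often too small, so we augment the existing images. This increases the sample count and can also improve the model's ability to generalize.

Suppose we have a group of trademark images like the following:

Figure: trademark images

For a 100-class classification task, this class clearly does not have enough samples, so we use the Keras library to augment the data. Take the first trademark image as an example: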

from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img

pic_path = r'./3ac79f3df8dcd100755525327e8b4710b8122fdc.jpg'
augmentation_path = r'./data_augmentation'

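First we import the Keras utilities and set the path of the source image and the directory where the augmented images will be saved. Next we define an ImageDataGenerator and tell it which operations to use when producing new images.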

data_gen = ImageDataGenerator(
    rotation_range=30,        # rotate randomly by up to 30 degrees
    width_shift_range=0.1,    # shift horizontally by up to 10% of the width
    height_shift_range=0.1,   # shift vertically by up to 10% of the height
    zoom_range=0.2,           # zoom in or out by up to 20%
    fill_mode='nearest')      # fill newly exposed pixels with the nearest pixel value

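Based on what these images need, we chose rotation, shifting, zooming, and edge filling; for the other available operations, see the ImageDataGenerator documentation. Some settings have to match reality. With rotation, for example, the image should not end up completely upside down, because augmentation of that kind hurts training rather than helping it.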
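To make the ranges above concrete, here is a minimal sketch of how such settings are commonly interpreted: a fresh random value is drawn for each transformation every time an image is generated. This is plain Python for illustration, not Keras's actual implementation. The function name `sample_params`, its dictionary keys, and the 224-pixel default size are hypothetical, and the interpretation of each range (degrees for rotation, a fraction of the image size for shifts, a factor around 1.0 for zoom) is an assumption based on the documented meaning of the parameters.

```python
import random


def sample_params(rotation_range=30, width_shift_range=0.1,
                  height_shift_range=0.1, zoom_range=0.2,
                  width=224, height=224, rng=None):
    """Draw one random set of transformation parameters (illustrative only)."""
    rng = rng or random.Random()
    return {
        # Rotation angle in degrees, assumed uniform in [-range, +range].
        "angle": rng.uniform(-rotation_range, rotation_range),
        # Shifts in pixels, assumed to be a fraction of the image size.
        "dx": rng.uniform(-width_shift_range, width_shift_range) * width,
        "dy": rng.uniform(-height_shift_range, height_shift_range) * height,
        # Zoom factor, assumed uniform in [1 - range, 1 + range].
        "zoom": rng.uniform(1 - zoom_range, 1 + zoom_range),
    }


if __name__ == "__main__":
    params = sample_params(rng=random.Random(0))
    for name, value in params.items():
        print(f"{name}: {value:.2f}")
```

Seen this way, a rotation_range of 30 keeps every generated image within 30 degrees of upright, which is why a value such as 180 would be a poor choice for images that are never seen upside down.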

img = load_img(pic_path)
x = img_to_array(img)            # array of shape (height, width, channels)
x = x.reshape((1,) + x.shape)    # add a batch dimension: (1, height, width, channels)

n = 1
for batch in data_gen.flow(x, batch_size=1, save_to_dir=augmentation_path, save_prefix='train', save_format='jpeg'):
    n += 1
    if n > 6:  # stop once 6 new images have been generated
        break

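We load the image from its path and convert it to an array, then reshape it to add a leading batch dimension before passing it to the ImageDataGenerator. save_prefix sets the filename prefix of the new images, and save_format sets the format they are saved in. Note that each new image is produced by applying the operations we defined with randomly drawn parameters, and that the loop has to be stopped explicitly, here after 6 images.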
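Because the generator returned by flow keeps yielding batches until the loop is stopped, the manual counter in the code above can also be replaced with itertools.islice. The sketch below uses a stand-in generator so that it runs without Keras; `endless_batches` and `take` are hypothetical names. With the real generator, the same pattern would wrap the `data_gen.flow(...)` call.

```python
from itertools import count, islice


def endless_batches():
    """Stand-in for data_gen.flow(...): yields batches forever (hypothetical)."""
    for i in count():
        yield f"batch-{i}"


def take(generator, n):
    """Consume exactly n batches from a generator and return them as a list."""
    return list(islice(generator, n))


if __name__ == "__main__":
    batches = take(endless_batches(), 6)
    print(len(batches))  # 6
```

The number of images to generate then appears once, as an argument, instead of being spread across a counter, an increment, and a comparison.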

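In the end, 6 new trademark images are generated in the data_augmentation folder:

Figure: new trademark images

In practice it is worth experimenting with the different augmentation operations. Good sample expansion can improve the model's generalization and its accuracy. The complete data augmentation code is as follows: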

import os

from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img

pic_path = r'./3ac79f3df8dcd100755525327e8b4710b8122fdc.jpg'
augmentation_path = r'./data_augmentation'

# Make sure the output directory exists before images are saved into it
os.makedirs(augmentation_path, exist_ok=True)

data_gen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    fill_mode='nearest')

img = load_img(pic_path)
x = img_to_array(img)
x = x.reshape((1,) + x.shape)  # add a batch dimension

n = 1
for batch in data_gen.flow(x, batch_size=1, save_to_dir=augmentation_path, save_prefix='train', save_format='jpeg'):
    n += 1
    if n > 6:  # stop once 6 new images have been generated
        break

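Resizing images

Resizing all images to the same size makes the later training steps easier. As an example, we resize the new images generated in the data_augmentation folder: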

from PIL import Image
import os

img_path = r'./data_augmentation'
resize_path = r'./resize_image'

# Create the output folder once, before the loop
os.makedirs(resize_path, exist_ok=True)

for i in os.listdir(img_path):
    im = Image.open(os.path.join(img_path, i))
    out = im.resize((224, 224))  # resize to 224 x 224; the aspect ratio is not preserved
    out.save(os.path.join(resize_path, i))

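We use PIL to change the image size and os to list the files and build their paths; the resized images are saved in the resize_image folder. The target size is 224*224, chosen for the ResNet used later in this series. The resized images look like this:

Figure: resized images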
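Resizing directly to 224*224 stretches any image that is not square. If that distortion matters for your data, one common alternative (not what this article does) is to scale the image until it fits inside the square and pad the remaining area. The sketch below only computes the numbers involved; `fit_within` is a hypothetical helper, not part of PIL, and applying its result to a real image is left out.

```python
def fit_within(width, height, target=224):
    """Return the scaled size and the padding offsets that fit an image of
    (width, height) inside a target x target square while keeping its
    aspect ratio. Illustrative helper, not part of PIL."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target / max(width, height)
    # Round to whole pixels, never going below 1 pixel.
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Offsets that centre the scaled image on the square canvas.
    left = (target - new_w) // 2
    top = (target - new_h) // 2
    return (new_w, new_h), (left, top)


if __name__ == "__main__":
    print(fit_within(448, 224))  # ((224, 112), (0, 56))
```

For a 448*224 image, the result means scaling to 224*112 and placing it 56 pixels from the top of a 224*224 canvas.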

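Once this step is done, about half of the preparation for image recognition is complete. What remains is converting these images to the TFRecord format so that the training code can read them easily.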
