Recognizing Hand Gestures in Images with a CNN

Date: 2022-08-13 13:23:43

(I originally published this on HackMD: HackMD link. The linked content is in Traditional Chinese.)

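1. Problem description

Use a CNN to recognize the following three different hand gestures:

Provided datasets:

Five datasets are provided. Set1, Set2, and Set3 are the training sets; Set4 and Set5 are the test sets.

Each dataset contains nine folders named 0000-0008. Folders 0000-0002, 0003-0005, and 0006-0008 hold gesture one, gesture two, and gesture three respectively. Each folder contains twenty 32x32-pixel grayscale images, numbered 0000-0019.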
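The folder layout implies a simple rule: the folder index, integer-divided by three, gives the gesture class. The sketch below illustrates that rule and the image counts that follow from the description. The helper name `gesture_label` is hypothetical and does not appear in the original code.

```python
def gesture_label(folder_name: str) -> int:
    """Map a folder name such as '0004' to a gesture label (0, 1, or 2)."""
    index = int(folder_name)
    if not 0 <= index <= 8:
        raise ValueError(f"unexpected folder name: {folder_name}")
    # Folders 0-2 -> gesture 0, 3-5 -> gesture 1, 6-8 -> gesture 2
    return index // 3


# Counts implied by the description: 9 folders x 20 images per set
images_per_set = 9 * 20
train_count = 3 * images_per_set   # Set1, Set2, Set3
test_count = 2 * images_per_set    # Set4, Set5
total_count = train_count + test_count
```

These counts match the 540/360 split and the total of 900 images used later when the arrays are reshaped.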

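2. Data preprocessing

① Define a function that returns the path and label of every image in the training and test sets

The first step is to load the data. We read the images from all five datasets in one pass, so the training and test images are collected together, and separate them afterwards.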

# Collect the path and label of every image in the training and test sets
import cv2
import numpy as np
import os

def enumerate_files(base_path='C:/Users/NOTEBOOK/DeepLearningNote/All_gray_1_32_32'):
    filenames, labels = [], []
    # os.listdir() returns entries in arbitrary order; sorted() keeps the
    # traversal deterministic, which the later index-based train/test split needs
    for file1 in sorted(os.listdir(base_path)):  # next level below the root, containing "Set1" ...
        dir1 = base_path + '/' + file1
        for file2 in sorted(os.listdir(dir1)):  # 0000-0008 inside Set1 to Set5
            dir2 = dir1 + '/' + file2
            for file3 in sorted(os.listdir(dir2)):  # all of 0000-0019
                dir3 = dir2 + '/' + file3
                for file4 in sorted(os.listdir(dir3)):  # all images
                    filenames += [dir3 + '/' + file4]
                    # Map the three gestures to 0, 1, 2
                    if file2 in ['0000', '0001', '0002']:
                        labels += [0]
                    elif file2 in ['0003', '0004', '0005']:
                        labels += [1]
                    elif file2 in ['0006', '0007', '0008']:
                        labels += [2]
    return filenames, labels

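② Define a function that converts the images into arrays the neural network can process

Note that with this way of converting images, img_to_array returns a three-channel color result. Because our data is grayscale, all three channels hold the same values, so I keep only the channel at index 0 of the RGB channels.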

# Convert the image paths returned by enumerate_files() into arrays
from keras.preprocessing.image import img_to_array, load_img

def read_images(images):
    imgs = []
    for image in images:
        img = load_img(image)
        img = img_to_array(img)
        # img_to_array returns 3 channels (a color image). The three channels
        # hold identical values here, so keep one of them and scale it to [0, 1]
        img = img[:, :, 0] / 255.0
        imgs.append(img)
    return imgs
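To see why keeping a single channel loses nothing, here is a small NumPy sketch on synthetic data. It assumes a grayscale image that has been replicated across three channels, which is what the text above describes. The 4x4 array is a made-up stand-in for a real 32x32 image.

```python
import numpy as np

# Synthetic 4x4 grayscale image with values from 0 to 255
gray = np.arange(16, dtype=np.float32).reshape(4, 4) * 17.0

# Replicate it across three channels, as a grayscale file loaded in RGB mode would be
rgb_like = np.stack([gray, gray, gray], axis=-1)

# All channels are identical, so any one of them carries the full image
channels_identical = (
    np.array_equal(rgb_like[:, :, 0], rgb_like[:, :, 1])
    and np.array_equal(rgb_like[:, :, 1], rgb_like[:, :, 2])
)

# Keep channel 0 and scale pixel values into [0, 1]
single = rgb_like[:, :, 0] / 255.0
```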

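③ Use the two functions above to get images and labels, then split them into train_data and test_data

In the original data, Set1, Set2, and Set3 are the training sets and Set4 and Set5 are the test sets. Because the images are collected in that order, the first 540 images form the training set and everything after index 540 forms the test set.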

# Use enumerate_files() and read_images() to get the image arrays and the label of each image
from keras.utils import to_categorical

filenames, labels = enumerate_files()
images = read_images(filenames)
images = np.array(images)  # at this point images.shape == (900, 32, 32)
images = images.reshape((900, 32, 32, 1))  # add the channel axis expected by Conv2D
labels = to_categorical(labels)  # integer labels -> one-hot rows
train_images = images[:540]
train_labels = labels[:540]
test_images = images[540:]
test_labels = labels[540:]
print(test_labels.shape)
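The call to to_categorical turns each integer label into a one-hot row, which is the form that categorical_crossentropy expects. The sketch below reproduces that encoding with NumPy for a few sample labels. Using `np.eye` is only an illustrative stand-in, not a claim about how Keras implements it.

```python
import numpy as np

sample_labels = np.array([0, 1, 2, 1])

# Row i of the 3x3 identity matrix is the one-hot vector for class i
one_hot = np.eye(3)[sample_labels]

# argmax along the class axis recovers the original integer labels
recovered = one_hot.argmax(axis=1)
```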

3. Building the model

We will train twice, once with data augmentation and once without, so we wrap model construction in a function.

# Function that builds the model
from keras import models, layers

def createModel():
    model = models.Sequential()
    model.add(layers.Conv2D(32, (2, 2), activation='relu', input_shape=(32, 32, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (2, 2), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(128, (2, 2), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(128, (2, 2), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    # model.add(layers.Dropout(0.3))
    model.add(layers.Dense(512, activation='relu'))
    model.add(layers.Dense(3, activation='softmax'))  # one output per gesture
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
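It can help to trace how the feature map shrinks through the four Conv2D/MaxPooling2D pairs. The sketch below does that arithmetic in plain Python. It assumes the Keras defaults used above: 'valid' padding with stride 1 for the convolutions, and a pooling stride equal to the pool size. The helper names are hypothetical.

```python
def conv_out(size: int, kernel: int) -> int:
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size: int, pool: int) -> int:
    """Output width of 'valid' max pooling whose stride equals the pool size."""
    return (size - pool) // pool + 1


size = 32
trace = []
for filters in (32, 64, 128, 128):
    size = conv_out(size, 2)   # 2x2 convolution
    size = pool_out(size, 2)   # 2x2 max pooling
    trace.append((size, size, filters))

# Number of values that Flatten() hands to the Dense(512) layer
flattened = trace[-1][0] * trace[-1][1] * trace[-1][2]
```

Under these assumptions the last pooling layer outputs a 1x1x128 map, so the Dense(512) layer receives a 128-element vector.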

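4. Training the model

① Shuffle the dataset

The original dataset follows a regular order (gesture 1 first, then gesture 2, then gesture 3), so we shuffle train_data before training the model.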

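# Shuffle before training the classifier
shuffle_index = np.arange(train_images.shape[0])
np.random.shuffle(shuffle_index)
# Index images and labels with the same permutation so the pairs stay aligned
train_images = train_images[shuffle_index]
train_labels = train_labels[shuffle_index]
print(shuffle_index)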
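The key point of this step is that images and labels are reordered with the same index array, so every image keeps its own label. The toy sketch below checks that property on made-up data. The fixed seed and the `toy_` names are assumptions added for reproducibility; the original code uses np.random.shuffle without a seed.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy data: "image" i has value 10 * i and label i
toy_images = np.arange(6) * 10
toy_labels = np.arange(6)

# One shared permutation reorders both arrays
order = rng.permutation(len(toy_images))
shuffled_images = toy_images[order]
shuffled_labels = toy_labels[order]

# Every image is still paired with its original label
still_aligned = np.array_equal(shuffled_images, shuffled_labels * 10)
```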

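② Split off a validation set

With the training set shuffled, we split off the validation set, validation_data. In this experiment the validation set holds 100 samples, which leaves 440 for training.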

# Split into a validation set and a training set
validation_images = train_images[:100]
validation_labels = train_labels[:100]
images = train_images[100:]
labels = train_labels[100:]

③ Train the model without data augmentation

model = createModel()
history = model.fit(images, labels,
                    epochs=40,
                    batch_size=32,
                    validation_data=(validation_images, validation_labels))

④ Train the model with data augmentation

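# Train with data augmentation
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=40,
                             width_shift_range=0.2,
                             height_shift_range=0.2,
                             shear_range=0.2,
                             zoom_range=0.2,
                             horizontal_flip=True,
                             fill_mode='nearest')
model1 = createModel()
# steps_per_epoch must be a whole number; base it on the samples actually passed
# to flow() (images, after the validation split), not on train_images.
# Recent Keras releases deprecate fit_generator; there, fit() accepts the generator.
history1 = model1.fit_generator(datagen.flow(images, labels, batch_size=32),
                                steps_per_epoch=int(np.ceil(len(images) / 32)),
                                epochs=40,
                                validation_data=(validation_images, validation_labels))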
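The number of steps per epoch follows directly from the split sizes stated earlier. The sketch below works through that arithmetic, assuming one epoch should pass over each remaining training sample once. The variable names are illustrative.

```python
import math

train_total = 540        # images from Set1, Set2, Set3
validation_size = 100    # samples held out for validation
batch_size = 32

samples_for_training = train_total - validation_size

# Round up so the final, smaller batch is still counted as a step
steps_per_epoch = math.ceil(samples_for_training / batch_size)

# Size of that final partial batch
last_batch_size = samples_for_training - (steps_per_epoch - 1) * batch_size
```

Dividing 540 by 32 instead gives 16.875, which is not a whole number of steps and counts the 100 validation samples that the generator never sees.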

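5. Plotting the training process

The blue plots show the training run without data augmentation.

The red plots show the training run with data augmentation.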

import matplotlib.pyplot as plt

# Curves for the run without data augmentation (blue)
acc = history.history["accuracy"]
loss = history.history['loss']
epochs = range(1, len(acc) + 1)
val_acc = history.history["val_accuracy"]
val_loss = history.history["val_loss"]

plt.plot(epochs, acc, 'bo', label="Training acc")
plt.plot(epochs, val_acc, 'b', label="Validation acc")
plt.legend()
plt.show()

plt.plot(epochs, loss, 'bo', label="Training loss")
plt.plot(epochs, val_loss, 'b', label="Validation loss")
plt.legend()
plt.show()

import matplotlib.pyplot as plt

# Curves for the run with data augmentation (red)
acc = history1.history["accuracy"]
loss = history1.history['loss']
epochs = range(1, len(acc) + 1)
val_acc = history1.history["val_accuracy"]
val_loss = history1.history["val_loss"]

plt.plot(epochs, acc, 'ro', label="Training acc")
plt.plot(epochs, val_acc, 'r', label="Validation acc")
plt.legend()
plt.show()

plt.plot(epochs, loss, 'ro', label="Training loss")
plt.plot(epochs, val_loss, 'r', label="Validation loss")
plt.legend()
plt.show()
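Besides reading the plots, the two runs can be compared numerically from the same history dictionaries. The sketch below uses a hypothetical helper, `summarize`, to pick out the best validation epoch. The numbers in `placeholder_history` are made-up placeholders for illustration only, not results from this experiment.

```python
def summarize(history_dict):
    """Return the best validation epoch (1-based) and related values."""
    val_acc = history_dict["val_accuracy"]
    best_index = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return {
        "best_epoch": best_index + 1,
        "best_val_accuracy": val_acc[best_index],
        "final_val_loss": history_dict["val_loss"][-1],
    }


# Placeholder values only; a real call would pass history.history or history1.history
placeholder_history = {
    "val_accuracy": [0.50, 0.75, 0.70],
    "val_loss": [1.0, 0.6, 0.7],
}
summary = summarize(placeholder_history)
```

Calling `summarize(history.history)` and `summarize(history1.history)` would give one summary for each training run.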