Implementing the K-means Algorithm in Python

Date: 2020-12-02 23:04:49

K-means algorithm steps:

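1. Randomly select k samples as the initial cluster centers.
2. Compute the distance from every sample in the dataset to each of the k cluster centers, and assign the sample to the nearest center.
3. For each cluster, recompute its center.
4. Go back to step 2 and repeat until a local optimum is reached.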
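
The steps above can be read as alternately minimizing the within-cluster sum of squared errors, which is the quantity getE() computes in the code. The notation here (x_i for a sample, \mu_j for a center, C_j for the set of samples assigned to center j) is introduced for illustration and does not appear in the original post.

```latex
% Objective: sum of squared errors (SSE) over all k clusters
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2

% Step 2 (assignment): each sample goes to its nearest center
c_i = \arg\min_{1 \le j \le k} \lVert x_i - \mu_j \rVert

% Step 3 (update): each center moves to the mean of its cluster
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

Neither step can increase J, so the loop stops at a local optimum. That optimum depends on the random initial centers and is not necessarily the global one.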

Python code:

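import numpy as np
import matplotlib.pyplot as plt

plt.ion()  # interactive mode; by default matplotlib blocks and draws nothing until plt.show() is called


def getDistance(point1, point2):
    # Euclidean distance between two 2-D points
    return ((point1[0] - point2[0]) ** 2 + (point1[1] - point2[1]) ** 2) ** 0.5


def cluster():
    # Assign every sample to its nearest center
    distance = np.zeros((N, k))
    for i in range(N):
        for j in range(k):
            distance[i, j] = getDistance(point[i], centers[j])
        center[i] = centers[np.argmin(distance[i])]


def getE():
    # Sum of squared errors between samples and their assigned centers
    sum_ = 0
    for i in range(k):
        for j in range(N):
            if np.all(center[j] == centers[i]):
                sum_ += getDistance(point[j], centers[i]) ** 2
    return sum_


def getNewCenters():
    # Move each center to the mean of the samples assigned to it
    for i in range(k):
        count = 0
        temp_x = 0
        temp_y = 0
        for j in range(N):
            if np.all(center[j] == centers[i]):
                count += 1
                temp_x += point[j, 0]
                temp_y += point[j, 1]
        if count > 0:  # an empty cluster keeps its old center (avoids division by zero)
            centers[i] = np.array([temp_x / count, temp_y / count])


def show():
    # Plot the samples colored by cluster, with the centers in black
    for i in range(k):
        for j in range(N):
            if np.all(center[j] == centers[i]):
                plt.scatter(point[j, 0], point[j, 1], c=cnames[i], s=10)
    plt.scatter(centers[:, 0], centers[:, 1], c='black', s=50)


k = 3    # number of cluster centers
N = 100  # number of samples in the dataset
cnames = ['red', 'yellow', 'blue', 'chocolate', 'darkcyan', 'darksalmon', 'green', 'pink', 'purple']
center = np.zeros((N, 2))      # center assigned to each sample
point = np.random.rand(N, 2)   # samples in the dataset
index = np.random.choice(N, k, replace=False)
centers = point[index]         # k randomly chosen samples as the initial centers

cluster()
show()
t1 = 0
t = getE()
while t != t1:  # stop once the sum of squared errors no longer changes
    t1 = t
    getNewCenters()
    cluster()
    t = getE()
    plt.pause(0.2)
    plt.clf()
    show()
plt.ioff()
plt.show()  # keep the final figure open after the loop ends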
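
As a hedged alternative, the same loop can be sketched with NumPy broadcasting, storing an integer cluster label per sample instead of a copy of the center coordinates. The function name kmeans and its parameters rng and max_iter are assumptions made for illustration and are not part of the original script. Plotting is left out.

```python
import numpy as np


def kmeans(points, k, rng=None, max_iter=100):
    """Minimal K-means sketch; returns (centers, labels, sse)."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: pick k distinct samples as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)
    for _ in range(max_iter):  # max_iter is a hypothetical safety cap
        # Step 2: distances from every sample to every center, shape (N, k).
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the centers would not move
        labels = new_labels
        # Step 3: move each center to the mean of its cluster.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)
    # Sum of squared errors, the same quantity getE() computes.
    sse = float(((points - centers[labels]) ** 2).sum())
    return centers, labels, sse


if __name__ == "__main__":
    data = np.random.default_rng(0).random((100, 2))
    centers, labels, sse = kmeans(data, k=3)
    print(centers)
    print(sse)
```

Comparing labels between iterations avoids testing floating-point error values for equality, and the label array can be passed directly to plt.scatter through its c argument if a plot is wanted.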

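Result: running the matplotlib script animates the clustering. The samples are colored by cluster, the centers are drawn in black, and the plot is redrawn every 0.2 seconds until the centers stop moving.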
