Hierarchical Clustering Algorithm

Reposted article, 2016-03-04 17:13. Tag: python

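Hierarchical clustering starts with a set of clusters, one for each original data item. The main loop of the function tries every possible pair of clusters and computes how closely the two are related (expressed as a distance, where a smaller value means a closer match) in order to find the best pair. The two clusters in the best pair are merged into a single new cluster, whose data is the element-wise average of the data of the two old clusters. The loop repeats until only one cluster remains.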
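The implementation that follows relies on two helpers that this post does not show: a `bicluster` node class and a `pearson` distance function. The block below is a minimal sketch of what they might look like, inferred only from how they are called; the field names, the zero-denominator handling, and the choice of returning 1 minus the correlation (so that smaller means closer) are assumptions rather than the original author's code.

```python
from math import sqrt


class bicluster:
    """One node of the cluster tree (hypothetical sketch, not from the post)."""

    def __init__(self, vec, left=None, right=None, distance=0.0, id=None):
        self.vec = vec            # data vector for this cluster
        self.left = left          # child clusters; None for an original row
        self.right = right
        self.distance = distance  # distance between the two merged children
        self.id = id              # >= 0 for original rows, negative for merges


def pearson(v1, v2):
    """Return 1 - Pearson correlation, so that smaller means more similar."""
    n = float(len(v1))
    sum1, sum2 = sum(v1), sum(v2)
    sum1_sq = sum(x * x for x in v1)
    sum2_sq = sum(x * x for x in v2)
    p_sum = sum(x * y for x, y in zip(v1, v2))

    num = p_sum - sum1 * sum2 / n
    spread = (sum1_sq - sum1 ** 2 / n) * (sum2_sq - sum2 ** 2 / n)
    den = sqrt(max(spread, 0.0))  # clamp tiny negative rounding errors
    if den == 0:
        return 0.0  # assumption: a constant vector is treated as distance 0
    return 1.0 - num / den
```

Because `distance=pearson` is a default argument, it is evaluated when `hcluster` is defined, so the helpers have to be defined before the function.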

Python implementation:

def hcluster(rows,distance=pearson):
  distances={}
  currentclustid=-1

  # Clusters are initially just the rows
  clust=[bicluster(rows[i],id=i) for i in range(len(rows))]

  while len(clust)>1:
    lowestpair=(0,1)
    closest=distance(clust[0].vec,clust[1].vec)
    print("closest", closest)
    # loop through every pair looking for the smallest distance
    for i in range(len(clust)):
      for j in range(i+1,len(clust)):
        # distances is the cache of distance calculations
        if (clust[i].id,clust[j].id) not in distances: 
          distances[(clust[i].id,clust[j].id)]=distance(clust[i].vec,clust[j].vec)

        d=distances[(clust[i].id,clust[j].id)]

        if d<closest:
          closest=d
          lowestpair=(i,j)

    # calculate the average of the two clusters
    mergevec=[
    (clust[lowestpair[0]].vec[i]+clust[lowestpair[1]].vec[i])/2.0 
    for i in range(len(clust[0].vec))]

    # create the new cluster
    newcluster=bicluster(mergevec,left=clust[lowestpair[0]],
                         right=clust[lowestpair[1]],
                         distance=closest,id=currentclustid)

    # cluster ids that weren't in the original set are negative
    currentclustid-=1
    del clust[lowestpair[1]]
    del clust[lowestpair[0]]
    clust.append(newcluster)

  return clust[0]
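The value returned by `hcluster` is the root of a binary tree: original rows are leaves with non-negative ids, and each merge is an inner node with a negative id. As a rough illustration of how such a tree might be read back, the sketch below walks a small hand-built tree. The `Node` class, `leaf_ids`, and `format_tree` are hypothetical names introduced here, and the tree is made up for the example rather than computed from real data.

```python
class Node:
    """Hypothetical stand-in for a cluster-tree node with the same fields."""

    def __init__(self, id, left=None, right=None, distance=0.0):
        self.id = id
        self.left = left
        self.right = right
        self.distance = distance


def leaf_ids(node):
    """Collect the ids of the original rows under `node`, left to right."""
    if node.left is None and node.right is None:
        return [node.id]
    return leaf_ids(node.left) + leaf_ids(node.right)


def format_tree(node, depth=0):
    """Return an indented text rendering of the tree, one node per line."""
    label = "-" if node.id < 0 else str(node.id)
    lines = ["  " * depth + label]
    for child in (node.left, node.right):
        if child is not None:
            lines.extend(format_tree(child, depth + 1))
    return lines


# Made-up tree for three rows: rows 0 and 2 are merged first (id -1),
# then row 1 is merged with that cluster (id -2).
inner = Node(-1, left=Node(0), right=Node(2), distance=0.1)
root = Node(-2, left=Node(1), right=inner, distance=0.8)

print(leaf_ids(root))
print("\n".join(format_tree(root)))
```

The same two functions should work on a real result as long as its nodes expose `id`, `left`, and `right`.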
