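Using only Python's random library, we can turn existing data into a tag cloud in HTML format. The idea is to render each word in a different size and color depending on how many times it occurs in the data.
For example, take multiple records in the following format: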
1	Gaming
1	Skateboarding
2	Girl Friend
3	Surfing the Internet
3	TED talks
4	Reading
4	Writing
5	Facebook
5	Gaming
6	Gaming
6	Martial Arts
7	Partying
7	Playing Sport
7	Travel
8	Driving
8	Socializing with Friends
9	Eating
9	Procrastinating
9	Sleeping
10	Winning
……
These can be rendered with the following result:

First, store the data in a dict, where each key is a word and each value is the number of times it occurs:
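# assumption: data is an iterable of tab-separated lines, e.g. an open file
words = {}
for line in data:
    # each record is "<id>\t<word>"; strip the trailing newline from the word
    word = line.split('\t')[1].strip()
    if word not in words:
        words[word] = 1
    else:
        words[word] += 1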
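The counting step can be tried on its own with a few records. The following is a minimal sketch, assuming each record is one `id<TAB>word` line; `sample_lines` and `count_words` are hypothetical names used only for illustration, and skipping malformed lines is an added safety choice rather than something the original code does.

```python
# Hypothetical sample records in the "id<TAB>word" format described above.
sample_lines = [
    "1\tGaming\n",
    "1\tSkateboarding\n",
    "5\tGaming\n",
    "6\tGaming\n",
    "4\tReading\n",
]


def count_words(lines):
    """Return a dict mapping each word to the number of times it appears."""
    counts = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines (an added assumption)
        word = fields[1].strip()
        counts[word] = counts.get(word, 0) + 1
    return counts


counts = count_words(sample_lines)
print(counts)  # {'Gaming': 3, 'Skateboarding': 1, 'Reading': 1}
```

Using `counts.get(word, 0) + 1` is equivalent to the if/else check, just a little shorter.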
Next, build the HTML: give each word a random color, and set its font size according to how often the word occurs.
import random

html = ""
for w, c in words.items():
    # random RGB color for each word
    color = 'rgb(%d, %d, %d)' % (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
    # font size grows linearly with the count: 10px base plus 0.1px per occurrence
    fontsize = int(c * 0.1 + 10)
    html += '<span style="font-size:' + str(fontsize) + 'px;color:' + color + ';float:left;">' + w + '</span>'

# dump it to a file
with open('result.html', 'wb') as f:
    f.write(bytes(html, 'UTF-8'))
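To see how the size rule behaves, here is a small sketch of the per-word step in isolation. The helper names `font_size` and `render_span` are hypothetical. Two things go beyond the original code and are assumptions on my part: the word is escaped with the standard library's `html.escape` so that characters such as `&` or `<` do not break the markup, and a seeded `random.Random` is passed in only to make repeated runs reproducible.

```python
import random
from html import escape  # stdlib addition, not used in the original code


def font_size(count):
    """Same linear rule as above: 10px base plus 0.1px per occurrence."""
    return int(count * 0.1 + 10)


def render_span(word, count, rng=random):
    """Build one <span> with a random RGB color and a frequency-based size."""
    color = "rgb(%d, %d, %d)" % tuple(rng.randint(0, 255) for _ in range(3))
    return '<span style="font-size:%dpx;color:%s;float:left;">%s</span>' % (
        font_size(count), color, escape(word))


rng = random.Random(0)  # seeded only so the colors repeat between runs
print(font_size(1), font_size(10), font_size(100))  # 10 11 20
print(render_span("Q&A", 3, rng))
```

With this rule a word needs ten more occurrences to grow by one pixel, so on a small data set most words come out at 10px or 11px; a larger multiplier than 0.1 may give a more visible spread.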
And with that, the tag cloud is done!
