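Using nothing but Python's random library, we can turn existing data into a tag cloud in HTML. The idea is to render each word in a different size and color, with the size depending on how many times the word occurs.
For example, take a set of records in the following format: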
1	Gaming
1	Skateboarding
2	Girl Friend
3	Surfing the Internet
3	TED talks
4	Reading
4	Writing
5	Facebook
5	Gaming
6	Gaming
6	Martial Arts
7	Partying
7	Playing Sport
7	Travel
8	Driving
8	Socializing with Friends
9	Eating
9	Procrastinating
9	Sleeping
10	Winning
……
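The code later in this post iterates over a variable called `data` and splits each line on a tab, so it presumably holds one record per line. The post does not show how `data` is built; the following is only a minimal sketch, and both the sample string and the file name `hobbies.txt` are assumptions made for illustration.

```python
# Hypothetical sample in the same "<id><TAB><word>" layout as the records above.
sample = "1\tGaming\n1\tSkateboarding\n2\tGirl Friend\n3\tSurfing the Internet\n"

# splitlines() yields one record per line without the trailing newline.
data = sample.splitlines()

# For a real file (the file name is an assumption), the equivalent would be:
# with open('hobbies.txt', encoding='utf-8') as f:
#     data = f.read().splitlines()

# Each record splits into an id and a word.
print(data[0].split('\t'))
```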
This data can be turned into a tag cloud that looks like this:
[Figure: the resulting tag cloud]
First, store the data in a dict, with each word as a key and the number of times it occurs as the value:
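words = {}  # must be a dict, not a string
for line in data:
    # each line is "<id>\t<word>"; drop the trailing newline before splitting
    word = line.rstrip('\n').split('\t')[1]
    if word not in words:
        words[word] = 1
    else:
        words[word] += 1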
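The counting step can also be wrapped in a small function. The sketch below is one possible variant, not the post's own code: the name `count_words` is hypothetical, and skipping blank or malformed lines is an added safeguard that the original loop does not have.

```python
def count_words(lines):
    """Count how often each word appears in '<id><TAB><word>' lines."""
    counts = {}
    for line in lines:
        parts = line.rstrip('\n').split('\t')
        if len(parts) < 2 or not parts[1]:
            continue  # skip blank or malformed lines (added safeguard)
        word = parts[1]
        # dict.get supplies 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical sample lines, including one blank and one malformed line.
lines = ["1\tGaming", "5\tGaming", "6\tGaming", "4\tReading", "", "broken line"]
print(count_words(lines))
```

Using `counts.get(word, 0) + 1` replaces the explicit `if`/`else` branch while producing the same counts.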
Next, build the HTML: give each word a random color, and set its font size according to how often it occurs.
import random

html = ""
for w, c in words.items():
    # pick a random RGB color for each word
    color = 'rgb(%s, %s, %s)' % (random.randint(0, 255),
                                 random.randint(0, 255),
                                 random.randint(0, 255))
    # the font size grows with the word's count
    fontsize = int(c * 0.1 + 10)
    html += '<span style="font-size:' + str(fontsize) + 'px;color:' + color + ';float:left;">' + w + '</span>'

# dump it to a file
with open('result.html', 'wb') as f:
    f.write(bytes(html, 'UTF-8'))
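With the formula `int(c * 0.1 + 10)`, a count of 1 gives 10px, a count of 9 still gives 10px, a count of 10 gives 11px, and a count of 100 gives 20px, so words in a small data set may all come out nearly the same size. One possible alternative is to map counts linearly onto a fixed pixel range. The sketch below assumes a range of 10px to 40px and uses the hypothetical names `font_size` and `build_html`; it also escapes each word with the standard library's `html.escape`, which goes slightly beyond the random-only approach of this post.

```python
import random
from html import escape  # imported by name so a variable called `html` cannot shadow it

MIN_PX, MAX_PX = 10, 40  # assumed size range, not taken from the original post


def font_size(count, lo, hi):
    """Linearly map a count in [lo, hi] to a size in [MIN_PX, MAX_PX]."""
    if hi == lo:
        return MIN_PX  # all words equally frequent: use the smallest size
    return int(MIN_PX + (count - lo) * (MAX_PX - MIN_PX) / (hi - lo))


def build_html(words):
    """Return one <span> per word, sized by count and randomly colored."""
    if not words:
        return ''
    lo, hi = min(words.values()), max(words.values())
    parts = []
    for w, c in words.items():
        color = 'rgb(%d, %d, %d)' % tuple(random.randint(0, 255) for _ in range(3))
        parts.append('<span style="font-size:%dpx;color:%s;float:left;">%s</span>'
                     % (font_size(c, lo, hi), color, escape(w)))
    return ''.join(parts)


# Hypothetical counts; 'R&D' shows why escaping matters in HTML output.
print(build_html({'Gaming': 3, 'Reading': 1, 'R&D': 2}))
```

The result can be written to `result.html` in the same way as above. Collecting the spans in a list and joining them once avoids repeated string concatenation, although for a few hundred words either approach is fine.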
And with that, we're done!