A one-call function wrapping a gensim-based LDA topic model

This article is a repost. Posted 2019-01-19 15:04. Tags: gensim, LDA

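import time

from gensim import corpora
from gensim.models import LdaMulticore


def genlda(textlist, n):
    """Train an LDA model with n topics and write the results to disk.

    textlist: list of documents, each one a list of tokens.
    """
    # Short tag taken from the current timestamp, used to tell output files apart.
    ticks = str(time.time()).replace('.', '')[-6:-1]
    nn = str(n)
    # Map tokens to integer ids, then turn each document into a bag-of-words vector.
    dictionary = corpora.Dictionary(textlist)
    corpus = [dictionary.doc2bow(text) for text in textlist]

    # Optional TF-IDF weighting (disabled; needs `from gensim import models`):
    # tfidf = models.TfidfModel(corpus)
    # corpus_tfidf = tfidf[corpus]
    # print(list(corpus_tfidf))  # TF-IDF weight of each term
    # print(list(corpus))        # bag-of-words vectors

    # Train the LDA model with n topics.
    lda = LdaMulticore(corpus=corpus, id2word=dictionary, num_topics=n, passes=100, workers=3)

    # Write the top 20 words of every topic to a text file
    # (file name prefix means "term matrix, number of topics").
    topics_r = lda.print_topics(num_topics=n, num_words=20)
    with open('詞匯矩陣主題個數' + nn + '時間' + ticks + '.txt', 'w', encoding='utf-8') as topic_file:
        for v in topics_r:
            topic_file.write(str(v) + '\n')

    # Save the model (prefix means "model, number of topics") and print the
    # topic count, the tag, and the per-word likelihood bound on the training corpus.
    lda.save('模型主題個數' + nn + '時間' + ticks)
    print('主題數', nn, ticks, lda.log_perplexity(corpus))

    # For each document, record its most probable topic
    # (prefix means "topic of each document, number of topics").
    # Columns: document index, topic id, topic probability, all probabilities, position of the maximum.
    with open('每篇分類主題個數' + nn + '時間' + ticks + '.txt', 'a+', encoding='utf-8') as f:
        for k, doc_topics in enumerate(lda.get_document_topics(corpus)):
            probs = [prob for _, prob in doc_topics]
            bz = probs.index(max(probs))
            print(k, doc_topics[bz][0], doc_topics[bz][1], probs, bz, file=f)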
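
The last loop of genlda picks, for each document, the topic with the highest probability. Below is a minimal standalone sketch of that one step, assuming the input has the same shape as one item yielded by lda.get_document_topics(corpus): a list of (topic_id, probability) pairs. The helper name dominant_topic and the sample numbers are hypothetical and are not part of the original function.

```python
def dominant_topic(doc_topics):
    """Return the (topic_id, probability) pair with the highest probability.

    doc_topics is assumed to be a list of (topic_id, probability) pairs,
    the shape gensim returns for a single document.
    Returns None when the list is empty.
    """
    if not doc_topics:
        return None
    # max with a key avoids building a separate list of probabilities.
    return max(doc_topics, key=lambda pair: pair[1])


# Hypothetical topic distributions for three documents (made-up numbers).
sample = [
    [(0, 0.10), (1, 0.85), (2, 0.05)],
    [(2, 0.97)],
    [],
]

for doc_index, doc_topics in enumerate(sample):
    print(doc_index, dominant_topic(doc_topics))
```

The empty-list guard is a design choice of this sketch: gensim discards topics whose probability falls below minimum_probability, so a document may come back with fewer pairs than there are topics, and the original index-based lookup would raise an error on an empty list.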
