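Posted by 靠谱客 blogger 专一眼睛. This article, collected during recent development work, shows how to convert the topics of an LDA model in Python into a list of the top 20 words for each topic. I found it useful and am sharing it here for reference.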

Overview

from gensim import corpora
import gensim
from gensim.models.ldamodel import LdaModel
from gensim.parsing.preprocessing import STOPWORDS

# example docs
doc1 = """
Java (Indonesian: Jawa; Javanese: ꦗꦮ; Sundanese: ᮏᮝ) is an island of Indonesia.
With a population of over 141 million (the island itself) or 145 million (the
administrative region), Java is home to 56.7 percent of the Indonesian population
and is the most populous island on Earth.[1] The Indonesian capital city, Jakarta,
is located on western Java. Much of Indonesian history took place on Java. It was
the center of powerful Hindu-Buddhist empires, the Islamic sultanates, and the core
of the colonial Dutch East Indies. Java was also the center of the Indonesian struggle
for independence during the 1930s and 1940s. Java dominates Indonesia politically,
economically and culturally.
"""

doc2 = """
Hydrogen fuel is a zero-emission fuel when burned with oxygen, if one considers water
not to be an emission. It often uses electrochemical cells, or combustion in internal
engines, to power vehicles and electric devices. It is also used in the propulsion of
spacecraft and might potentially be mass-produced and commercialized for passenger vehicles
and aircraft.Hydrogen lies in the first group and first period in the periodic table, i.e.
it is the first element on the periodic table, making it the lightest element. Since
hydrogen gas is so light, it rises in the atmosphere and is therefore rarely found in
its pure form, H2."""

doc3 = """
The giraffe (Giraffa) is a genus of African even-toed ungulate mammals, the tallest living
terrestrial animals and the largest ruminants. The genus currently consists of one species,
Giraffa camelopardalis, the type species. Seven other species are extinct, prehistoric
species known from fossils. Taxonomic classifications of one to eight extant giraffe species
have been described, based upon research into the mitochondrial and nuclear DNA, as well
as morphological measurements of Giraffa, but the IUCN currently recognizes only one
species with nine subspecies.
"""

documents = [doc1, doc2, doc3]

# lowercase each document, split on whitespace, and drop stopwords
document_wrd_splt = [[word for word in document.lower().split() if word not in STOPWORDS]
                     for document in documents]

# map each unique token to an integer id
dictionary = corpora.Dictionary(document_wrd_splt)
print(dictionary.token2id)

# bag-of-words corpus (the original referenced an undefined name `texts` here)
corpus = [dictionary.doc2bow(text) for text in document_wrd_splt]

lda = LdaModel(corpus, num_topics=3, id2word=dictionary, passes=50)

# collect the 20 highest-probability words of each topic
num_topics = 3
topic_words = []
for i in range(num_topics):
    tt = lda.get_topic_terms(i, 20)  # list of (word_id, probability) pairs
    topic_words.append([dictionary[pair[0]] for pair in tt])

# output
>>> topic_words[0]
['indonesian', 'java', 'species', 'island', 'population', 'million', '(the', 'java.', 'center', 'giraffe', 'currently', 'genus', 'city,', 'economically', 'administrative', 'east', 'sundanese:', 'itself)', 'took', '1940s.']
>>> topic_words[1]
['vehicles', 'fuel', 'hydrogen', 'periodic', 'table,', 'i.e.', 'uses', 'form,', 'considers', 'zero-emission', 'internal', 'period', 'burned', 'cells,', 'rises', 'pure', 'atmosphere', 'aircraft.hydrogen', 'water', 'engines,']
>>> topic_words[2]
['giraffa,', 'even-toed', 'living', 'described,', 'camelopardalis,', 'consists', 'extinct,', 'seven', 'fossils.', 'morphological', 'terrestrial', '(giraffa)', 'dna,', 'mitochondrial', 'nuclear', 'ruminants.', 'classifications', 'species,', 'prehistoric', 'known']

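The loop over topics can also be wrapped in a small reusable function. The following is only a sketch and not part of the original post: the name `top_words_per_topic` and its parameters are hypothetical. It assumes that the model exposes `get_topic_terms(topicid, topn)` returning (word id, probability) pairs, as used in the code above, and that `id2word` can be indexed by word id the way the gensim `Dictionary` is.

```python
def top_words_per_topic(model, id2word, num_topics, topn=20):
    """Return one list of words per topic, highest-probability words first.

    Hypothetical helper: `model` is assumed to provide
    get_topic_terms(topicid, topn) -> [(word_id, probability), ...],
    and `id2word` is assumed to map a word id to its token.
    """
    topic_words = []
    for topic_id in range(num_topics):
        terms = model.get_topic_terms(topic_id, topn)
        # keep only the token, discard the probability
        topic_words.append([id2word[word_id] for word_id, _prob in terms])
    return topic_words

# Possible usage with the objects built above (assumes `lda` and `dictionary` exist):
# topic_words = top_words_per_topic(lda, dictionary, num_topics=3, topn=20)
```

Passing the model and the dictionary as arguments keeps the helper independent of global variables, so it can be reused with a different number of topics or a different `topn`.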