Overview
from gensim import corpora
import gensim
from gensim.models.ldamodel import LdaModel
from gensim.parsing.preprocessing import STOPWORDS

# example docs
doc1 = """
Java (Indonesian: Jawa; Javanese: ꦗꦮ; Sundanese: ᮏᮝ) is an island of Indonesia.
With a population of over 141 million (the island itself) or 145 million (the
administrative region), Java is home to 56.7 percent of the Indonesian population
and is the most populous island on Earth.[1] The Indonesian capital city, Jakarta,
is located on western Java. Much of Indonesian history took place on Java. It was
the center of powerful Hindu-Buddhist empires, the Islamic sultanates, and the core
of the colonial Dutch East Indies. Java was also the center of the Indonesian struggle
for independence during the 1930s and 1940s. Java dominates Indonesia politically,
economically and culturally.
"""

doc2 = """
Hydrogen fuel is a zero-emission fuel when burned with oxygen, if one considers water
not to be an emission. It often uses electrochemical cells, or combustion in internal
engines, to power vehicles and electric devices. It is also used in the propulsion of
spacecraft and might potentially be mass-produced and commercialized for passenger vehicles
and aircraft.Hydrogen lies in the first group and first period in the periodic table, i.e.
it is the first element on the periodic table, making it the lightest element. Since
hydrogen gas is so light, it rises in the atmosphere and is therefore rarely found in
its pure form, H2."""

doc3 = """
The giraffe (Giraffa) is a genus of African even-toed ungulate mammals, the tallest living
terrestrial animals and the largest ruminants. The genus currently consists of one species,
Giraffa camelopardalis, the type species. Seven other species are extinct, prehistoric
species known from fossils. Taxonomic classifications of one to eight extant giraffe species
have been described, based upon research into the mitochondrial and nuclear DNA, as well
as morphological measurements of Giraffa, but the IUCN currently recognizes only one
species with nine subspecies.
"""

documents = [doc1, doc2, doc3]

# Lowercase each document, split on whitespace, and drop stopwords
document_wrd_splt = [
    [word for word in document.lower().split() if word not in STOPWORDS]
    for document in documents
]

# Map each unique token to an integer id
dictionary = corpora.Dictionary(document_wrd_splt)
print(dictionary.token2id)

# Convert each tokenized document to a bag-of-words vector
# (the original iterated over an undefined name, `texts`)
corpus = [dictionary.doc2bow(text) for text in document_wrd_splt]

num_topics = 3
lda = LdaModel(corpus, num_topics=num_topics, id2word=dictionary, passes=50)

# Collect the 20 highest-probability words for each topic
topic_words = []
for i in range(num_topics):
    tt = lda.get_topic_terms(i, 20)  # list of (word_id, probability) pairs
    topic_words.append([dictionary[pair[0]] for pair in tt])

# output (LDA training is randomized, so results vary between runs)
>>> topic_words[0]
['indonesian', 'java', 'species', 'island', 'population', 'million', '(the', 'java.', 'center', 'giraffe', 'currently', 'genus', 'city,', 'economically', 'administrative', 'east', 'sundanese:', 'itself)', 'took', '1940s.']
>>> topic_words[1]
['vehicles', 'fuel', 'hydrogen', 'periodic', 'table,', 'i.e.', 'uses', 'form,', 'considers', 'zero-emission', 'internal', 'period', 'burned', 'cells,', 'rises', 'pure', 'atmosphere', 'aircraft.hydrogen', 'water', 'engines,']
>>> topic_words[2]
['giraffa,', 'even-toed', 'living', 'described,', 'camelopardalis,', 'consists', 'extinct,', 'seven', 'fossils.', 'morphological', 'terrestrial', '(giraffa)', 'dna,', 'mitochondrial', 'nuclear', 'ruminants.', 'classifications', 'species,', 'prehistoric', 'known']
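The topic lists in the output contain tokens such as 'java.', 'city,' and '(the', because splitting on whitespace leaves punctuation attached to the words. One possible cleanup is a regex-based tokenizer, sketched below with the standard library only. The names `tokenize`, `TOKEN_PATTERN` and `EXAMPLE_STOPWORDS` are hypothetical and not part of gensim or of the original code, and the tiny stopword set is a stand-in for gensim's `STOPWORDS`.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only;
# the original code uses gensim's much larger STOPWORDS set.
EXAMPLE_STOPWORDS = frozenset({"the", "of", "is", "a", "an", "and", "on", "in", "to"})

# Runs of word characters, optionally joined by hyphens or apostrophes,
# so "zero-emission" stays one token while "city," loses its comma.
TOKEN_PATTERN = re.compile(r"\w+(?:[-']\w+)*")


def tokenize(text, stopwords=EXAMPLE_STOPWORDS):
    """Lowercase text, extract word tokens, and drop stopwords."""
    return [tok for tok in TOKEN_PATTERN.findall(text.lower()) if tok not in stopwords]


sample = "The Indonesian capital city, Jakarta, is located on western Java."
tokens = tokenize(sample)
print(tokens)
```

Under these assumptions, the whitespace split could be swapped for something like `[tokenize(document, STOPWORDS) for document in documents]` before building the dictionary. The resulting topics would differ from the output shown, since the vocabulary changes.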