Overview
1. Create the DataFrame
// Each document is a sequence of tokens; wrapping each in Tuple1 lets
// createDataFrame infer a single-column schema, which is then renamed to "text".
val documentDF = sqlContext.createDataFrame(Seq(
  "Hi I heard about Spark".split(" "),
  "I wish Java could use case classes".split(" "),
  "Logistic regression models are neat".split(" ")
).map(Tuple1.apply)).toDF("text")
The rows of documentDF, shown as JSON:
{"text":["I","wish","Java","could","use","case","classes"]}
{"text":["Logistic","regression","models","are","neat"]}
{"text":["Hi","I","heard","about","Spark"]}
2. Create the Word2Vec estimator
import org.apache.spark.ml.feature.Word2Vec

val word2Vec = new Word2Vec()
  .setInputCol("text")
  .setOutputCol("result")
  .setVectorSize(3)
  .setMinCount(0)
setVectorSize sets how many dimensions the vector for each group of words has; here we choose 3. setMinCount(0) keeps every token in the vocabulary, which matters for a corpus this small.
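Besides vectorSize and minCount, the Word2Vec estimator exposes a few other training knobs. A hedged sketch of a more fully configured estimator follows; the values are illustrative, not tuned for this toy corpus:

val tunedWord2Vec = new Word2Vec()
  .setInputCol("text")
  .setOutputCol("result")
  .setVectorSize(3)       // dimensionality of the word (and document) vectors
  .setMinCount(0)         // keep every token; the default of 5 would drop most words here
  .setMaxIter(1)          // number of passes over the data (default 1)
  .setStepSize(0.025)     // initial learning rate (default 0.025)
  .setNumPartitions(1)    // partitions used during training (default 1)
  .setSeed(42L)           // fix the seed so repeated runs give the same vectors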
3. Fit the model and transform the documents
val model = word2Vec.fit(documentDF)
val result = model.transform(documentDF)
result.select("result").take(3).foreach(println)
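Besides transforming documents, the fitted Word2VecModel also exposes the learned per-word vectors and a nearest-neighbour lookup. A small sketch; the query word "Spark" and the neighbour count 2 are just examples, and on a corpus this tiny the returned "synonyms" will not be meaningful:

// One row per vocabulary word with its learned vector
model.getVectors.show(false)
// The 2 words whose vectors are closest (by cosine similarity) to that of "Spark"
model.findSynonyms("Spark", 2).show(false)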
4. Output
scala> result.select("result").take(3).foreach(println)
[[-7.559644058346749E-4,-0.0235147787258029,9.437099099159241E-4]]
[[-0.06844028996835862,-0.029905967015240873,0.07320201684654291]]
[[0.006268330290913582,0.02445013374090195,0.06141428500413895]]
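Each printed row is the document vector obtained by averaging the vectors of the words in that document, so every array has the 3 elements requested by setVectorSize(3). These vectors can feed downstream pipeline stages or be compared directly; below is a rough sketch of cosine similarity between the first two document vectors, assuming the Spark 1.x mllib Vector class (on Spark 2.x+ import org.apache.spark.ml.linalg.Vector instead):

import org.apache.spark.mllib.linalg.Vector

// Collect the first two document vectors to the driver (fine for a toy dataset)
val vecs = result.select("result").take(2).map(_.getAs[Vector](0).toArray)
val Array(a, b) = vecs

// cosine(a, b) = dot(a, b) / (||a|| * ||b||)
val dot   = a.zip(b).map { case (x, y) => x * y }.sum
val normA = math.sqrt(a.map(x => x * x).sum)
val normB = math.sqrt(b.map(x => x * x).sum)
println(s"cosine similarity: ${dot / (normA * normB)}")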
Finally
That is the complete walkthrough of using Word2Vec in Spark ML: build a DataFrame of tokenized documents, configure the Word2Vec estimator, fit it, and transform each document into a fixed-length vector. Hopefully it helps with similar development problems.