Scala: How to replace values in DataFrames using Scala

Overview

For example, I want to replace all numbers equal to 0.2 in a column with 0. How can I do that in Scala? Thanks.

Edit:

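+----+-----+-----+--------------------+-----+
|year| make|model|             comment|blank|
+----+-----+-----+--------------------+-----+
|2012|Tesla|    S|          No comment|     |
|1997| Ford| E350|Go get one now th...|     |
|2015|Chevy| Volt|                null| null|
+----+-----+-----+--------------------+-----+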

This is my DataFrame. I'm trying to change Tesla in the make column to S.

Solution

Note:

As mentioned by Olivier Girardot, this answer is not optimized; the withColumn solution (Azeroth2b's answer) is the one to use.

I cannot delete this answer because it has been accepted.
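
For reference, the withColumn approach that the note recommends might look roughly like the sketch below. It is not Azeroth2b's actual answer, which is not reproduced here. It assumes Spark 2.x or later with a locally created SparkSession (in spark-shell, `spark` already exists), and the column names are taken from the table in the question. The case-insensitive comparison mirrors the `toLowerCase` check in the code further down.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lower, when}

// Assumption: Spark 2.x+ running locally; skip this in spark-shell, where `spark` is predefined.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("replace-values")
  .getOrCreate()
import spark.implicits._

// Same sample rows as the answer; column names come from the question's table.
val df = Seq(
  (2012, "Tesla", "S"),
  (1997, "Ford", "E350"),
  (2015, "Chevy", "Volt")
).toDF("year", "make", "model")

// Overwrite `make`: "S" where it matches "tesla" (ignoring case),
// and the existing value everywhere else.
val replaced = df.withColumn(
  "make",
  when(lower(col("make")) === "tesla", "S").otherwise(col("make"))
)

replaced.show()
```

Because `when` and `otherwise` build a column expression, the replacement stays inside Spark SQL rather than being applied row by row in user code, which is presumably why the note calls this the solution to use.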

Here is my take on this one:

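import org.apache.spark.sql.{Row, SQLContext}

// `sc` is the SparkContext, e.g. the one provided by spark-shell.
val rdd = sc.parallelize(
  List((2012, "Tesla", "S"), (1997, "Ford", "E350"), (2015, "Chevy", "Volt"))
)

val sqlContext = new SQLContext(sc)
// This is used to implicitly convert an RDD to a DataFrame.
import sqlContext.implicits._

val dataframe = rdd.toDF()

// Collect to the driver before printing so the output shows up locally.
dataframe.collect().foreach(println)

// Map over the underlying RDD[Row]; this works on Spark 1.x and 2.x alike,
// whereas mapping a DataFrame to Row in 2.x needs an explicit Encoder.
dataframe.rdd.map { row =>
  val row1 = row.getAs[String](1)
  val make = if (row1.toLowerCase == "tesla") "S" else row1
  Row(row(0), make, row(2))
}.collect().foreach(println)
// [2012,S,S]
// [1997,Ford,E350]
// [2015,Chevy,Volt]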

You can actually use map directly on the DataFrame's rows (on Spark 2.x and later, go through dataframe.rdd, because mapping a DataFrame to Row otherwise requires an explicit Encoder).

So you basically check column 1 for the string tesla, compared case-insensitively.

If it's tesla, use the value S for make; otherwise use the current value of column 1.

Then build a new Row from all the data in the row, using zero-based indexes (Row(row(0), make, row(2)) in my example).

There is probably a better way to do it. I am not yet that familiar with the Spark umbrella.
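
For the numeric case in the original question (replacing 0.2 with 0 in a column), one option is `DataFrameNaFunctions.replace`, reached through `df.na`. This is only a sketch: the question does not show the numeric column, so the `scores` data and its `id` and `score` column names are hypothetical, and a locally created SparkSession on Spark 2.x or later is assumed.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: Spark 2.x+ running locally; skip this in spark-shell, where `spark` is predefined.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("replace-numbers")
  .getOrCreate()
import spark.implicits._

// Hypothetical data: the question does not include the numeric column itself.
val scores = Seq(("a", 0.2), ("b", 0.5), ("c", 0.2)).toDF("id", "score")

// Replace every value equal to 0.2 in `score` with 0.0; other values are left as they are.
val cleaned = scores.na.replace("score", Map(0.2 -> 0.0))

cleaned.show()
```

The match is an exact equality test on doubles, so values that are only approximately 0.2 after arithmetic would not be replaced; a `when` condition with a tolerance would be needed in that case.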
