概述
1.concat(exprs: Column*): Column
function note: Concatenates multiple input columns together into a single column. The function works with strings, binary and compatible array columns.
我的问题: dateframe中的某列数据"XX_BM", 例如:值为 0008151223000316, 现在我想 把Column("XX_BM")中的所有值 变为:例如:0008151223000316sfjd。
0008151223000316 + sfjd
解决方案: in Scala
var tmp = dfval.col("XX_BM")
var result = concat(tmp,lit("sfjd"))
dfval = dfval.withColumn("XX_BM", result)
2.regexp_replace(e: Column, pattern: String, replacement: String): Column
function note: Replace all substrings of the specified string value that match regexp with rep.
我的问题:I got some dataframe with 170 columns. In one column I have a "name" string and this string sometimes can have a special symbols like "'" that are not appropriate, when I am writing them to Postgres. Can I make something like that:【问题来自】
Df[$'name']=Df[$'name'].map(x => x.replaceAll("'","")) ?
但是:I don't want to parse full DataFrame,because it's very huge.Help me please
解决方案:You can't mutate DataFrames, you can only transform them into new DataFrames with updated values. In this case - you can use the regex_replace function to perform the mapping on name column:
import org.apache.spark.sql.functions._
val updatedDf = Df.withColumn("name", regexp_replace(col("name"), ",", ""))
3.regexp_replace(e: Column, pattern: Column, replacement: Column): Column
function note : Replace all substrings of the specified string value that match regexp with rep
最后
以上就是烂漫大山为你收集整理的dataframe scala 修改值_Spark DataFrame:提取某列并修改/ Column更新、替换的全部内容,希望文章能够帮你解决dataframe scala 修改值_Spark DataFrame:提取某列并修改/ Column更新、替换所遇到的程序开发问题。
如果觉得靠谱客网站的内容还不错,欢迎将靠谱客网站推荐给程序员好友。
发表评论 取消回复