Getting Started with Spark: A Few Small Examples

Overview

1. pyspark
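A minimal interactive session sketch: pyspark starts the Python shell and predefines sc (the SparkContext), so you can run a small job right away.

  >>> rdd = sc.parallelize(range(100))          # distribute the numbers 0..99
  >>> rdd.filter(lambda x: x % 2 == 0).count()  # count the even ones
  50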

2. spark-shell

Spark web UI:

http://127.0.0.1:4040/jobs/
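Both shells create a SparkContext, and the web UI above is only available while a shell (or any other Spark application) is running. For example, assuming Spark's bin directory is on the PATH:

  spark-shell --master local[4]

While this shell is open, http://127.0.0.1:4040/jobs/ shows the jobs it runs.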

3. Configuring log output:

Copy the template file log4j.properties.template to conf/log4j.properties to use it as the logging configuration file, then find the following line:

log4j.rootCategory=INFO, console

Then lower the log level with the following setting so that only warnings and more severe messages are shown:

log4j.rootCategory=WARN, console
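For example, assuming the commands are run from the Spark installation directory on a Unix-like shell:

  cp conf/log4j.properties.template conf/log4j.properties
  # then edit conf/log4j.properties and change INFO to WARN on the rootCategory line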

4. Changing where Spark stores temporary files:

In conf/spark-defaults.conf, add the following line:

spark.local.dir /diskb/sparktmp,/diskc/sparktmp,/diskd/sparktmp,/diske/sparktmp,/diskf/sparktmp,/diskg/sparktmp

Note: multiple directories can be configured, separated by commas (",").

You can also add the following line to conf/spark-env.sh:

export SPARK_LOCAL_DIRS=/diskb/sparktmp,/diskc/sparktmp,/diskd/sparktmp,/diske/sparktmp,/diskf/sparktmp,/diskg/sparktmp

If both spark-env.sh and spark-defaults.conf are configured, SPARK_LOCAL_DIRS overrides spark.local.dir.

5. Several ways to run a Spark program:


spark-submit:


spark-submit
  --class <main class to run>
  --master <master URL, e.g. spark://207.184.161.138:7077>
  --deploy-mode <client | cluster>
  --conf <key>=<value>
  --executor-memory <memory per executor>
  --total-executor-cores <total cores for the application>
  <application jar>
  [application args]
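As an illustration, a fully filled-in submission to a hypothetical standalone cluster might look like the following (the master URL, memory size, and core count are placeholder values, not recommendations):

  spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master spark://207.184.161.138:7077 \
    --deploy-mode cluster \
    --conf spark.eventLog.enabled=false \
    --executor-memory 2G \
    --total-executor-cores 10 \
    examples/jars/spark-examples_2.11-2.3.0.jar \
    100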

PI:

spark-submit --class org.apache.spark.examples.JavaSparkPi --master local[4] spark-example.jar
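JavaSparkPi estimates π by Monte Carlo sampling. As an illustration only (not the bundled example itself), a rough PySpark sketch of the same idea:

  from pyspark.sql import SparkSession
  import random

  spark = SparkSession.builder.appName("PiSketch").master("local[4]").getOrCreate()
  n = 100000  # number of random points to sample
  def inside(_):
      x, y = random.random(), random.random()
      return x * x + y * y < 1  # point falls inside the unit quarter circle
  count = spark.sparkContext.parallelize(range(n)).filter(inside).count()
  print("Pi is roughly", 4.0 * count / n)
  spark.stop()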


wordcount:

spark-submit --class org.apache.spark.examples.JavaWordCount --master local[4] spark-example.jar hdfs://localhost:9000/user/lenovo/wordcount/README.md
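JavaWordCount counts word occurrences in the file passed as the argument. A minimal PySpark sketch of the same idea, assuming the same HDFS path exists:

  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("WordCountSketch").master("local[4]").getOrCreate()
  # read the text file as an RDD of lines (path taken from the command above)
  lines = spark.sparkContext.textFile("hdfs://localhost:9000/user/lenovo/wordcount/README.md")
  counts = (lines.flatMap(lambda line: line.split(" "))
                 .map(lambda word: (word, 1))
                 .reduceByKey(lambda a, b: a + b))
  for word, count in counts.take(20):  # print a sample of the results
      print(word, count)
  spark.stop()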


sql:

spark-submit --class org.apache.spark.examples.sql.JavaSparkSQLExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar
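JavaSparkSQLExample demonstrates the DataFrame/SQL API. A minimal PySpark sketch of the same idea, using a small in-memory table (the rows are made up for illustration):

  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("SqlSketch").master("local[4]").getOrCreate()
  df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
  df.createOrReplaceTempView("people")   # register the DataFrame as a SQL view
  spark.sql("SELECT name FROM people WHERE id > 1").show()
  spark.stop()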


structured streaming:

Download netcat (https://eternallybored.org/misc/netcat/netcat-win32-1.12.zip),

unzip it, and copy nc.exe to C:\Windows. Then start a listener:

nc -l -p 9999

spark-submit --class org.apache.spark.examples.sql.streaming.StructuredNetworkWordCount --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar localhost 9999
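StructuredNetworkWordCount reads lines from the socket fed by netcat and keeps a running word count. A minimal PySpark sketch of the same idea (host and port match the command above):

  from pyspark.sql import SparkSession
  from pyspark.sql.functions import explode, split

  spark = SparkSession.builder.appName("StructuredWordCountSketch").master("local[4]").getOrCreate()
  # stream of lines from the netcat socket
  lines = (spark.readStream.format("socket")
                .option("host", "localhost").option("port", 9999).load())
  words = lines.select(explode(split(lines.value, " ")).alias("word"))
  counts = words.groupBy("word").count()
  # print the running counts to the console after every micro-batch
  query = counts.writeStream.outputMode("complete").format("console").start()
  query.awaitTermination()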


spark streaming:

spark-submit --class org.apache.spark.examples.streaming.JavaNetworkWordCount --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar localhost 9999
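JavaNetworkWordCount does the same thing with the older DStream API. A minimal PySpark sketch, again assuming netcat is listening on localhost:9999:

  from pyspark import SparkContext
  from pyspark.streaming import StreamingContext

  sc = SparkContext("local[4]", "StreamingWordCountSketch")
  ssc = StreamingContext(sc, 1)                      # 1-second micro-batches
  lines = ssc.socketTextStream("localhost", 9999)
  counts = (lines.flatMap(lambda line: line.split(" "))
                 .map(lambda word: (word, 1))
                 .reduceByKey(lambda a, b: a + b))
  counts.pprint()                                    # print each batch's counts
  ssc.start()
  ssc.awaitTermination()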


random forest:

spark-submit --class org.apache.spark.examples.ml.JavaRandomForestClassifierExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar
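JavaRandomForestClassifierExample trains a random forest on the bundled sample data. A minimal PySpark sketch of the same API on a tiny made-up dataset (values are for illustration only):

  from pyspark.sql import SparkSession
  from pyspark.ml.classification import RandomForestClassifier
  from pyspark.ml.linalg import Vectors

  spark = SparkSession.builder.appName("RandomForestSketch").master("local[4]").getOrCreate()
  data = spark.createDataFrame([
      (1.0, Vectors.dense([0.0, 1.1, 0.1])),
      (0.0, Vectors.dense([2.0, 1.0, -1.0])),
      (0.0, Vectors.dense([2.0, 1.3, 1.0])),
      (1.0, Vectors.dense([0.0, 1.2, -0.5]))], ["label", "features"])
  rf = RandomForestClassifier(labelCol="label", featuresCol="features", numTrees=10)
  model = rf.fit(data)                               # train on the toy data
  model.transform(data).select("label", "prediction").show()
  spark.stop()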


pipeline:

spark-submit --class org.apache.spark.examples.ml.JavaEstimatorTransformerParamExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar

 spark-submit --class org.apache.spark.examples.ml.PipelineExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar
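Both examples illustrate the spark.ml Pipeline/Estimator/Transformer pattern. A minimal PySpark sketch of a text-classification pipeline in the same spirit (the training rows are made up for illustration):

  from pyspark.sql import SparkSession
  from pyspark.ml import Pipeline
  from pyspark.ml.classification import LogisticRegression
  from pyspark.ml.feature import HashingTF, Tokenizer

  spark = SparkSession.builder.appName("PipelineSketch").master("local[4]").getOrCreate()
  training = spark.createDataFrame([
      (0, "a b c d e spark", 1.0),
      (1, "b d", 0.0),
      (2, "spark f g h", 1.0),
      (3, "hadoop mapreduce", 0.0)], ["id", "text", "label"])
  tokenizer = Tokenizer(inputCol="text", outputCol="words")      # split text into words
  hashingTF = HashingTF(inputCol="words", outputCol="features")  # hash words into feature vectors
  lr = LogisticRegression(maxIter=10, regParam=0.001)
  pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])         # chain the three stages
  model = pipeline.fit(training)
  model.transform(training).select("id", "text", "prediction").show()
  spark.stop()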

Source: 我是码农 (please keep the source and link when reposting). Original link: http://www.54manong.com/?id=1221

