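Overview
1. Environment
spark: five Zybo boards; 192.168.1.1 is the master and the other four are slaves
hadoop: 192.168.1.1 (with an external SanDisk drive attached)
2. Single-node Hadoop test:
If the test fails with an out-of-memory error, add swap space as follows.
Check the current memory and swap usage: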
free -m
cd /mnt
mkdir swap
cd swap/
Create a file to hold the swap space:
dd if=/dev/zero of=swapfile bs=1024 count=1000000
Format the file as swap:
mkswap swapfile
Activate the swap file:
swapon swapfile
free -m
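As a quick check on the dd parameters above, bs=1024 with count=1000000 gives a file of 1,024,000,000 bytes, a little under 1 GiB, so the second free -m should report roughly that much extra swap. A minimal Python sketch of the arithmetic (the variable names are only illustrative):

```python
# Size of the file created by:
#   dd if=/dev/zero of=swapfile bs=1024 count=1000000
block_size = 1024        # bytes per block (bs)
block_count = 1_000_000  # number of blocks (count)

size_bytes = block_size * block_count
size_mib = size_bytes / (1024 * 1024)

print(size_bytes)          # 1024000000
print(round(size_mib, 1))  # 976.6
```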
With the swap file active, the test passes.
3. Spark + Hadoop test
SPARK_MASTER_IP=192.168.1.1 ./sbin/start-all.sh
MASTER=spark://192.168.1.1:7077 ./bin/pyspark
# Read the input file from HDFS
file = sc.textFile("hdfs://192.168.1.1:9000/in/file")
# Split lines into words, pair each word with 1, then sum per word
counts = (file.flatMap(lambda line: line.split(" "))
              .map(lambda word: (word, 1))
              .reduceByKey(lambda a, b: a + b))
# Save the result to HDFS and to the mounted disk
counts.saveAsTextFile("hdfs://192.168.1.1:9000/out/mycount")
counts.saveAsTextFile("/mnt/mycount")
# Bring the result back to the driver
counts.collect()
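To make the three stages of the pipeline above easier to follow, here is a minimal local sketch in plain Python that mimics flatMap, map and reduceByKey on a small list of lines. It does not use Spark; the function name word_count and the sample lines are made up for illustration.

```python
from collections import defaultdict

def word_count(lines):
    """Count words the way the flatMap / map / reduceByKey pipeline does."""
    # flatMap: split every line on single spaces and flatten the result
    words = [word for line in lines for word in line.split(" ")]
    # map: pair every word with the count 1
    pairs = [(word, 1) for word in words]
    # reduceByKey: sum the counts that share the same word
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

sample = ["hello zybo", "hello spark hello"]
print(word_count(sample))  # {'hello': 3, 'zybo': 1, 'spark': 1}
```

Note that split(" ") yields empty strings when a line contains consecutive spaces, so those would be counted as a "word" too, both here and in the Spark version.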
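Error 1:
java.net.ConnectException: Call From zynq/192.168.1.1 to spark1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
This happens because Hadoop was started as root, while Spark accesses the Hadoop file system remotely and does not have permission to do so. A "Connection refused" on spark1:8020 can also mean that nothing is listening on that port; in the configuration in section 4 the NameNode RPC address is 192.168.1.1:9000, so it is worth checking that the HDFS URLs use port 9000.
Fix: in a test environment you can turn off HDFS permission checking. Open etc/hadoop/hdfs-site.xml and set the dfs.permissions property to false (the default is true):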
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
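To confirm that the property was actually changed, the file can be read back programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the helper name get_property and the inline sample are assumptions for illustration, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

def get_property(xml_text, name):
    """Return the <value> of the <property> whose <name> matches, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

sample = """<configuration>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>"""

print(get_property(sample, "dfs.permissions"))  # false
```

For a real file, ET.parse(path).getroot() could replace ET.fromstring, with path pointing at etc/hadoop/hdfs-site.xml.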
4. Appendix: my configuration files
go.sh:
#! /bin/sh -
mount /dev/sda1 /mnt/
cd /mnt/swap/
swapon swapfile
free -m
cd /root/hadoop-2.4.0/
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
sbin/hadoop-daemon.sh start secondarynamenode
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
sbin/mr-jobhistory-daemon.sh start historyserver
jps
while ! netstat -ntlp | grep -q 9000
do
sleep 1
done
netstat -ntlp
echo hadoop start successfully
cd /root/spark-0.9.1-bin-hadoop2
SPARK_MASTER_IP=192.168.1.1 ./sbin/start-all.sh
jps
while ! netstat -ntlp | grep -q 7077
do
sleep 1
done
netstat -ntlp
echo spark start successfully
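The two wait loops in go.sh poll netstat until ports 9000 (HDFS NameNode) and 7077 (Spark master) show up. The same idea can be sketched in Python by trying to open a TCP connection until it succeeds. This is only an illustration under assumptions: the function name wait_for_port and the timeout values are hypothetical, and unlike the shell loops it gives up after a timeout instead of waiting forever.

```python
import socket
import time

def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds; False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means some service is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or unreachable: wait a bit and retry
            time.sleep(interval)
    return False
```

On the master board this might be called as wait_for_port("192.168.1.1", 9000) after starting Hadoop and wait_for_port("192.168.1.1", 7077) after starting Spark.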
/etc/hosts
#127.0.0.1 localhost zynq
192.168.1.1 spark1 localhost zynq
#192.168.1.1 spark1
192.168.1.2 spark2
192.168.1.3 spark3
192.168.1.4 spark4
192.168.1.5 spark5
192.168.1.100 sparkMaster
#::1 localhost ip6-localhost ip6-loopback
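With the 127.0.0.1 line commented out, the names localhost and zynq are listed on the 192.168.1.1 line, which is presumably why hdfs://localhost:9000 in core-site.xml ends up on the master's LAN address. A minimal sketch that parses a hosts file in this format and shows the mapping; the function parse_hosts is hypothetical, and "first entry wins" is an assumption about how duplicates are resolved.

```python
def parse_hosts(text):
    """Map each hostname to its IP address, ignoring comments and blank lines."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption: the first entry for a name takes precedence
            table.setdefault(name, ip)
    return table

hosts = """#127.0.0.1 localhost zynq
192.168.1.1 spark1 localhost zynq
192.168.1.2 spark2
192.168.1.100 sparkMaster
"""

table = parse_hosts(hosts)
print(table["spark1"], table["localhost"], table["sparkMaster"])
# 192.168.1.1 192.168.1.1 192.168.1.100
```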
/etc/profile
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$PATH
export JAVA_HOME=/usr/lib/jdk1.7.0_55
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/root/hadoop-2.4.0
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
ifconfig eth2 hw ether 00:0a:35:00:01:01
ifconfig eth2 192.168.1.1/24 up
$HADOOP_HOME/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
$HADOOP_HOME/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/hadoop/tmp</value>
</property>
</configuration>
$HADOOP_HOME/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>192.168.1.1:9000</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/mnt/datanode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/mnt/namenode</value>
</property>
</configuration>
done
Reposted from: https://www.cnblogs.com/shenerguang/p/3834006.html