Overview
After the cluster had been running for nearly 50 days without a single restart, an ordinary HQL query (just a simple SELECT) suddenly failed. The full error is reproduced below; the eventual fix was simply to restart the cluster manually.
While restarting, however, sh stop-all.sh could not shut the cluster down. It printed:
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [192.168.1.190]
192.168.1.190: no namenode to stop
192.168.1.191: no datanode to stop
192.168.1.193: no datanode to stop
192.168.1.194: no datanode to stop
192.168.1.192: no datanode to stop
Stopping secondary namenodes [192.168.1.190]
192.168.1.190: no secondarynamenode to stop
stopping yarn daemons
no resourcemanager to stop
192.168.1.191: no nodemanager to stop
192.168.1.193: no nodemanager to stop
192.168.1.192: no nodemanager to stop
192.168.1.194: no nodemanager to stop
no proxyserver to stop
Yet running jps on each slave (DataNode) node showed that every daemon process was still alive; they simply could not be stopped through the scripts.
The root cause was that the slave nodes had lost communication with the master node, so the only option was to log in to each machine one by one and manually kill the corresponding processes.
After restarting the cluster, the same HQL ran again and the error no longer appeared. What exactly broke the master-slave communication is unclear; for now it can only be attributed to instability in the Hadoop cluster itself.
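The per-node manual cleanup described above can be sketched as a small shell helper: filter jps output down to the Hadoop daemon PIDs and kill them. This is a sketch, not the author's exact commands; the daemon names are the standard Hadoop 2.x ones, so adjust them for your deployment before using it.

```shell
#!/usr/bin/env bash
# Extract PIDs of Hadoop daemons from jps-style "PID Name" lines.
# Daemon names below are the standard Hadoop 2.x ones (assumption).
hadoop_pids() {
  grep -E ' (NameNode|DataNode|SecondaryNameNode|ResourceManager|NodeManager)$' \
    | awk '{print $1}'
}

# On a live node you would run, on each machine in turn:
#   jps | hadoop_pids | xargs -r kill
# Demo with sample jps output (hypothetical PIDs):
sample_jps="2101 DataNode
2345 NodeManager
2890 Jps"
echo "$sample_jps" | hadoop_pids
```

`xargs -r` avoids invoking kill when no daemon matches, which is what you want on a node that is already clean.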
=========== Error messages ===========
Logging initialized using configuration in file:/opt/hive/apache-hive-1.2.1-bin/conf/hive-log4j.properties
OK
Time taken: 1.052 seconds
Query ID = hadoop_20161102163133_5e027a14-3452-4278-9057-e0a244a61952
Total jobs = 1
16/11/02 16:31:37 WARN conf.HiveConf: HiveConf of name hive.files.umask.value does not exist
Execution log at: /tmp/hadoop/hadoop_20161102163133_5e027a14-3452-4278-9057-e0a244a61952.log
2016-11-02 16:31:37 Starting to launch local task to process map join; maximum memory = 508559360
2016-11-02 16:31:38 Dump the side-table for tag: 0 with group count: 103 into file: file:/tmp/hive/local/24a2fae4-017e-4555-a7c5-6bc9a13419e5/hive_2016-11-02_16-31-33_691_9110187415324308596-1/-local-10002/HashTable-Stage-4/MapJoin-mapfile00--.hashtable
2016-11-02 16:31:38 Uploaded 1 File to: file:/tmp/hive/local/24a2fae4-017e-4555-a7c5-6bc9a13419e5/hive_2016-11-02_16-31-33_691_9110187415324308596-1/-local-10002/HashTable-Stage-4/MapJoin-mapfile00--.hashtable (945102 bytes)
2016-11-02 16:31:38 End of local task; Time Taken: 1.215 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/hadoop/.staging/job_1476427217749_1066/libjars/janino-2.7.6.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and 4 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(File /tmp/hadoop-yarn/staging/hadoop/.staging/job_1476427217749_1066/libjars/janino-2.7.6.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and 4 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
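The key line in the error is "could only be replicated to 0 nodes ... 4 node(s) are excluded": the NameNode sees DataNodes registered but considers none of them usable for the write (dead heartbeats, full disks, or otherwise excluded), which is consistent with the lost master-slave communication above. A quick first check before restarting is hdfs dfsadmin -report; the sketch below just pulls the live-DataNode count out of such a report. The sample text is a hypothetical excerpt in the Hadoop 2.x report format.

```shell
#!/usr/bin/env bash
# Pull the live-DataNode count out of `hdfs dfsadmin -report` output.
# On a live cluster:  hdfs dfsadmin -report | live_datanodes
live_datanodes() {
  grep -oE 'Live datanodes \([0-9]+\)' | grep -oE '[0-9]+'
}

# Demo with a hypothetical report excerpt:
sample_report="Configured Capacity: 500000000000 (465.66 GB)
Live datanodes (4):
Dead datanodes (0):"
echo "$sample_report" | live_datanodes
```

If the live count is fine (as it was here: 4 DataNodes running) but writes still fail with all nodes excluded, the problem is usually connectivity between the client/NameNode and the DataNodes rather than the DataNode processes themselves.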