Notes from 笑点低歌曲 on two errors hit while writing from Flume into Hive: Error 1, "Failed to start agent because dependencies were not found in classpath. Error follows."; Error 2, "Failed connecting to EndPoint".

Overview

Flume 1.9
CDH 5.16.2

Error 1: Failed to start agent because dependencies were not found in classpath. Error follows.

The full error:

24 Mar 2022 10:58:07,452 ERROR [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:150)
- Failed to start agent because dependencies were not found in classpath. Error follows.
java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/streaming/RecordWriter
at org.apache.flume.sink.hive.HiveSink.createSerializer(HiveSink.java:220)
at org.apache.flume.sink.hive.HiveSink.configure(HiveSink.java:203)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:453)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:106)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:145)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.streaming.RecordWriter
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 13 more

Cause: the stack trace points to a missing jar. The HCatalog streaming classes (org.apache.hive.hcatalog.streaming.*) are not on Flume's classpath.
Fix: copy hive-hcatalog-streaming.jar from $HIVE_HOME/hcatalog/share/hcatalog/ into $FLUME_HOME/lib/.
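The copy step looks like the following, assuming a standard CDH/tarball layout where $HIVE_HOME and $FLUME_HOME point at the respective installs (the exact jar version in the filename will vary):

```shell
# Put the HCatalog streaming jar on Flume's classpath.
# Paths assume a standard layout; adjust HIVE_HOME/FLUME_HOME for your cluster.
cp "$HIVE_HOME"/hcatalog/share/hcatalog/hive-hcatalog-streaming-*.jar "$FLUME_HOME/lib/"
# Restart the Flume agent afterwards so the new jar is picked up.
```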

Error 2: Failed connecting to EndPoint

The full error:

22/03/24 11:09:26 WARN hive.HiveSink: k2 : Failed connecting to EndPoint {metaStoreUri='thrift://cdh-1:9083', database='ods', table='ods_flume_log', partitionVals=[20220324] }
org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://cdh-1:9083', database='ods', table='ods_flume_log', partitionVals=[20220324] }
at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:99)
at org.apache.flume.sink.hive.HiveSink.getOrCreateWriter(HiveSink.java:346)
at org.apache.flume.sink.hive.HiveSink.drainOneBatch(HiveSink.java:297)
at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:254)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.hcatalog.streaming.StreamingException: Cannot stream to table that has not been bucketed : {metaStoreUri='thrift://cdh-1:9083', database='ods', table='ods_flume_log', partitionVals=[20220324] }
at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.<init>(AbstractRecordWriter.java:69)
at org.apache.hive.hcatalog.streaming.DelimitedInputWriter.<init>(DelimitedInputWriter.java:115)
at org.apache.flume.sink.hive.HiveDelimitedTextSerializer.createRecordWriter(HiveDelimitedTextSerializer.java:66)
at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:89)
... 6 more
22/03/24 11:09:26 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://cdh-1:9083', database='ods', table='ods_flume_log', partitionVals=[20220324] }
at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:269)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://cdh-1:9083', database='ods', table='ods_flume_log', partitionVals=[20220324] }
at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:99)
at org.apache.flume.sink.hive.HiveSink.getOrCreateWriter(HiveSink.java:346)
at org.apache.flume.sink.hive.HiveSink.drainOneBatch(HiveSink.java:297)
at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:254)
... 3 more
Caused by: org.apache.hive.hcatalog.streaming.StreamingException: Cannot stream to table that has not been bucketed : {metaStoreUri='thrift://cdh-1:9083', database='ods', table='ods_flume_log', partitionVals=[20220324] }
at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.<init>(AbstractRecordWriter.java:69)
at org.apache.hive.hcatalog.streaming.DelimitedInputWriter.<init>(DelimitedInputWriter.java:115)
at org.apache.flume.sink.hive.HiveDelimitedTextSerializer.createRecordWriter(HiveDelimitedTextSerializer.java:66)
at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:89)
... 6 more

Cause: the Flume Hive sink writes through the Hive Streaming API, which only accepts transactional (ACID) tables. In Hive 1.x an ACID table must also be bucketed and stored as ORC, which is exactly what the root cause says: "Cannot stream to table that has not been bucketed".

Fix: create a transactional table.

1. Enable transaction support.
Run the following in the Hive CLI:

SET hive.support.concurrency = true;
SET hive.enforce.bucketing = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on = true;
SET hive.compactor.worker.threads = 1;
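These SET commands only affect the current session. For a lasting setup, the same properties usually go into hive-site.xml on the Hive service side (a sketch; property names are the Hive 1.x ones used above):

```
<!-- hive-site.xml: sketch of the equivalent permanent settings (Hive 1.x names) -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```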

2. Create a partitioned, bucketed table with transactions enabled:

create table ods_flume_log(line string)
partitioned by (dt string)
clustered by (line) into 1 buckets
stored as orc tblproperties ('transactional'='true');

Explanation:
partitioned by (dt string) - declares the partition column
clustered by (line) - declares the bucketing column
stored as orc - ACID tables must be stored as ORC
tblproperties ('transactional'='true') - tblproperties sets table-level Hive properties; this one enables transactions

My table has only one column and is already partitioned by date, so a single bucket is enough here; size the bucket count for your own data.
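For reference, a minimal Flume sink sketch targeting this table. The agent name a1 and the memory channel are assumptions; the sink name k2 and metastore URI are taken from the error log above, and the serializer field list must match your data (the source definition is omitted):

```
# Sketch of a Flume 1.9 Hive sink for the ods.ods_flume_log table.
# Agent name a1 is assumed; k2 and the metastore URI come from the log above.
a1.channels = c1
a1.channels.c1.type = memory

a1.sinks = k2
a1.sinks.k2.type = hive
a1.sinks.k2.channel = c1
a1.sinks.k2.hive.metastore = thrift://cdh-1:9083
a1.sinks.k2.hive.database = ods
a1.sinks.k2.hive.table = ods_flume_log
# Partition value for the dt column, e.g. 20220324
a1.sinks.k2.hive.partition = %Y%m%d
a1.sinks.k2.serializer = DELIMITED
a1.sinks.k2.serializer.fieldnames = line
```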

After restarting Flume, data is written into Hive normally.
