Overview
This Flume agent configuration pulls data from Kafka and writes it to HDFS in a way that avoids producing large numbers of small files.
Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1
source
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.channels = c1
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = cdh01:9092,cdh02:9092,cdh03:9092
a1.sources.r1.kafka.topics = DayFreezingDataTest
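The source above relies on Flume's default Kafka consumer group. If you prefer to pin the group explicitly, the Kafka source also accepts kafka.consumer.group.id; a minimal sketch, where the group name is only a placeholder and not part of the original setup:
# Optional: explicit consumer group (Flume's default group id is "flume");
# "flume-dayfreezing" is just an example name
a1.sources.r1.kafka.consumer.group.id = flume-dayfreezing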
channel1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
a1.channels.c1.byteCapacityBufferPercentage = 20
# Note from the official documentation:
#Maximum total bytes of memory allowed as a sum of all events in this channel. The implementation only counts the Event body, which is the reason for providing the byteCapacityBufferPercentage configuration parameter as well. Defaults to a computed value equal to 80% of the maximum memory available to the JVM (i.e. 80% of the -Xmx value passed on the command line). Note that if you have multiple memory channels on a single JVM, and they happen to hold the same physical events (i.e. if you are using a replicating channel selector from a single source) then those event sizes may be double-counted for channel byteCapacity purposes. Setting this value to 0 will cause this value to fall back to a hard internal limit of about 200 GB.
# If this value is set too small, the following error is thrown:
#Cannot commit transaction. Byte capacity allocated to store event body 640000.0reached. Please increase heap space/byte capacity allocated to the channel as the sinks may not be keeping up with the sources
a1.channels.c1.byteCapacity = 800000
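As a sanity check on the numbers above: the space available for event bodies is byteCapacity * (1 - byteCapacityBufferPercentage/100) = 800000 * 0.8 = 640000 bytes, which is exactly the figure quoted in the error message. If the sink still cannot keep up, one option is simply more headroom; the value below is only an illustration:
# Illustrative only: ~80 MB usable for event bodies after the 20% header buffer.
# Per the documentation quoted above, omitting byteCapacity instead defaults to
# 80% of the JVM heap (-Xmx).
a1.channels.c1.byteCapacity = 100000000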
sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /event/flume/kafkaToHDFS/test/DayFreezingDataTest/%y-%m-%d/%H
a1.sinks.k1.hdfs.filePrefix = event-
# Generate one directory per hour
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = hour
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.batchSize = 1000
a1.sinks.k1.hdfs.fileType = DataStream
# Roll a new file every 10 minutes or when it reaches roughly 128 MB, whichever comes first
a1.sinks.k1.hdfs.rollInterval = 600
a1.sinks.k1.hdfs.rollSize = 134217700
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.minBlockReplicas = 1
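Taken together, the roll settings are what prevent a flood of small files: rollCount = 0 disables rolling by event count, rollSize = 134217700 rolls just under the common 128 MB HDFS block size, rollInterval = 600 caps a file's lifetime at 10 minutes, and minBlockReplicas = 1 is commonly set so that HDFS under-replication checks do not trigger extra rolls. If you would rather roll on size alone, a variant could look like the sketch below (not part of the original configuration):
# Sketch: size-only rolling; 0 disables interval- and count-based rolls
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 134217700
a1.sinks.k1.hdfs.rollCount = 0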