Configuring Flume to Monitor File Contents: Use Case and Procedure


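I am 故意香水, a blogger on 靠谱客. This article describes how to configure Flume to monitor the contents of a file, covering the use case and the procedure. I hope it serves as a useful reference.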

Original article: 设置Flume监听文件内容 (Configuring Flume to Monitor File Contents)

Use Case

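The post "Hadoop完全分布式安装Flume" (installing Flume on a fully distributed Hadoop cluster) tested Flume monitoring a directory: as soon as a file is added to the directory, Flume collects it. That is one use case. But what if we want to collect the contents of a file instead? For example, suppose there is a file in a Linux directory to which new content is continually appended. How can that content be written to HDFS in real time?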

Procedure

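In the post "Hadoop完全分布式安装Flume", Flume monitors a directory, and any file added to that Linux directory is automatically collected into the HDFS directory. To monitor the contents of a specific file instead, so that data appended to the file is collected, change flume-conf.properties to the following and leave everything else unchanged.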

# cd /opt/flume1.7.0/conf
# vim flume-conf.properties

# flume-conf.properties: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
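# Describe/configure the source: run tail -F and turn each output line into an event
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /opt/log/exec.text
# The next two properties are carried over from the directory-monitoring setup;
# they are not among the documented exec source properties and are likely ignored.
a1.sources.r1.fileHeader = true
a1.sources.r1.deserializer.outputCharset = UTF-8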
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://hadoop0:9000/log
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.maxOpenFiles = 1
# Disable rolling by event count and by time; roll only when a file reaches rollSize bytes
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 1000000
a1.sinks.k1.hdfs.batchSize = 100000
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 100000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
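
Because the source runs tail -F on /opt/log/exec.text, it helps if that file already exists and is readable before the agent starts. The following is a minimal sketch of such a check; prepare_log_file is a hypothetical helper name introduced here for illustration and is not part of Flume.

```shell
#!/bin/sh
# prepare_log_file: hypothetical helper (not part of Flume).
# Creates the parent directory and the file if they are missing,
# then reports success only if the file is readable.
prepare_log_file() {
    log_file="$1"
    mkdir -p "$(dirname "$log_file")" || return 1
    # touch creates the file when absent and leaves existing content untouched
    touch "$log_file" || return 1
    [ -r "$log_file" ]
}

# Usage, with the path taken from the configuration above:
# prepare_log_file /opt/log/exec.text
```

Run the helper as the same user that will start the Flume agent, so that the readability check reflects the permissions the agent will actually have.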

# cd /opt/flume1.7.0/
# bin/flume-ng agent --conf conf --conf-file conf/flume-conf.properties --name a1 -Dflume.root.logger=INFO,console
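
To check the setup, one option is to append a few lines to the monitored file from a second terminal while the agent is running, and then look at the HDFS directory. The sketch below assumes the paths from the configuration above; append_test_lines is a hypothetical helper for testing and is not part of Flume.

```shell
#!/bin/sh
# append_test_lines: hypothetical test helper (not part of Flume).
# Usage: append_test_lines FILE COUNT
# Appends COUNT timestamped lines to FILE, one at a time.
append_test_lines() {
    target="$1"
    count="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        # >> appends, so tail -F picks up each line as new content
        printf '%s test line %d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$i" >> "$target"
        i=$((i + 1))
    done
}

# Example, in a second terminal while the agent is running:
# append_test_lines /opt/log/exec.text 10
# hdfs dfs -ls /log
# hdfs dfs -cat '/log/FlumeData.*'
```

FlumeData is the HDFS sink's default file prefix, since the configuration does not set hdfs.filePrefix. With rolling by count and by time disabled, the sink keeps writing to one in-progress file (shown with a .tmp suffix) until it reaches roughly rollSize bytes, so a small test will probably appear only in that .tmp file.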

Published: 2023-12-13 02:35:14
