Overview

Processor overview

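  • A Processor is the Flume component that implements failover and load balancing across sinks.
  • In production deployments, multiple client Agents usually collect data and send it to a central server Agent. That central Agent carries the traffic of every client Agent, so its load is high, and a lone central Agent is a single point of failure.
  • For that reason there is usually more than one central server Agent, with several working together, and the question becomes how a client Agent should distribute its data among them.
  • The answer is to configure one Sink per central server on the client Agent, put those Sinks into a sink group, and configure a Processor for the group, specifying its type and other parameters.
    All of these Sinks are attached to the same Channel; the Processor then decides, according to its rules, which Sink in the group takes the next events from that Channel.
  • A Processor works in one of two modes: failover or load balancing.
    Failover Sink Processor
    processor.type = failover
    Several sinks are configured, but only one is active at a time; another takes over only when the active sink fails.
    Which sink goes first is decided by priority: the sink with the highest priority is used first.
    In a failover setup the two sinks are often of different types (for example an HDFS sink and a file sink).
    For instance, if events are being written to HDFS and HDFS goes down, they are written to a file instead, so no data is lost.
    Load balancing Sink Processor
    processor.type = load_balance
    processor.selector = round_robin | random
    A sink group uses either load balancing or failover, not both at once; load balancing is the more common choice.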
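
The topology described above can be sketched as a client Agent configuration. This is only an illustration: the agent name a1, the host names collector1.example.com and collector2.example.com, the port 4545 and the log path are assumptions, not values from the original post. Each central server gets its own Avro sink, both sinks read from the same channel, and the sink group's processor chooses between them.

```properties
# Client agent a1 (names, hosts, ports and paths are illustrative assumptions)
a1.sources = r1
a1.channels = c1
a1.sinks = k1 k2

# Source: tail an application log
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# One Avro sink per central server Agent, both reading from the same channel
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = collector1.example.com
a1.sinks.k1.port = 4545
a1.sinks.k1.channel = c1

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = collector2.example.com
a1.sinks.k2.port = 4545
a1.sinks.k2.channel = c1

# Group the sinks and let the processor spread events across them
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
```

Each central server Agent would run an Avro source listening on the matching port. Changing processor.type to failover and adding processor.priority.k1 and processor.priority.k2 would turn the same wiring into an active/standby pair.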

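Author: 一个专注的小白
Source: CSDN
Original: https://blog.csdn.net/weixin_43652369/article/details/84646632
Copyright notice: this is an original article by the blogger; please include a link to the original post when reposting.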

Flume Sink Processors

Sink groups allow users to group multiple sinks into one entity. Sink processors can be used to provide load balancing over all sinks inside the group, or to fail over from one sink to another in case of temporary failure.
Required properties are marked with an asterisk (*).

Property Name      Default   Description
sinks *            -         Space-separated list of sinks that are participating in the group
processor.type *   default   The component type name; must be default, failover or load_balance

Example for agent named a1:

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance

Default Sink Processor

Default sink processor accepts only a single sink. Users are not required to create a processor (sink group) for a single sink; they can instead follow the plain source - channel - sink pattern described in the Flume User Guide.
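
A minimal sketch of that plain source - channel - sink pattern, with no sink group at all, might look like the following. The agent name a1, the component names and the netcat port are assumptions chosen for illustration.

```properties
# Single sink, no sink group needed (agent and component names are illustrative)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

# With no a1.sinkgroups entry, k1 is driven by the default sink processor
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```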

Failover Sink Processor

Failover Sink Processor maintains a prioritized list of sinks, guaranteeing that events will be processed (delivered) as long as at least one sink is available.
The failover mechanism works by relegating failed sinks to a pool where they are assigned a cool-down period, which increases with each consecutive failure, before they are retried. Once a sink successfully sends an event, it is restored to the live pool. Each sink has a priority associated with it: the larger the number, the higher the priority. If a sink fails while sending an event, the available sink with the next-highest priority is tried next. For example, a sink with priority 100 is activated before a sink with priority 80. If no priority is specified, the priority is determined by the order in which the sinks are listed in the configuration.
To configure, set the sink group's processor to failover and set priorities for all individual sinks. All specified priorities must be unique. An upper limit on the backoff (cool-down) period, in milliseconds, can be set with the maxpenalty property.
Required properties are marked with an asterisk (*).

Property Name                     Default   Description
sinks *                           -         Space-separated list of sinks that are participating in the group
processor.type *                  default   The component type name; must be failover
processor.priority.<sinkName> *   -         Priority value. <sinkName> must be one of the sink instances associated with the current sink group. A sink with a higher priority value is activated earlier; a larger absolute value indicates higher priority
processor.maxpenalty              30000     The maximum backoff period for a failed sink (in milliseconds)

Example for agent named a1:

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000
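
The HDFS-with-file-fallback scenario mentioned in the overview could be wired up roughly as follows. This is a sketch under stated assumptions: agent a1 is assumed to already define a source and a channel c1, and the HDFS URL and local directory are hypothetical. The file sink is assumed to be Flume's File Roll Sink (type file_roll).

```properties
# Sketch only: assumes agent a1 already defines a source and a channel c1.
# The HDFS URL and the local directory are hypothetical.
a1.sinks = k1 k2

# Primary: write events to HDFS
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode.example.com:8020/flume/events
a1.sinks.k1.channel = c1

# Fallback: roll events into local files while HDFS is unavailable
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /var/flume/fallback
a1.sinks.k2.channel = c1

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
# Higher number wins: k1 (HDFS) is tried first, k2 only when k1 has failed
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
a1.sinkgroups.g1.processor.maxpenalty = 10000
```

Once k1 delivers successfully again after its cool-down, it returns to the live pool and, having the higher priority, becomes the active sink.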

Load balancing Sink Processor

Load balancing sink processor provides the ability to load-balance flow over multiple sinks. It maintains an indexed list of active sinks over which the load is distributed. The implementation supports distributing load using either the round_robin or the random selection mechanism. The selection mechanism defaults to round_robin, but can be overridden via configuration. Custom selection mechanisms are supported via custom classes that inherit from AbstractSinkSelector.
When invoked, the selector picks the next sink using its configured selection mechanism and invokes it. With round_robin and random, if the selected sink fails to deliver the event, the processor picks the next available sink via the same mechanism. Without backoff, this implementation does not blacklist the failing sink and instead continues to optimistically attempt every available sink. If all sink invocations fail, the selector propagates the failure to the sink runner.
If backoff is enabled, the sink processor blacklists sinks that fail, removing them from selection for a given timeout. When the timeout ends, if the sink is still unresponsive, the timeout is increased exponentially to avoid getting stuck in long waits on unresponsive sinks. With backoff disabled, in round_robin mode the load of every failed sink is passed to the next sink in line, so the load is not evenly balanced.
Required properties are marked with an asterisk (*).

Property Name                  Default       Description
sinks *                        -             Space-separated list of sinks that are participating in the group
processor.type *               default       The component type name; must be load_balance
processor.backoff              false         Whether failed sinks should be backed off exponentially
processor.selector             round_robin   Selection mechanism. Must be round_robin, random, or the FQCN of a custom class that inherits from AbstractSinkSelector
processor.selector.maxTimeOut  30000         Used by backoff selectors to limit exponential backoff (in milliseconds)

Example for agent named a1:

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = random
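
As a variation, a round_robin group with backoff and a shorter backoff cap might be configured like this. The three sinks k1, k2 and k3 are assumed to be defined elsewhere in agent a1, and the 10-second cap is an arbitrary illustrative value.

```properties
# Sketch: three sinks k1 k2 k3 are assumed to be defined elsewhere in agent a1
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2 k3
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
# Temporarily drop failing sinks from rotation instead of retrying them on every pass
a1.sinkgroups.g1.processor.backoff = true
# Cap the exponential backoff at 10 seconds (the default is 30000 ms)
a1.sinkgroups.g1.processor.selector.maxTimeOut = 10000
```

Enabling backoff here avoids the uneven distribution described above, where a failed sink's share of events always lands on the next sink in line.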
