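Overview
There are two basic ways to monitor a cluster in xCAT 2. The first is to plug monitoring software into xCAT as plug-ins; these include Ganglia, RMC, SNMP and others. The second is the xCAT Notification Infrastructure, which lets you watch for changes in the xCAT database tables and use them to monitor the cluster.
In xCAT 2 you can integrate third-party monitoring software into an xCAT cluster. The plug-in mechanism is what ties xCAT and the third-party software together. xCAT ships plug-ins for the most commonly used monitoring software: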
- xCAT (xcatmon.pm) (monitoring node status using fping. released)
- SNMP (snmpmon.pm) (snmp monitoring. released)
- RMC (rmcmon.pm) (released)
- Ganglia (gangliamon.pm) (released)
- Nagios (nagiosmon.pm)
- Performance Co-pilot (pcpmon.pm)
You can also write your own monitoring plug-in. You can pick one or more of the plug-ins listed above to monitor the xCAT cluster.
xCAT Monitoring Commands
xCAT provides the following 8 monitoring commands.
Command | Description |
---|---|
monls | list the current or all the monitoring plug-in names, their status and description. |
monadd | add a monitoring plug-in to the 'monitoring' table. This also adds the configuration scripts for the monitoring plug-in, if any, to the 'postscripts' table. |
monrm | remove a monitoring plug-in from the 'monitoring' table. It also removes the configuration scripts for the monitoring plug-in from the 'postscripts' table. |
moncfg | configure the 3rd party monitoring software on the management server and the service node for the given nodes to include the nodes into the monitoring domain. It does all the necessary configuration changes to prepare the software for monitoring the nodes. The -r option will configure the nodes as well. |
mondecfg | deconfigure the 3rd party monitoring software on the management server and the service node for the given nodes to remove the nodes from the monitoring domain. The -r option will deconfigure the nodes as well. |
monstart | start 3rd party software on the management server and the service node for the given nodes to monitor the xCAT cluster. It includes starting the daemons. The -r option will start the daemons on the nodes as well. |
monstop | stop 3rd party software on the management server and the service node for the given nodes from monitoring the xCAT cluster. The -r option will stop the daemons on the nodes as well. |
monshow | displays the events that happened on the given nodes or the monitoring data that is collected from the given nodes. |
Define monitoring servers
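If the cluster has only a few nodes, or if you prefer to monitor from the management node, you can use the management node as the monitoring server. For a large number of nodes, it is recommended to designate some nodes as collection points for monitoring information. These nodes are called monitoring servers. You can use the service nodes (sn) as monitoring servers. The monserver column of the noderes table holds the monitoring server for each node. An entry in the monserver column is a comma-separated pair of host names or IP addresses. The first one is the network interface that faces the management node; the second is the interface that faces the nodes. If monserver is not defined, the contents of the servicenode and xcatmaster columns are used instead. If all three columns are empty, the management node (mn) is used as the monitoring server.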
Figure 1. Monitoring servers for the nodes
The noderes table looks like this for the above cluster.
node | monserver | servicenode | xcatmaster |
---|---|---|---|
sn01 | | 9.114.47.227 | 9.114.47.227 |
sn02 | | 9.114.47.227 | 9.114.47.227 |
monsv02 | | 9.114.47.227 | 9.114.47.227 |
group1 | | sn01 | 192.152.101.1 |
group2 | monsv02, 192.152.101.3 | sn02 | 192.152.101.2 |
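The fallback order described above (monserver first, then servicenode and xcatmaster, then the management node) can be sketched in a few lines of Python. This is only an illustration of the rule, not xCAT's own code: the function name `resolve_monserver`, the dict-shaped row and the placeholder name `mn` are all made up for this example, and treating servicenode as the mn-facing side and xcatmaster as the node-facing side is an assumption.

```python
def resolve_monserver(row, mn="mn"):
    """Return (address facing the mn, address facing the nodes) for one row.

    `row` is a hypothetical dict with optional 'monserver', 'servicenode'
    and 'xcatmaster' keys; `mn` is a placeholder for the management node.
    """
    monserver = (row.get("monserver") or "").strip()
    if monserver:
        # monserver holds a comma-separated pair: "facing_mn, facing_nodes".
        parts = [p.strip() for p in monserver.split(",")]
        facing_mn = parts[0]
        # Assumption: if only one value is given, use it for both sides.
        facing_nodes = parts[1] if len(parts) > 1 and parts[1] else facing_mn
        return facing_mn, facing_nodes

    servicenode = (row.get("servicenode") or "").strip()
    xcatmaster = (row.get("xcatmaster") or "").strip()
    if servicenode or xcatmaster:
        # Assumption: servicenode is the mn-facing name, xcatmaster the
        # node-facing one; either fills in for the other when missing.
        return servicenode or xcatmaster, xcatmaster or servicenode

    # All three columns are empty: the management node does the monitoring.
    return mn, mn


# Illustrative rows only.
print(resolve_monserver({"monserver": "monsv02, 192.152.101.3",
                         "servicenode": "sn02",
                         "xcatmaster": "192.152.101.2"}))
print(resolve_monserver({"servicenode": "sn01", "xcatmaster": "192.152.101.1"}))
print(resolve_monserver({}))
```

The first call picks the explicit monserver pair, the second falls back to servicenode and xcatmaster, and the third ends up on the management node.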
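The rest of this post covers how to set up xcatmon, one of xCAT's monitoring plug-ins.
xcatmon monitors node status using fping on AIX and nmap on Linux. It also provides application status monitoring. The node status and the application status in the nodelist table are updated together. The node status values include:
booting, netbooting, booted, discovering, configuring, installing, ping, standingby, powering-off, noping.
appstatus is the status of the applications running on a node, for example: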
sshd=up,ftp=down,ll=down
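If you want to work with an appstatus string in a script, it splits cleanly into name and state pairs. A minimal Python sketch, assuming the value always has the `name=state` comma-separated shape shown above (the helper name `parse_appstatus` is made up for this example):

```python
def parse_appstatus(appstatus):
    """Turn 'sshd=up,ftp=down' into {'sshd': 'up', 'ftp': 'down'}."""
    status = {}
    for item in appstatus.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty fragments such as a trailing comma
        name, _, state = item.partition("=")
        status[name.strip()] = state.strip()
    return status


apps = parse_appstatus("sshd=up,ftp=down,ll=down")
# List the applications that are not reported as up.
down = [name for name, state in apps.items() if state != "up"]
print(down)  # ['ftp', 'll']
```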
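xcatmon uses the monsetting table to configure which applications to check and how to check them. Each row uses xcatmon as the name; in the row whose key is apps, the value is a comma-separated list of application names. For each application you can either configure a port on the node to probe for the application's running status, or configure a command to call that returns the status. The command can run locally or remotely on the nodes. If no applications are configured in the table, the ssh status is checked by default.
The table below shows an example of the monsetting table.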
name | key | value |
---|---|---|
xcatmon | apps | ssh,ll,gpfs,someapp |
xcatmon | gpfs | cmd=/tmp/mycmd,group=compute,group=service |
xcatmon | ll | port=9616,group=compute |
xcatmon | someapp | dcmd=/tmp/somecmd |
xcatmon | someapp2 | lcmd=/tmp/somecmd2 |
xcatmon | ping-interval | 5 |
Keywords to use:
- apps -- a list of comma separated application names whose status will be queried. For how to get the status of each application, look for the application name in the key field in a different row.
- port -- the application daemon port number, if not specified, use internal list, then /etc/services.
- group -- the name of a node group that needs to get the application status from. If not specified, assume all the nodes in the nodelist table. To specify more than one group, use group=a,group=b format.
- cmd -- the command that will be run locally on mn or sn.
- lcmd -- the command that will be run locally on the mn only.
- dcmd -- the command that will be run distributed on the nodes (xdsh <nodes> ...)
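To explain with an example: the apps row lists ssh, ll, gpfs and so on, and the row for gpfs has the value cmd=/tmp/mycmd,group=compute,group=service. The two group entries sit side by side, so the command is used to check the nodes in both the compute group and the service group.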
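To make the value format concrete, here is a minimal Python sketch that splits one monsetting value into its keywords and collects repeated group entries into a list. It is not how xcatmon itself parses the table; the function name `parse_app_setting` and the shape of the returned dict are assumptions made for this example, and it does not handle commands that contain commas.

```python
def parse_app_setting(value):
    """Parse a value such as 'cmd=/tmp/mycmd,group=compute,group=service'."""
    setting = {"groups": []}
    for item in value.split(","):
        key, sep, val = item.strip().partition("=")
        if not sep:
            continue  # skip fragments that are not key=value
        if key == "group":
            # group may be repeated: group=a,group=b
            setting["groups"].append(val)
        else:
            # port, cmd, lcmd or dcmd
            setting[key] = val
    return setting


print(parse_app_setting("cmd=/tmp/mycmd,group=compute,group=service"))
print(parse_app_setting("port=9616,group=compute"))
print(parse_app_setting("dcmd=/tmp/somecmd"))
```

An empty `groups` list corresponds to the case where no group is specified, which the keyword list describes as all the nodes in the nodelist table.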
To enable xcatmon monitoring, perform the following steps:
1. Add the monitoring plug-in in the 'monitoring' table, where 5 means that the nodes are pinged for status every 5 minutes:
monadd xcatmon -n -s ping-interval=5
2. To activate, use the monstart command.
monstart xcatmon
3. Verify monitoring was started.
monls xcatmon
xcatmon monitored node-status-monitored
4. Check the settings:
tabdump monsetting
#name,key,value,comments,disable
"xcatmon","ping-interval","5",,
5. Make sure cron jobs are activated on the mn and all monitoring servers.
crontab -l
*/5 * * * * XCATROOT=/opt/xcat PATH=/bin:/usr/bin:/sbin:/usr/sbin:/opt/xcat/bin:/opt/xcat/sbin /opt/xcat/bin/nodestat all -m -u -q
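The original document has one more example after this. I won't repeat it here. It is largely the same as the small example above, with application-check scripts added. Have a look at it yourself if you need it.
The address is: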
http://sourceforge.net/apps/mediawiki/xcat/index.php?title=Monitoring_an_xCAT_Cluster#xcatmon