I have successfully installed Nagios and added the following hosts and services:
nano /usr/local/nagios/etc/objects/services.cfg
define host {
    use        linux-server
    host_name  dcctst1e
    address    160.81.38.74
}
define host {
    use        linux-server
    host_name  dcctst1f
    address    160.81.38.75
}
define host {
    use        linux-server
    host_name  dcctst1g
    address    160.81.38.76
}
define hostgroup {
    hostgroup_name  ganglia-servers
    alias           nagios server
    members         *
}
define servicegroup {
    servicegroup_name  ganglia-metrics
    alias              Ganglia Metrics
}
define command {
    command_name  check_ganglia
    command_line  $USER1$/check_ganglia.py -h $HOSTNAME$ -m $ARG1$ -w $ARG2$ -c $ARG3$ -o $ARG4$
}
define service {
    use                  ganglia-service
    service_description  Free Memory
    check_command        check_ganglia!mem_free!200!50!1
    contact_groups       admins
}
define service {
    use                  ganglia-service
    service_description  load_one
    check_command        check_ganglia!load_one!4!5!0
    contact_groups       admins
}
define service {
    use                  ganglia-service
    service_description  disk_free
    check_command        check_ganglia!disk_free!40!50!0
    contact_groups       admins
}
define service {
    use                  ganglia-service
    service_description  yarn.NodeManagerMetrics.AvailableGB
    check_command        check_ganglia!yarn.NodeManagerMetrics.AvailableGB!8!4!1
    contact_groups       admins
}
define service {
    use                  ganglia-service
    service_description  Temperature
    check_command        check_ganglia!!8!4!1
    contact_groups       admins
}
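To make it easier to compare what Nagios runs with what I run by hand, here is a rough sketch of how a check_command value is expanded into the command_line above. It is a simplification, not Nagios's actual macro processor (no escaping or quoting rules), and the value used for $USER1$ is an assumption based on the libexec path in this post; the real value comes from resource.cfg.

```python
# Simplified model of Nagios macro expansion for the check_ganglia command.
# Assumption: $USER1$ points at /usr/local/nagios/libexec (set in resource.cfg).
COMMAND_LINE = (
    "$USER1$/check_ganglia.py -h $HOSTNAME$ -m $ARG1$ "
    "-w $ARG2$ -c $ARG3$ -o $ARG4$"
)


def expand(check_command, hostname, user1="/usr/local/nagios/libexec"):
    """Return the command line for a 'name!arg1!arg2...' check_command."""
    # The first field is the command name; the rest become $ARG1$, $ARG2$, ...
    _name, *args = check_command.split("!")
    line = COMMAND_LINE.replace("$USER1$", user1).replace("$HOSTNAME$", hostname)
    for index, value in enumerate(args, start=1):
        line = line.replace("$ARG%d$" % index, value)
    return line


if __name__ == "__main__":
    print(expand("check_ganglia!disk_free!40!50!0", "dcctst1f"))
    # The Temperature service has an empty first argument, so -m gets no value.
    print(expand("check_ganglia!!8!4!1", "dcctst1e"))
```

Under this model the disk_free service passes -o 0, and the Temperature service passes an empty metric name to -m.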
As the screenshot below shows, /usr/local/nagios/libexec/ contains some of the related plugins (such as disk_free, load_one, etc.). However, checking the disk_free service produces an error:
$ python check_ganglia.py
Usage: check_ganglia -h|--host= -m|--metric= -w|--warning= -c|critical= [-o|--opposite=] [-s|--server=] [-p|--port=]
$ python check_ganglia.py -h dcctst1f -m disk_free -w 40 -c 50 -o 1
CHECKGANGLIA UNKNOWN: Error while getting value "no element found: line 1, column 0"
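Why does this happen, and what can I do to make it work?
I have also attached a screenshot of the services in the web interface; I expected 3 of the services to be green once the plugin was installed.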
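For reference, the text "no element found: line 1, column 0" matches what Python's expat-based XML parsers report when they are handed empty input. The sketch below reproduces that message. It assumes, without confirmation from the plugin source, that check_ganglia.py parses XML returned by a Ganglia daemon; the GANGLIA_XML element is only an illustrative payload.

```python
import xml.etree.ElementTree as ET


def parse_error_for(payload):
    """Return the XML parser's error message for payload, or None if it parses."""
    try:
        ET.fromstring(payload)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Empty input, e.g. a connection that returned no data at all.
    print(parse_error_for(""))
    # A well-formed (illustrative) document parses without error.
    print(parse_error_for("<GANGLIA_XML></GANGLIA_XML>"))
```

If that assumption holds, the error would point at the plugin receiving an empty response rather than at the thresholds passed with -w and -c.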
P.S.
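The Nagios web interface reported this error:
(No output on stdout) stderr: /usr/local/nagios/libexec/check_ganglia.py: line 4: import: command not found
so I commented out line 4 of check_ganglia.py, which is import sys. When I ran the command again, it failed with:
NameError: name 'sys' is not defined
So, obviously, I cannot comment out import sys.
How can I solve this?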
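A message of the form "line 4: import: command not found" is what a shell prints when it, rather than Python, is asked to execute a Python file. That usually means the file's first line is not a valid "#!" interpreter line. The helper below is a hypothetical check (has_python_shebang is not part of Nagios or the plugin) that could be pointed at check_ganglia.py; the demo uses temporary files so it runs anywhere.

```python
import os
import tempfile


def has_python_shebang(path):
    """True if the first line of path starts with '#!' and names python."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    return first_line.startswith(b"#!") and b"python" in first_line


def _write_temp(content):
    """Write content to a temporary file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "wb") as handle:
        handle.write(content)
    return path


if __name__ == "__main__":
    with_shebang = _write_temp(b"#!/usr/bin/env python\nimport sys\n")
    without_shebang = _write_temp(b"\n\n\nimport sys\n")
    print(has_python_shebang(with_shebang))
    print(has_python_shebang(without_shebang))
    os.remove(with_shebang)
    os.remove(without_shebang)
```

Running the plugin by hand as "python check_ganglia.py" bypasses the interpreter line entirely, which would explain why the import only fails when Nagios executes the file directly.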