Installing Nagios and Fetion on CentOS 5.5
For stability I used older, well-tested releases. Main program: nagios 3.0.6
yum -y install httpd gcc glibc glibc-common gd gd-devel
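yum -y install openssl-devel (skip this step and the nrpe build fails with "checking for SSL headers... configure: error: Cannot find ssl headers")
The two commands above install httpd, gcc, the gd library and the other dependencies; run them first.
All of the following steps are carried out on the machine that runs the Nagios main program.
Preparation:
1. Create the nagios user and group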
useradd nagios
passwd nagios (set the password)
2. Set ownership of the installation directory
mkdir -p /usr/local/nagios
chown nagios:nagios /usr/local/nagios
I. Install the Nagios main program
tar -zxvf nagios-3.0.6.tar.gz
cd nagios-3.0.6
./configure --prefix=/usr/local/nagios --with-command-group=nagios
make all
make install
make install-init
make install-config
make install-commandmode
ls /usr/local/nagios (check for the six directories etc, bin, sbin, share, var and libexec; if they are all there, the installation succeeded)
cd ..
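The directory check above can be scripted so that it fails loudly when something is missing. This is a minimal sketch; the function name check_nagios_layout is made up for illustration, and the list of six directories is taken from the step above.

```shell
#!/bin/sh
# check_nagios_layout PREFIX  (hypothetical helper, not part of Nagios)
# Succeeds only if all six directories created by the install steps
# exist under PREFIX; prints each missing one to stderr.
check_nagios_layout() {
    prefix=$1
    missing=0
    for d in etc bin sbin share var libexec; do
        if [ ! -d "$prefix/$d" ]; then
            echo "missing: $prefix/$d" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_nagios_layout /usr/local/nagios after the make steps returns 0 when all six directories are present.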
II. Install the nagios-plugins package
1. tar -zxvf nagios-plugins-1.4.9.tar.gz
cd nagios-plugins-1.4.9
./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
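ls /usr/local/nagios/libexec (a long list of plugins should be shown)
2. Add the Apache run user to the nagios group
Find the user Apache currently runs as in httpd.conf:
grep ^User /etc/httpd/conf/httpd.conf
User apache (the output)
Mine is apache, so add that user to the nagios group (-a appends to the user's existing supplementary groups instead of replacing them):
usermod -a -G nagios apache
3. Edit the Apache configuration file
vi /etc/httpd/conf/httpd.conf
Press shift+g to jump to the end of the file and add the following: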
#setting for nagios 20090325
#setting by https://www.360docs.net/doc/6914673931.html,
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>
Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>
After saving, restart Apache with /etc/init.d/httpd restart.
4. Add a web login account
/usr/bin/htpasswd -c /usr/local/nagios/etc/htpasswd pengjieyu
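New password: (enter admin)
Re-type new password: (enter admin again)
Adding password for user pengjieyu
View the contents of the account file:
less /usr/local/nagios/etc/htpasswd
pengjieyu:dBJawlMtEuqck (the user name pengjieyu comes first, followed by the hashed password)
Press q to quit less.
5. Edit cgi.cfg
Edit cgi.cfg and add the user name created above, pengjieyu, so that this account can log in through the web interface (if there are several login accounts, separate them with commas).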
vi /usr/local/nagios/etc/cgi.cfg
authorized_for_system_information=pengjieyu
authorized_for_configuration_information=pengjieyu
authorized_for_system_commands=pengjieyu
authorized_for_all_services=pengjieyu
authorized_for_all_hosts=nagiosadmin,pengjieyu
authorized_for_all_service_commands=pengjieyu
authorized_for_all_host_commands=pengjieyu
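The cgi.cfg edits above can also be applied with a small script instead of by hand. This is only a sketch: grant_cgi_user is a made-up helper name, and it assumes every authorization line has the form authorized_for_NAME=user1,user2 with no spaces around the commas. It appends the user to each such line unless the user is already listed.

```shell
#!/bin/sh
# grant_cgi_user CFG USER  (hypothetical helper)
# Appends USER to every authorized_for_* line in CFG, skipping lines
# that already list USER. All other lines are left untouched.
grant_cgi_user() {
    cfg=$1
    user=$2
    tmp=$(mktemp) || return 1
    awk -v u="$user" -F= '
        /^authorized_for_[a-z_]+=/ {
            n = split($2, names, ",")
            found = 0
            for (i = 1; i <= n; i++) if (names[i] == u) found = 1
            if (!found) $0 = $0 (($2 == "") ? "" : ",") u
        }
        { print }
    ' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

For example, grant_cgi_user /usr/local/nagios/etc/cgi.cfg pengjieyu turns authorized_for_all_hosts=nagiosadmin into authorized_for_all_hosts=nagiosadmin,pengjieyu. Back up cgi.cfg first.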
6. Register nagios as a service so that it starts with the system
chkconfig --add nagios
chkconfig nagios on
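Verify the sample configuration: /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If no errors are reported, start the Nagios service: service nagios start
7. Adjust the SELinux settings
Fedora ships with SELinux (Security-Enhanced Linux) installed and in enforcing mode by default. This causes an "Internal Server Error" message when you try to access the Nagios CGIs.
Check whether SELinux is in enforcing mode:
getenforce
Put SELinux into permissive mode:
setenforce 0
To make the change permanent, edit the setting in /etc/selinux/config and reboot.
Alternatively, instead of disabling SELinux or making it permissive permanently, label the CGI and HTML directories with a type that httpd is allowed to use, so that they keep working in enforcing mode:
chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
chcon -R -t httpd_sys_content_t /usr/local/nagios/share/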
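The permanent change in /etc/selinux/config comes down to rewriting the SELINUX= line. A minimal sketch follows; set_selinux_mode is a made-up helper name, and it assumes the file already contains a line that starts with SELINUX=.

```shell
#!/bin/sh
# set_selinux_mode CFG MODE  (hypothetical helper)
# Rewrites the SELINUX= line in CFG to MODE (enforcing, permissive or
# disabled). The SELINUXTYPE= line does not match the pattern and is kept.
set_selinux_mode() {
    cfg=$1
    mode=$2
    tmp=$(mktemp) || return 1
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

Running set_selinux_mode /etc/selinux/config permissive as root and rebooting makes permissive mode survive restarts; setenforce 0 only lasts until the next boot.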
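8. Test
Enter 192.168.0.76/nagios in the browser's address bar and type the user name and password; the Nagios web interface should appear.
That completes the configuration of the Nagios monitoring server for now. Next, configure the monitored hosts.
Switch to the monitored host. (NRPE must also be installed on the monitoring server; the procedure is the same, so I will not repeat it.)
III. Use the NRPE plugin to monitor local information on remote Linux servers
1. Add the nagios user
useradd nagios
passwd nagios
2. Install NRPE
tar -zxvf nrpe-2.12.tar.gz
cd nrpe-2.12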
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
3. Start the daemon: /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
4. Test it: /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.12 (my version)
If the NRPE version number is returned, the installation succeeded.
5. Install the xinetd script
make install-xinetd
vi /etc/xinetd.d/nrpe
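Find this line: only_from = 127.0.0.1
Change it to: only_from = 127.0.0.1 192.168.0.76 (the IP of the server that runs the Nagios main program)
6. Edit /etc/services and add the NRPE service
vi /etc/services
Press shift+g to jump to the end and add the following: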
nrpe 5666/tcp # nrpe
After saving, restart xinetd: service xinetd restart
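The /etc/services edit can be made safe to repeat so that running it twice does not add a duplicate entry. A sketch, with add_nrpe_service as a made-up helper name; the port and protocol are the ones given above.

```shell
#!/bin/sh
# add_nrpe_service [FILE]  (hypothetical helper)
# Appends the nrpe entry to FILE (default /etc/services) unless a line
# starting with "nrpe" followed by whitespace is already present.
add_nrpe_service() {
    services_file=${1:-/etc/services}
    if grep -q '^nrpe[[:space:]]' "$services_file"; then
        return 0
    fi
    printf 'nrpe\t\t5666/tcp\t\t\t# NRPE\n' >> "$services_file"
}
```

Run add_nrpe_service as root, then restart xinetd as shown above.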
Changing the Nagios main program's configuration
Now go back to the main monitoring server (192.168.0.76).
I. Edit nagios.cfg
1. vi /usr/local/nagios/etc/nagios.cfg
Find cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
Comment it out: #cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
Then add the following lines:
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroup.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/contactgroup.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg
2. Save and exit.
II. Add the hosts.cfg file
1. vi /usr/local/nagios/etc/objects/hosts.cfg
Write the following:
##########################################################################
### Define whole host for all the machines
# Define testgroup host for the testers machine
define host{
# do not leave a trailing space after master1
host_name master1
alias master51
max_check_attempts 5
contact_groups lxr
address 192.168.0.51
}
define host{
# do not leave a trailing space after master2
host_name master2
alias master52
max_check_attempts 5
contact_groups lxr
address 192.168.0.52
}
define host{
# do not leave a trailing space after master3
host_name master3
alias master53
max_check_attempts 5
contact_groups lxr
address 192.168.0.53
}
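In short, the definitions above mean:
Three monitored hosts with IP addresses 192.168.0.51 to 192.168.0.53 are added, with host names master1, master2 and master3 (these must be the machines' real, valid host names). When one of them has a problem, a notification goes to the contact group lxr (contact_groups lxr; this contact group must be defined later).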
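Because the three host blocks differ only in name, alias and address, they can be generated rather than typed. The following is a sketch; gen_hosts and its name:alias:address argument format are made up for illustration, and the emitted directives are the same ones used in the definitions above.

```shell
#!/bin/sh
# gen_hosts CONTACT_GROUP NAME:ALIAS:ADDRESS...  (hypothetical helper)
# Prints one "define host" block per NAME:ALIAS:ADDRESS argument.
gen_hosts() {
    group=$1
    shift
    for spec in "$@"; do
        name=${spec%%:*}          # text before the first colon
        rest=${spec#*:}           # text after the first colon
        host_alias=${rest%%:*}
        addr=${rest#*:}
        cat <<EOF
define host{
        host_name               $name
        alias                   $host_alias
        max_check_attempts      5
        contact_groups          $group
        address                 $addr
        }
EOF
    done
}
```

For example, gen_hosts lxr master1:master51:192.168.0.51 master2:master52:192.168.0.52 master3:master53:192.168.0.53 prints the three blocks, and the output can be redirected into hosts.cfg.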
2. Edit hostgroup.cfg
vi /usr/local/nagios/etc/objects/hostgroup.cfg
Write the following:
##############################################################################
### Define all hostgroup for the whole machine
# Define testgroup
define hostgroup{
hostgroup_name linux
alias master
members master1,master2,master3
}
The define hostgroup block means:
The host group is named linux, its alias is master, and its members are master1, master2 and master3 (host names must be separated by commas).
3. Edit services.cfg
vi /usr/local/nagios/etc/objects/services.cfg
Write the following:
define service{
host_name master1
service_description check_swap
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
notification_options w,u,c,r
notification_interval 1
notification_period 24x7
check_command check_nrpe!check_swap
contact_groups lxr
}
define service{
host_name master1
service_description check_http
check_period 24x7
max_check_attempts 5
normal_check_interval 1
retry_check_interval 1
notification_options w,u,c,r
notification_interval 1
notification_period 24x7
check_command check_nrpe!check_http!80
contact_groups lxr
}
define service{
host_name master2
service_description check_swap
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
notification_options w,u,c,r
notification_interval 1
notification_period 24x7
check_command check_nrpe!check_swap
contact_groups lxr
}
define service{
host_name master2
service_description check_http
check_period 24x7
max_check_attempts 5
normal_check_interval 1
retry_check_interval 1
notification_options w,u,c,r
notification_interval 1
notification_period 24x7
check_command check_nrpe!check_http!80
contact_groups lxr
}
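Here I only monitor the http service and swap usage on master1 and master2. To add other services, follow the pattern of the definitions above. Note that every host_name must already be defined in hosts.cfg, and every contact_groups value must be defined later in contactgroup.cfg.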
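The four service blocks above differ only in host, check name and max_check_attempts, so adding more services is easy to script. A sketch, where gen_service is a made-up helper name. It emits check_nrpe!CHECK without the trailing !80, on the assumption that the check_nrpe command is defined with a single $ARG1$ argument, as it is later in this article.

```shell
#!/bin/sh
# gen_service HOST CHECK ATTEMPTS  (hypothetical helper)
# Prints one "define service" block that runs CHECK on HOST through NRPE.
gen_service() {
    host=$1
    check=$2
    attempts=$3
    cat <<EOF
define service{
        host_name               $host
        service_description     $check
        check_period            24x7
        max_check_attempts      $attempts
        normal_check_interval   1
        retry_check_interval    1
        notification_options    w,u,c,r
        notification_interval   1
        notification_period     24x7
        check_command           check_nrpe!$check
        contact_groups          lxr
        }
EOF
}
```

Calling gen_service master3 check_swap 1 prints a block for a third host in the same shape as the ones above. The CHECK name must also exist as a command in nrpe.cfg on the monitored host.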
4. Edit contacts.cfg
vi /usr/local/nagios/etc/objects/contacts.cfg
Write the following:
###############################################################################
# CONTACTS.CFG - SAMPLE CONTACT/CONTACTGROUP DEFINITIONS
#
# Last Modified: 05-31-2007
#
# NOTES: This config file provides you with some example contact and contact
# group definitions that you can reference in host and service
# definitions.
#
# You don't need to keep these definitions in a separate file from your
# other object definitions. This has been done just to make things
# easier to understand.
#
###############################################################################
###############################################################################
###############################################################################
#
# CONTACTS
#
###############################################################################
###############################################################################
### Define contact information for all the contacter
# Define contact information for pjy
define contact{
contact_name pjy
use generic-contact
alias pjy-admin
service_notification_commands notify-service-by-email,service-notify-by-fetion ; alert by email and Fetion
host_notification_commands notify-host-by-email,host-notify-by-fetion ; alert by email and Fetion
email 77109****@https://www.360docs.net/doc/6914673931.html,
pager 1357415****
}
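Save and exit.
The file above defines the following:
A contact named pjy who is alerted by email and Fetion when a service or host has a problem. Note that the number receiving Fetion alerts must be a Fetion friend of the sending number, and vice versa; otherwise no messages will arrive.
5. Define contactgroup.cfg
vi /usr/local/nagios/etc/objects/contactgroup.cfg
Write the following: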
###############################################################################
###############################################################################
#
# CONTACT GROUPS
#
###############################################################################
###############################################################################
### Define contact group for all the whole contacter
# Define testers contact group
define contactgroup{
contactgroup_name lxr
alias pjy
members pjy
}
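This file defines the contact group. The group is named lxr and its only member is pjy (if there are several contacts, separate them with commas).
6. Edit commands.cfg
vi /usr/local/nagios/etc/objects/commands.cfg
Press shift+g to jump to the end of the file and add the following:
# 'host-notify-by-fetion' command definition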
define command {
command_name host-notify-by-fetion
command_line /usr/local/feixin/fx/fetion --mobile=1357415**** --pwd=feishu8 --to=1357415**** --msg-utf8="Host $HOSTSTATE$ alert for $HOSTNAME$! on '$LONGDATETIME$'" $CONTACTPAGER$
}
# 'service-notify-by-fetion' command definition
define command {
command_name service-notify-by-fetion
command_line /usr/local/feixin/fx/fetion --mobile=1357415**** --pwd=feishu8 --to=1357415**** --msg-utf8="$HOSTADDRESS$ $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ on $LONGDATETIME$" $CONTACTPAGER$
}
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
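Save and exit.
The 'host-notify-by-fetion' command definition sends a Fetion alert when a host fails: /usr/local/feixin/fx/fetion (the Fetion install path) --mobile=1357415**** (the number that sends the message) --pwd=feishu8 (the Fetion password of the sending number) --to=1357415**** (the mobile number that receives the message; it must be a Fetion friend of the sending number) --msg-utf8="Host $HOSTSTATE$ alert for $HOSTNAME$! on '$LONGDATETIME$'" $CONTACTPAGER$ (the message text, built from Nagios macros that carry the host state).
The 'service-notify-by-fetion' command definition sends a Fetion alert when a service fails: /usr/local/feixin/fx/fetion (the Fetion install path) --mobile=1357415**** (the number that sends the message) --pwd=feishu8 (the Fetion password of the sending number) --to=1357415**** (the mobile number that receives the message; it must be a Fetion friend of the sending number) --msg-utf8="$HOSTADDRESS$ $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ on $LONGDATETIME$" $CONTACTPAGER$ (the message text, built from Nagios macros that carry the service state).
The 'check_nrpe' command definition gives the location of the check_nrpe plugin that is used to check remote hosts.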
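To see how a service's check_command turns into the command that is actually run, the macro substitution can be imitated in a few lines of shell. This is a simplified illustration and not how Nagios itself is implemented; expand_check_command is a made-up name and only the two macros used by the check_nrpe definition above are handled. The check_command value is split on "!": the first field names the command and the second becomes $ARG1$. A third field such as the 80 in check_nrpe!check_http!80 would be $ARG2$, which this command_line never references, so it has no effect.

```shell
#!/bin/sh
# expand_check_command TEMPLATE CHECK_COMMAND ADDRESS  (hypothetical helper)
# Substitutes $HOSTADDRESS$ and $ARG1$ in TEMPLATE and prints the result.
expand_check_command() {
    template=$1      # command_line from commands.cfg
    check=$2         # check_command value from a service definition
    address=$3       # address of the host being checked
    # Field 2 of the "!"-separated value is $ARG1$ (empty if there is none).
    arg1=$(printf '%s\n' "$check" | cut -s -d'!' -f2)
    printf '%s\n' "$template" |
        sed -e 's|[$]HOSTADDRESS[$]|'"$address"'|g' \
            -e 's|[$]ARG1[$]|'"$arg1"'|g'
}
```

Passing the check_nrpe command_line, check_nrpe!check_swap and 192.168.0.51 prints a check_nrpe invocation with -H 192.168.0.51 -c check_swap, which can be run by hand on the monitoring server to debug a failing check.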
III. Verify the configuration files
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios 3.0.6
Copyright (c) 1999-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 12-01-2008
License: GPL
Reading configuration data...
Running pre-flight check on configuration data...
Checking services...
Checked 4 services.
Checking hosts...
Checked 1 hosts.
Checking host groups...
Checked 1 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 27 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
If there are no errors, restart the Nagios service: service nagios restart
Restart Apache as well while you are at it: service httpd restart
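The verify-then-restart sequence can be wrapped so that a broken configuration never reaches the restart. This sketch assumes that nagios -v exits with a non-zero status when the pre-flight check fails. The function name safe_restart and the NAGIOS_BIN, NAGIOS_CFG and RESTART_CMD variables are made up for illustration; their defaults are the paths and command used in this article.

```shell
#!/bin/sh
# Defaults taken from this article; override them in the environment.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
RESTART_CMD=${RESTART_CMD:-"service nagios restart"}

# safe_restart  (hypothetical helper)
# Runs the pre-flight check and restarts Nagios only if it passes.
safe_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        $RESTART_CMD
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}
```

When the check fails, rerun the nagios -v command by hand to read the full error output.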
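Log in to the web interface again and take a look. That completes the Nagios installation.
This is my first time working with Nagios, so parts of this write-up are rough. If anything is wrong, corrections are welcome.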