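My server room has no temperature alarm, so I use this method to keep an eye on the room temperature. If only one host raises an alert, the fault is probably in that machine; if several hosts alert at the same time, the room's air conditioning has probably failed. The steps are as follows: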
Environment: monitored host running CentOS 6.3
1. Install the hardware sensor monitoring package, lm_sensors
#yum install lm_sensors*
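2. Run sensors-detect to detect the sensors
#sensors-detect ## Just press Enter at every prompt. This step failed for me in a virtual machine but worked without problems on physical hardware.
3. Run sensors to check that it can read data. Output like the following means it is working: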
[root@rd02 ~]# sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +32.0°C (high = +76.0°C, crit = +100.0°C)
Core 1: +32.0°C (high = +76.0°C, crit = +100.0°C)
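The plugin script in step 4 picks the readings by line position (head/tail), which only works when Core 0 and Core 1 sit on lines 3 and 4 of the output. As a minimal sketch of an alternative, the readings can be selected by their "Core N:" label instead. The function names core_temps and average are hypothetical, and the sketch assumes the output format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print one integer temperature per "Core N:" line read from stdin.
core_temps() {
    # Field 3 looks like "+32.0°C"; strip the leading "+", then keep the integer part.
    awk '/^Core [0-9]+:/ { t = $3; sub(/^\+/, "", t); print int(t) }'
}

# Hypothetical helper: average the integers read from stdin (result truncated to an integer).
# Prints nothing when there is no input.
average() {
    awk '{ sum += $1; n++ } END { if (n > 0) print int(sum / n) }'
}

# Usage on a real host:
#   sensors | core_temps | average
```

Selecting by label keeps working if the machine has more than two cores or if extra lines appear above the core readings.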
4. #vi /usr/local/nagios/libexec/check_cputemp ## paste in the content between the two lines of # characters below
##########################################################
#!/bin/sh
#########check_cputemp###########
#date : May 2011
#Licence GPLv2
#INSTALLATION
#this script requires lm_sensors to be installed
#the output of sensors must look like the format below
#########################################
#coretemp-isa-0000#
#Adapter: ISA adapter#
#Core 0: +27°C (high = +85°C)#
#
#coretemp-isa-0001#
#Adapter: ISA adapter#
#Core 1: +25°C (high = +85°C) #
#########################################
#you can use NRPE to define service in nagios
#check_nrpe!check_cputemp
# Plugin return statements
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
print_help_msg(){
$Echo "Usage: $0 -h to get help."
}
print_full_help_msg(){
$Echo "Usage:"
$Echo "$0 [ -v ] -m sensors -w cpuT -c cpuT"
$Echo "Specify the method used to read the temperature data: sensors."
$Echo "The Critical value must be greater than the Warning value."
$Echo "Example:"
$Echo "${0} -m sensors -w 40 -c 50"
}
print_err_msg(){
$Echo "Error."
print_full_help_msg
}
to_debug(){
if [ "$Debug" = "true" ]; then
$Echo "$*" >> /var/log/check_sys_temperature.log.$$ 2>&1
fi
}
unset LANG
Echo="echo -e"
if [ $# -lt 1 ]; then
print_help_msg
exit 3
else
while getopts :vhm:w:c: OPTION
do
case $OPTION
in
v)
#$Echo "Verbose mode."
Debug=true
;;
m)
method=$OPTARG
;;
w)
WARNING=$OPTARG
;;
c)
CRITICAL=$OPTARG
;;
h)
print_full_help_msg
exit 3
;;
?)
$Echo "Error: Illegal Option."
print_help_msg
exit 3
;;
esac
done
if [ "$method" = "sensors" ]; then
use_sensors="true"
to_debug use_sensors
else
$Echo "Error. You must specify the method: sensors."
print_full_help_msg
exit 3
fi
to_debug All Values are \" Warning: "$WARNING" and Critical: "$CRITICAL" \".
fi
#########lm_sensors##################
if [ "$use_sensors" = "true" ]; then
sensorsCheckOut=`which sensors 2>&1`
if [ $? -ne 0 ];then
echo $sensorsCheckOut
echo Maybe you need to check your sensors.
exit 3
fi
to_debug Use $sensorsCheckOut to check system temperature
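# Lines 3 and 4 of the sensors output are expected to be Core 0 and Core 1.
# Keep only the integer part of each reading (e.g. +32.0°C -> 32), so that
# one-digit and three-digit temperatures are handled as well.
TEMP1=`sensors | head -3 | tail -1 | gawk '{print $3}' | grep -o '[0-9][0-9]*' | head -1`
TEMP2=`sensors | head -4 | tail -1 | gawk '{print $3}' | grep -o '[0-9][0-9]*' | head -1`
# Check for missing data before the arithmetic; an empty operand would abort the script.
if [ -z "$TEMP1" ] || [ -z "$TEMP2" ] ; then
$Echo "No data was read. Please confirm your arguments, re-run in verbose mode, then check the log."
exit 3
fi
SUM=$(( $TEMP1 + $TEMP2 ))
TEMP=$(($SUM/2))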
to_debug temperature data is $TEMP
else
$Echo "Error. You must specify the method: sensors."
print_full_help_msg
exit 3
fi
######### Comparison with the warning and critical thresholds given by the user ############
CPU_TEMP=$TEMP
#if [ "$WARNING" != "0" ] || [ "$CRITICAL" != "0" ]; then
if [ "$CPU_TEMP" -gt "$CRITICAL" ] && [ "$CRITICAL" != "0" ]; then
STATE="$STATE_CRITICAL"
STATE_MESSAGE="CRITICAL"
to_debug $STATE , Message is $STATE_MESSAGE
elif [ "$CPU_TEMP" -gt "$WARNING" ] && [ "$WARNING" != "0" ]; then
STATE="$STATE_WARNING"
STATE_MESSAGE="WARNING"
to_debug $STATE , Message is $STATE_MESSAGE
else
STATE="$STATE_OK"
STATE_MESSAGE="OK"
to_debug $STATE , Message is $STATE_MESSAGE
fi
echo "The TEMPERATURE $STATE_MESSAGE - The CPU's Temperature is $CPU_TEMP ℃ !"
exit $STATE
##########################################################
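The threshold comparison at the end of the script can also be read in isolation. The following is a minimal sketch of the same logic as a standalone function that additionally rejects non-numeric input, which the script above does not do. The function name temp_state is hypothetical and is not part of the original plugin.

```shell
#!/bin/sh
# Hypothetical helper: map a temperature and two thresholds to a Nagios state.
# Prints the state name and returns the matching plugin exit code
# (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
temp_state() {
    temp=$1 warn=$2 crit=$3
    # Reject anything that is empty or not a non-negative integer.
    for v in "$temp" "$warn" "$crit"; do
        case $v in
            ''|*[!0-9]*) echo "UNKNOWN"; return 3 ;;
        esac
    done
    # As in the script above, a threshold of 0 disables that level.
    if [ "$crit" -ne 0 ] && [ "$temp" -gt "$crit" ]; then
        echo "CRITICAL"; return 2
    elif [ "$warn" -ne 0 ] && [ "$temp" -gt "$warn" ]; then
        echo "WARNING"; return 1
    fi
    echo "OK"; return 0
}

# Usage:
#   temp_state 40 38 45    # prints WARNING, returns 1
```

Returning UNKNOWN for bad input matters because the -gt test otherwise fails with a shell error when -w or -c is missing.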
5. Make the script executable:
#chmod +x /usr/local/nagios/libexec/check_cputemp
6. Edit nrpe.cfg and add the following line:
command[check_cputemp]=/usr/local/nagios/libexec/check_cputemp -m sensors -w 38 -c 45
Note: all six steps above are performed on the monitored host.
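Before moving on to the Nagios server, it can help to run the check by hand and look at its exit code, since Nagios decides the service state from that code. The wrapper below is a minimal sketch; the function name run_check, the location of check_nrpe on the server, and the address 192.0.2.10 are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a plugin command line, let its output through,
# then report the exit code together with the Nagios state it stands for.
run_check() {
    rc=0
    "$@" || rc=$?
    case $rc in
        0) name=OK ;;
        1) name=WARNING ;;
        2) name=CRITICAL ;;
        *) name=UNKNOWN ;;   # 3 and anything unexpected are treated as UNKNOWN here
    esac
    echo "exit code $rc ($name)"
    return "$rc"
}

# On the monitored host:
#   run_check /usr/local/nagios/libexec/check_cputemp -m sensors -w 38 -c 45
# From the Nagios server, through NRPE (path and address are assumptions):
#   run_check /usr/local/nagios/libexec/check_nrpe -H 192.0.2.10 -c check_cputemp
```

If the local run works but the NRPE run does not, the problem is more likely in nrpe.cfg or in the permissions of the user that the NRPE daemon runs as than in the plugin itself.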
7. Configure the service on the Nagios server:
define service{
use generic-service
host_name
service_description CPU Temperature
check_command check_nrpe!check_cputemp
}
Save the file, then restart the nagios service.
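Once several hosts report this service, the rule of thumb from the introduction (one alert means a single-machine fault, several alerts at once point to the air conditioning) can be sketched as a small filter. The function name diagnose and the "host:STATE" input format are hypothetical; how the per-host states are collected is left open here.

```shell
#!/bin/sh
# Hypothetical helper: read "host:STATE" lines on stdin and apply the rule of thumb.
diagnose() {
    awk -F: '
        $2 != "OK" { n++ }   # count hosts whose CPU Temperature service is not OK
        END {
            if (n == 0)      print "no alerts"
            else if (n == 1) print "one host alerting: likely a single-machine fault"
            else             print n " hosts alerting: check the room air conditioning"
        }'
}

# Usage:
#   printf 'web01:OK\nweb02:WARNING\nweb03:CRITICAL\n' | diagnose
```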
Article title: Nagios Monitoring Platform, Part 5: Monitoring the CPU Temperature of Linux Hosts
Article URL: http://fisionsoft.com.cn/article/jhohse.html