Nagios plug-in development guidelines

Nagios Plugins Development Team

Copyright © 2000-2009 Nagios Plugins Development Team

Table of Contents

Preface

1. Development platform requirements

2. Plugin Output for Nagios

2.1. Print only one line of text

2.2. Verbose output

2.3. Screen Output

2.4. Plugin Return Codes

2.5. Threshold and ranges

2.6. Performance data

2.7. Translations

3. System Commands and Auxiliary Files

3.1. Don't execute system commands without specifying their full path

3.2. Use spopen() if external commands must be executed

3.3. Don't make temp files unless absolutely required

3.4. Don't be tricked into following symlinks

3.5. Validate all input

4. Perl Plugins

5. Runtime Timeouts

5.1. Use DEFAULT_SOCKET_TIMEOUT

5.2. Add alarms to network plugins

6. Plugin Options

6.1. Option Processing

6.2. Plugins with more than one type of threshold, or with threshold ranges

7. Test cases

7.1. Test cases for plugins

7.2. Testing the C library functions

8. Coding guidelines

8.1. C coding

8.2. Crediting sources

8.3. CVS comments

8.4. Translations for developers

8.5. Translations for translators

9. Submission of new plugins and patches

9.1. Patches

9.2. Contributed plugins

9.3. New plugins

List of Tables

1. Verbose output levels

2. Plugin Return Codes

3. Example ranges

4. Command line examples

Preface

The purpose of these guidelines is to provide a reference for plug-in developers and to encourage the standardization of the different kinds of plug-ins: C, shell, Perl, Python, etc.

Nagios Plug-in Development Guidelines Copyright (C) 2000-2009 (Nagios Plugins Team) Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

The plugins themselves are copyrighted by their respective authors.

1. Development platform requirements

Nagios plugins are developed to the GNU standard, so any OS which is supported by GNU should run the plugins. While the requirements for compiling the Nagios plugins release are very basic, developing from the Git repository requires additional software to be installed. These are the minimum levels of software required:

GNU make 3.79

GNU automake 1.9.2

GNU autoconf 2.59

GNU m4 1.4.2

GNU libtool 1.5

To compile from Git, after you have cloned the repository, run:

tools/setup

./configure

make

make install

2. Plugin Output for Nagios

You should always print something to STDOUT that tells if the service is working or why it is failing. Try to keep the output short - probably fewer than 80 characters. Remember that you ideally would like the entire output to appear in a pager message, which will get chopped off after a certain length.

As Nagios does not capture STDERR output, you should only output to STDOUT and not print to STDERR.

2.1. Print only one line of text

Nagios will only grab the first line of text from STDOUT when it notifies contacts about potential problems. If you print multiple lines, you're out of luck (though this will be a feature of Nagios 3). Remember, keep your output short and to the point.

Output should be in the format:

SERVICE STATUS: Information text

However, note that this is not a requirement of the API, so you cannot depend on this being an accurate reflection of the status of the service - the status should always be determined by the return code.

2.2. Verbose output

Use the -v flag for verbose output. You should allow multiple -v options for additional verbosity, up to a maximum of 3. The standard type of output should be:

Table 1. Verbose output levels

2.3. Screen Output

The plug-in should print the diagnostic and just the usage part of the help message. A well-written plugin would then have --help as a way to get the verbose help.

Code and output should try to respect the 80x25 size of a CRT (remember when fixing stuff in the server room!).

2.4. Plugin Return Codes

The return codes below are based on the POSIX spec of returning a positive value. Netsaint prior to v0.0.7 supported a non-POSIX-compliant return code of "-1" for unknown. Nagios supports POSIX return codes by default.

Note: Some plugins will on occasion print on STDOUT that an error occurred and the error code is 138 or 255 or some such number. These are usually caused by plugins using system commands without enough checks to catch unexpected output. Developers should include a default catch-all for system command output that returns an UNKNOWN return code.

Table 2. Plugin Return Codes
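The output format and the catch-all advice above can be sketched in Python as follows. This is only an illustration, not code from the plugin distribution: the function names, the "EXAMPLE" service name and the numeric values 0 to 3 (the conventional Nagios codes for OK, WARNING, CRITICAL and UNKNOWN) are assumptions made for the example.

```python
# Conventional Nagios return codes (assumed for this sketch).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING",
                CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def format_status(service, code, text):
    """Build the single 'SERVICE STATUS: Information text' output line."""
    return "%s %s: %s" % (service, STATUS_NAMES[code], text)


def run_check(check, service="EXAMPLE"):
    """Run check(), which returns (code, text); map any failure to UNKNOWN."""
    try:
        code, text = check()
        if code not in STATUS_NAMES:
            # Catch-all for stray codes such as 138 or 255.
            code, text = UNKNOWN, "unexpected return code %r" % (code,)
    except Exception as exc:
        # Catch-all for anything the check did not anticipate.
        code, text = UNKNOWN, "check failed: %s" % exc
    return code, format_status(service, code, text)
```

A plugin built on this would print the returned line to STDOUT and pass the returned code to sys.exit(), so that the return code, not the text, carries the status.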

2.5. Threshold and ranges

A range is defined as a start and end point (inclusive) on a numeric scale (possibly negative or positive infinity).

A threshold is a range with an alert level (either warning or critical). Use the set_thresholds(thresholds *, char *, char *) function to set the thresholds.

The theory is that the plugin will do some sort of check which returns a numerical value, or metric, which is then compared to the warning and critical thresholds. Use the get_status(double, thresholds *) function to compare the value against the thresholds.

This is the generalised format for ranges:

[@]start:end

Notes:

1. start ≤ end

2. start and ":" is not required if start=0

3. if range is of format "start:" and end is not specified, assume end is infinity

4. to specify negative infinity, use "~"

5. alert is raised if metric is outside start and end range (inclusive of endpoints)

6. if range starts with "@", then alert if inside this range (inclusive of endpoints)

Note: Not all plugins are coded to expect ranges in this format yet. There will be some work in providing multiple metrics.

Table 3. Example ranges

Table 4. Command line examples
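The range notes above can be read as a small parser. The following Python sketch is one possible interpretation, not the set_thresholds()/get_status() code from utils.c; the names parse_range, alerts and check_value are hypothetical, an empty start before ":" is assumed to mean 0, and critical is assumed to take precedence over warning.

```python
import math


def parse_range(spec):
    """Parse '[@]start:end' into (start, end, inside)."""
    if spec == "":
        raise ValueError("empty range")
    inside = spec.startswith("@")          # '@' inverts the alert condition
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        start_text, end_text = "0", spec   # start and ':' omitted: start = 0
    if start_text == "~":
        start = -math.inf                  # '~' means negative infinity
    elif start_text == "":
        start = 0.0                        # assumption: empty start means 0
    else:
        start = float(start_text)
    end = math.inf if end_text == "" else float(end_text)
    if start > end:
        raise ValueError("start must not exceed end")
    return start, end, inside


def alerts(value, spec):
    """Return True if value should raise an alert for the given range."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end       # endpoints belong to the range
    return in_range if inside else not in_range


def check_value(value, warning=None, critical=None):
    """Return 2 (critical), 1 (warning) or 0 (ok); critical is tested first."""
    if critical is not None and alerts(value, critical):
        return 2
    if warning is not None and alerts(value, warning):
        return 1
    return 0
```

With a warning range of "1:20" and a critical range of "1:30", a value of 25 gives a warning, while 0 or 31 gives a critical result.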

2.6. Performance data

Performance data is defined by Nagios as "everything after the | of the plugin output" - please refer to the Nagios documentation for information on capturing this data to logfiles. However, it is the responsibility of the plugin writer to ensure the performance data is in a "Nagios plugins" format. This is the expected format:

'label'=value[UOM];[warn];[crit];[min];[max]

Notes:

1. space separated list of label/value pairs

2. label can contain any characters

3. the single quotes for the label are optional. Required if spaces, = or ' are in the label

4. label length is arbitrary, but ideally the first 19 characters are unique (due to a limitation in RRD). Be aware of a limitation in the amount of data that NRPE returns to Nagios

5. to specify a quote character, use two single quotes

6. warn, crit, min or max may be null (for example, if the threshold is not defined or min and max do not apply). Trailing unfilled semicolons can be dropped

7. min and max are not required if UOM=%

8. value, min and max in class [-0-9.]. Must all be the same UOM

9. warn and crit are in the range format (see Section 2.5). Must be the same UOM

10. UOM (unit of measurement) is one of:

a. no unit specified - assume a number (int or float) of things (e.g. users, processes, load averages)

b. s - seconds (also us, ms)

c. % - percentage

d. B - bytes (also KB, MB, TB)

e. c - a continuous counter (such as bytes transmitted on an interface)

It is up to third party programs to convert the Nagios plugins performance data into graphs.
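As an illustration of the format rules above, a plugin could assemble its performance data with helpers like the ones below. This is a minimal Python sketch; the helper names are hypothetical and not part of any Nagios library.

```python
def quote_label(label):
    """Quote the label only when it contains a space, '=' or a single quote."""
    if any(ch in label for ch in " ='"):
        # A quote character inside the label is written as two single quotes.
        return "'" + label.replace("'", "''") + "'"
    return label


def perfdata_item(label, value, uom="", warn="", crit="", minimum="", maximum=""):
    """Format one 'label'=value[UOM];[warn];[crit];[min];[max] item."""
    fields = ["%s=%s%s" % (quote_label(label), value, uom),
              str(warn), str(crit), str(minimum), str(maximum)]
    # Trailing unfilled semicolons can be dropped.
    while len(fields) > 1 and fields[-1] == "":
        fields.pop()
    return ";".join(fields)


def with_perfdata(text, items):
    """Append space separated performance data after the '|' separator."""
    return "%s|%s" % (text, " ".join(items))
```

For example, perfdata_item("/", 2643, "MB", 5948, 5958, 0, 5968) yields /=2643MB;5948;5958;0;5968.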

2.7. Translations

If possible, use translation tools for all output to respect the user's language settings.

See Section 8.4 for guidelines for the core plugins.

3. System Commands and Auxiliary Files

3.1. Don't execute system commands without specifying their full path

Don't use exec(), popen(), etc. to execute external commands without explicitly using the full path of the external program.

Doing otherwise makes the plugin vulnerable to hijacking by a trojan horse earlier in the search path. See the main plugin distribution for examples on how this is done.
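For plugins written in Python, the same rule might be enforced with a small wrapper like the one below. This is a sketch under assumptions: the function name is hypothetical, and the core distribution's own examples are in C.

```python
import os
import subprocess


def run_with_full_path(argv, timeout=10):
    """Run an external command, refusing anything that relies on PATH lookup."""
    if not os.path.isabs(argv[0]):
        raise ValueError("refusing to run %r: not an absolute path" % argv[0])
    # Passing a list with shell=False (the default) avoids shell interpretation.
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
```

A call such as run_with_full_path(["df", "-k"]) is rejected, whereas the same command given with its absolute path is executed and its output captured.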

3.2. Use spopen() if external commands must be executed

If you have to execute external commands from within your plugin and you're writing it in C, use the spopen() function that Karl DeBisschop has written.

The code for spopen() and spclose() is included with the core plugin distribution.

3.3. Don't make temp files unless absolutely required

If temp files are needed, make sure that the plugin will fail cleanly if the file can't be written (e.g., too few file handles, out of disk space, incorrect permissions, etc.) and delete the temp file when processing is complete.
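One way to get both behaviours, a clean failure and guaranteed deletion, is sketched below in Python. The function name, the "check_example_" prefix and the choice of UNKNOWN (3) as the failure status are assumptions made for the illustration.

```python
import os
import tempfile

UNKNOWN = 3  # conventional Nagios "unknown" return code (assumed)


def with_temp_file(data, process):
    """Write data to a temp file, run process(path), always delete the file.

    Returns (code, text). An OSError raised while creating, writing or
    processing the file is reported as UNKNOWN instead of a traceback.
    """
    path = None
    try:
        fd, path = tempfile.mkstemp(prefix="check_example_")
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
        return process(path)
    except OSError as exc:
        return UNKNOWN, "cannot use temp file: %s" % exc
    finally:
        # Runs on success and on failure, so the file never lingers.
        if path is not None and os.path.exists(path):
            os.remove(path)
```

The finally clause runs whether processing succeeds or fails, which is what keeps stale temp files from accumulating.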

3.4. Don't be tricked into following symlinks

If your plugin opens any files, take steps to ensure that you are not following a symlink to another location on the system.
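On POSIX systems, one possible safeguard is to open files with the O_NOFOLLOW flag, as in this Python sketch. The function name is hypothetical, and the flag only protects the final path component, so it is not a complete defence on its own.

```python
import os


def open_no_follow(path):
    """Open path read-only, failing if its final component is a symlink."""
    # O_NOFOLLOW is POSIX-specific and is not available on every platform.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    return os.fdopen(fd, "r")
```

When the path is a symlink, os.open() raises OSError, which the plugin can turn into a clean error message.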

3.5. Validate all input

Use the routines in utils.c or utils.pm, and write more as needed.

4. Perl Plugins

Perl plugins are coded a little more defensively than other plugins because of embedded Perl. When configured as such, embedded Perl Nagios (ePN) requires stricter use of some of Perl's features. This section outlines some of the steps needed to use ePN effectively.

1. Do not use BEGIN and END blocks since they will be called only once (when Nagios starts and shuts down) with Embedded Perl (ePN). In particular, do not use BEGIN blocks to initialize variables.

2. To use utils.pm, you need to provide a full path to the module in order for it to work, e.g.

use lib "/usr/local/nagios/libexec";

use utils qw(...);

3. Perl scripts should be called with "-w".

4. All Perl plugins must compile cleanly under "use strict" - i.e. at least explicitly use package names as in "$main::x" or predeclare every variable.

Explicitly initialize each variable in use. Otherwise, with caching enabled, the plugin will not be recompiled each time, and therefore Perl will not reinitialize all the variables. All old variable values will still be in effect.

5. Do not use <DATA> handles (these simply do not compile under ePN).

6. Do not use global variables in named subroutines. This is bad practice anyway, but with ePN the compiler will report an error of the form "... will not stay shared". Values used by subroutines should be passed in the argument list.

7. If writing to a file (perhaps recording performance data), explicitly close it. The plugin never calls exit; that is caught by p1.pl, so output streams are never closed.

8. As in Section 5, all plugins need to monitor their runtime, especially if they are using network resources. Use of alarm is recommended, noting that some Perl modules (e.g. LWP) manage timers, so that an alarm set by a plugin using such a module is overwritten by the module (workarounds are cunning (TM) or using the module timer). Plugins may import a default timeout ($TIMEOUT) from utils.pm.

9. Perl plugins should import %ERRORS from utils.pm and then "exit $ERRORS{'OK'}" rather than "exit 0".

5. Runtime Timeouts

Plugins have a very limited runtime - typically 10 sec. As a result, it is very important for plugins to maintain internal code to exit if runtime exceeds a threshold.

All plugins should time out gracefully, not just networking plugins. For instance, df may lock up if you have automounted drives and your network fails, although at first glance few would expect df to hang like that. Besides, being able to time out is simply more error-resistant than continuing to consume resources.

5.1. Use DEFAULT_SOCKET_TIMEOUT

All network plugins should use DEFAULT_SOCKET_TIMEOUT as their timeout.

5.2. Add alarms to network plugins

If you write a plugin which communicates with another networked host, you should make sure to set an alarm() in your code that prevents the plugin from hanging due to abnormal socket closures, etc. Nagios takes steps to protect itself against unruly plugins that time out, but any plugins you create should be well behaved on their own.
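The alarm technique described above can be sketched in Python with the signal module. This illustration is Unix-only and must run in the main thread. The 10-second default mirrors the typical runtime mentioned in Section 5, and reporting a timeout as UNKNOWN (3) is an assumption; a given plugin may prefer a different state.

```python
import signal
import time


class PluginTimeout(Exception):
    """Raised when the check exceeds its time budget."""


def _on_alarm(signum, frame):
    raise PluginTimeout()


def run_with_alarm(check, timeout=10):
    """Run check() under a SIGALRM-based timeout and return (code, text)."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout)                    # deliver SIGALRM after timeout s
    try:
        return check()
    except PluginTimeout:
        return 3, "UNKNOWN: plugin timed out after %d seconds" % timeout
    finally:
        signal.alarm(0)                      # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```

A check that blocks, for example on a dead socket, is interrupted once the alarm fires, and the plugin still exits with a single status line.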

6. Plugin Options

A well-written plugin should have --help as a way to get verbose help. Code and output should try to respect the 80x25 size of a CRT (remember when fixing stuff in the server room!).

6.1. Option Processing

For plugins written in C, we recommend the C standard getopt library for short options. Getopt_long is always available.

For plugins written in Perl, we recommend the Getopt::Long module.

Positional arguments are strongly discouraged.

There are a few reserved options that should not be used for other purposes:

-V version (--version)

-h help (--help)

-t timeout (--timeout)

-w warning threshold (--warning)

-c critical threshold (--critical)

-H hostname (--hostname)

-v verbose (--verbose)

In addition to the reserved options above, some other standard options are:

-C SNMP community (--community)

-a authentication password (--authentication)

-l login name (--logname)

-p port or password (--port or --passwd/--password)

-u url or username (--url or --username)

Look at check_pgsql and check_procs to see how I currently think this can work. Standard options are:

The option -V or --version should be present in all plugins. For C plugins it should result in a call to print_revision, a function in utils.c which takes two character arguments, the command name and the plugin revision.

The -? option, or any other unparsable set of options, should print out a short usage statement. Character width should be 80 or less, and no more than 23 lines should be printed (it should display cleanly on a dumb terminal in a server room).

The option -h or --help should be present in all plugins. In C plugins, it should result in a call to print_help (or equivalent). The function print_help should call print_revision, then print_usage, then should provide detailed help. Help text should fit on an 80-character width display, but may run as many lines as needed.

The option -v or --verbose should be present in all plugins. The user should be allowed to specify -v multiple times to increase the verbosity level, as described in Table 1.
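For a plugin written in Python, the reserved options could be declared with the standard argparse module, as in the sketch below. The program name "check_example", the version string and the default timeout of 10 seconds are placeholders chosen for the illustration.

```python
import argparse


def build_parser(prog="check_example", version="0.0"):
    """Declare the reserved options; argparse adds -h/--help by itself."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument("-V", "--version", action="version",
                        version="%s %s" % (prog, version))
    parser.add_argument("-t", "--timeout", type=int, default=10,
                        help="seconds before the plugin gives up")
    parser.add_argument("-w", "--warning", help="warning threshold")
    parser.add_argument("-c", "--critical", help="critical threshold")
    parser.add_argument("-H", "--hostname", help="host to check")
    # action="count" lets the user repeat -v to raise the verbosity level.
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="repeat up to three times for more detail")
    return parser
```

Note that argparse exits with status 2 on unparsable options, which would read as CRITICAL to Nagios; a real plugin would probably want to override that behaviour and return UNKNOWN instead.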

6.2. Plugins with more than one type of threshold, or with threshold ranges

Old style was to do things like -ct for critical time and -cv for critical value. That goes out the window with POSIX getopt. The allowable alternatives are:

1. long options like -critical-time (or -ct and -cv, I suppose).

2. repeated options like `check_load -w 10 -w 6 -w 4 -c 16 -c 10 -c 10`

3. for brevity, the above can be expressed as `check_load -w 10,6,4 -c 16,10,10`

4. ranges are expressed with colons as in `check_procs -C httpd -w 1:20 -c 1:30` which will warn above 20 instances, and critical at 0 and above 30

5. lists are expressed with commas, so Jacob's check_nmap uses constructs like '-p 1000,1010,1050:1060,2000'

6. If possible when writing lists, use tokens to make the list easy to remember and non-order dependent - so check_disk uses '-c 10000,10%' so that it is clear which is the percentage and which is the KB value (note that due to my own lack of foresight, that used to be '-c 10000:10%' but such constructs should all be changed for consistency, though providing backward compatibility is fairly easy).

As always, comments are welcome - making this consistent without a host of long options was quite a hassle, and I would suspect that there are flaws in this strategy.
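The equivalence between repeated options and comma-separated lists could be handled as in this Python sketch. The helper names are hypothetical, and each list element is left as a string so that range syntax such as 1050:1060 survives untouched.

```python
def split_thresholds(text):
    """Split '10,6,4' into ['10', '6', '4'], rejecting empty entries."""
    parts = [part.strip() for part in text.split(",")]
    if any(part == "" for part in parts):
        raise ValueError("empty threshold in %r" % text)
    return parts


def merge_repeated(values):
    """Flatten repeated options so '-w 10 -w 6 -w 4' equals '-w 10,6,4'."""
    merged = []
    for value in values:
        merged.extend(split_thresholds(value))
    return merged
```

Both merge_repeated(["10", "6", "4"]) and merge_repeated(["10,6,4"]) produce the same three-element list.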

7. Test cases

Tests are the best way of knowing if the plugins work as expected. Please create and update test cases where possible.

To run a test, from the top level directory, run "make test". This will run all the current tests and report an overall success rate.

See the Nagios Plugins Tinderbox server for the daily test results.

7.1. Test cases for plugins

These use Perl's Test::More. To do a one-time test, run "cd plugins && perl t/check_disk.t".

There will sometimes be failures seen in this output which are known failures that need to be fixed. As long as the return code is 0, it will be reported as "test pass". (If you have a fix so that the specific test passes, that will be gratefully received!)

If you want a summary test, run: "cd plugins && prove t/check_disk.t". This runs the test in a summary format.

For a good and amusing tutorial on using Test::More, see this link

7.2. Testing the C library functions

We use the libtap library, which gives Perl's TAP (Test Anything Protocol) output. This is used by the FreeBSD team for their regression testing.

To run tests using the libtap library, download the latest tar ball and extract. There is a problem with tap-1.01 where pthread support doesn't appear to work properly on non-FreeBSD systems. Install with 'CPPFLAGS="-UHAVE_LIBPTHREAD" ./configure && make && make check && make install'.

When you run Nagios Plugins' configure, it will look for the tap library and will automatically setup the tests. Run "make test" to run all the tests.

8. Coding guidelines

See GNU Coding standards for general guidelines.

8.1. C coding

Variables should be declared at the beginning of code blocks and not inline because of portability with older compilers.

You should use /* */ for comments and not // as some compilers do not handle the latter form.

You should also avoid using the type "bool" and its values "true" and "false". Instead use the "int" type and the plugins' own "TRUE"/"FALSE" values to keep the code uniform.

8.2. Crediting sources

If you have copied a routine from another source, make sure the licence from your source allows this. Add a comment referencing the ACKNOWLEDGEMENTS file, where you can put more detail about the source.

For contributed code, do not add any named credits in the source code - contributors should be added into the THANKS.in file instead.

8.3. CVS comments

If the change is due to a contribution, please quote the contributor's name and, if applicable, add the SourceForge Tracker number. Don't forget to update the THANKS.in file.

If you have a change that is useful for noting in the next release, please update the NEWS file.

All commit comments will be written to a ChangeLog at release time.

8.4. Translations for developers

To make the job easier for translators, please follow these guidelines:

1. Before creating new strings, check the po/nagios-plugins.pot file to see if a similar string already exists

2. For help texts, break into individual options so that these can be reused between plugins

3. Try to avoid linefeeds unless you are working on a block of text

4. Short help is not translated

5. Long help has options in English language, but text translated

6. "Copyright" kept in English

7. Copyright holder names kept in original text

8. Debugging output does not need to be translated

8.5. Translations for translators

To create an up to date list of translatable strings, run: tools/gen_locale.sh

9. Submission of new plugins and patches

9.1. Patches

If you have a bug patch, please supply a unified or context diff against the version you are using. For new features, please supply a diff against the Git "master" branch.

Patches should be submitted via SourceForge's tracker system for Nagiosplug patches and be announced to the nagiosplug-devel mailing list.

Submission of a patch implies that the submitter acknowledges that they are the author of the code (or have permission from the author to release the code) and agrees that the code can be released under the GPL. The copyright for the changes will then revert to the Nagios Plugin Development Team - this is required so that any copyright infringements can be investigated quickly without contacting a huge list of copyright holders. Credit will always be given for any patches through a THANKS file in the distribution.

9.2. Contributed plugins

Plugins that have been contributed to the project and distributed with the Nagios Plugin files are held in the contrib/ directory and are not installed by default. These plugins are not officially supported by the team. The current policy is that these plugins should be owned and maintained by the original contributor, preferably hosted on NagiosExchange.

If patches or bugs are raised against a contributed plugin, we will start communications with the original contributor, but seek to remove the plugin from our distribution.

The aim is to distribute only code that the Nagios Plugin team are responsible for.

9.3. New plugins

If you would like others to use your plugins, please add them to the official 3rd party plugin repository, NagiosExchange.

We are not accepting requests for inclusion of plugins into our distribution at the moment, but when we do, these are the minimum requirements:

1. Include copyright and license information in all files. Copyright must be solely granted to the Nagios Plugin Development Team

2. The standard command options are supported (--help, --version, --timeout, --warning, --critical)

3. It is determined to be not redundant (for instance, we would not add a new version of check_disk just because someone had provided a plugin that had perf checking - we would incorporate the features into an existing plugin)

4. One of the developers has had the time to audit the code and declare it ready for core

5. It should also follow code format guidelines, and use functions from utils (perl or c or sh) rather than using its own

6. Includes patches to configure.in if required (via the EXTRAS list if it will only work on some platforms)
