
Fixing a small bug in the check_mk source code

Category: bug · Author: admin

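In production, we use Icinga together with check_mk as our monitoring system.
Today, when I ran cmk -II to refresh a host's inventory, it failed with the following error no matter which host I specified: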

Removing unimplemented check /
Removing unimplemented check oom_adj_for_cron
Removing unimplemented check oom_adj_for_sshd
Traceback (most recent call last):
    File "/usr/share/check_mk/modules/check_mk.py", line 5801, in <module>
        remove_autochecks_of(host, checknames)
    File "/usr/share/check_mk/modules/check_mk.py", line 2907, in remove_autochecks_of
        if splitted[3] not in check_info:
IndexError: list index out of range

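After a long search online turned up nothing helpful, I started debugging the source at the location named in the traceback.
I edited /usr/share/check_mk/modules/check_mk.py and added 'print splitted' to print the list that triggers the out-of-range index, namely splitted.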

for fn in glob.glob(autochecksdir + "/*.mk"):
    lines = []
    count = 0
    for line in file(fn):
        # hostname and check type can be quoted with ' or with "
        double_quoted = line.replace("'", '"').lstrip()
        if double_quoted.startswith('("'):
            count += 1
            splitted = double_quoted.split('"')
            print splitted  # debug output added for this investigation
            if splitted[1] != hostname or (checktypes != None and splitted[3] not in checktypes):
                if splitted[3] not in check_info:
                    sys.stderr.write('Removing unimplemented check %s\n' % splitted[3])
                    continue
                lines.append(line)
            else:
                removed += 1
    if len(lines) == 0:

Running cmk -II again then produced the following output:

...
("iad1-server5", job, 'oom_adj_for_sshd', None)
Removing unimplemented check oom_adj_for_sshd

("iad1-server5", kernel.util, None, kernel_util_default_levels)
Traceback (most recent call last):
    File "/usr/share/check_mk/modules/check_mk.py", line 5803, in <module>
        remove_autochecks_of(host, checknames)
    File "/usr/share/check_mk/modules/check_mk.py", line 2909, in remove_autochecks_of
        if splitted[3] not in check_info:
IndexError: list index out of range

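As the output shows, the line
("iad1-server5", kernel.util, None, kernel_util_default_levels)
has only the hostname in quotes, so splitting it on quote characters yields a list of just 3 elements. Index 3 does not exist, which is why the script fails with 'IndexError: list index out of range'.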
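
To make the failure concrete, here is a minimal sketch that reproduces the parsing step in isolation. It is written for Python 3 (the check_mk module in this post is Python 2), and the helper name split_on_quotes is hypothetical, introduced only for this illustration. The two sample lines are the ones shown in the debug output above.

```python
def split_on_quotes(line):
    """Mimic the parsing step from the excerpt: turn ' into ", then split on "."""
    double_quoted = line.replace("'", '"').lstrip()
    return double_quoted.split('"')


# A line whose check item is quoted: index 3 exists.
good = split_on_quotes("""("iad1-server5", job, 'oom_adj_for_sshd', None)""")

# A line where only the hostname is quoted: the list is too short.
bad = split_on_quotes('("iad1-server5", kernel.util, None, kernel_util_default_levels)')

print(len(good), good[3])
print(len(bad))

try:
    bad[3]
except IndexError as exc:
    print("IndexError:", exc)
```

The second line contains only two quote characters, so the split cannot produce a fourth element, and any access to index 3 raises the exception seen in the traceback.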

So I added a simple guard: the 'Removing unimplemented check' logic now runs only when the list has more than 3 elements.
# vim /usr/share/check_mk/modules/check_mk.py

for fn in glob.glob(autochecksdir + "/*.mk"):
    lines = []
    count = 0
    for line in file(fn):
        # hostname and check type can be quoted with ' or with "
        double_quoted = line.replace("'", '"').lstrip()
        if double_quoted.startswith('("'):
            count += 1
            splitted = double_quoted.split('"')
            # Sometimes splitted has only 3 elements because some fields in 'line' are not quoted.
            if len(splitted) > 3:
                if splitted[1] != hostname or (checktypes != None and splitted[3] not in checktypes):
                    if splitted[3] not in check_info:
                        sys.stderr.write('Removing unimplemented check %s\n' % splitted[3])
                        continue
                    lines.append(line)
                else:
                    removed += 1
    if len(lines) == 0:
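
The guard can also be tried outside check_mk. The following is a rough Python 3 sketch, not the upstream code: the function name filter_autochecks, its parameters, and the sample data are assumptions made for illustration. Unlike the patch above, it also collects the lines that the guard skips, so they can be inspected.

```python
def filter_autochecks(lines, hostname, check_info, checktypes=None):
    """Hypothetical stand-in for the loop body: returns (kept, removed, skipped)."""
    kept, skipped = [], []
    removed = 0
    for line in lines:
        # hostname and check type can be quoted with ' or with "
        double_quoted = line.replace("'", '"').lstrip()
        if not double_quoted.startswith('("'):
            continue
        splitted = double_quoted.split('"')
        if len(splitted) <= 3:
            # Nothing after the hostname is quoted, so splitted[3] does not exist.
            skipped.append(line)
            continue
        if splitted[1] != hostname or (
            checktypes is not None and splitted[3] not in checktypes
        ):
            if splitted[3] not in check_info:
                continue  # unimplemented check: drop the entry
            kept.append(line)
        else:
            removed += 1
    return kept, removed, skipped


# Made-up sample entries in the style of the lines shown earlier.
sample = [
    '("iad1-server5", "df", "/", None),\n',
    "(\"iad1-server5\", job, 'oom_adj_for_sshd', None),\n",
    '("iad1-server5", kernel.util, None, kernel_util_default_levels),\n',
    '("iad1-server1", "df", "/", None),\n',
]
kept, removed, skipped = filter_autochecks(sample, "iad1-server1", {"df": None})
print(len(kept), removed, len(skipped))
```

In this sample, the quoted df entry for another host is kept, the oom_adj_for_sshd entry is dropped as unimplemented, the unquoted kernel.util entry is skipped instead of raising IndexError, and the entry for the host being re-inventoried counts as removed.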

Then I ran 'cmk -II' and saw many 'Removing unimplemented check' messages. On the second run they were gone, presumably because all the stale entries that matched had already been deleted.

# cmk -II iad1-server1

...
Removing unimplemented check /
Removing unimplemented check oom_adj_for_cron
Removing unimplemented check oom_adj_for_sshd
Removing unimplemented check crond
Removing unimplemented check sshd
Removing unimplemented check xinetd
cpu.loads         1 new checks
df                2 new checks
kernel.util       1 new checks
lnx_if            1 new checks
local             5 new checks
mem.used          1 new checks
mrpe              4 new checks
postfix_mailq     1 new checks
ps                5 new checks
tcp_conn_stats    1 new checks
uptime            1 new checks

# cmk -II iad1-server1

cpu.loads         1 new checks
df                2 new checks
kernel.util       1 new checks
lnx_if            1 new checks
local             5 new checks
mem.used          1 new checks
mrpe              4 new checks
postfix_mailq     1 new checks
ps                5 new checks
tcp_conn_stats    1 new checks
uptime            1 new checks

Please credit the source when reposting: 爱开源 » Fixing a small bug in the check_mk source code
