Checking ZooKeeper node status with Python

平凡的运维之路
2024-08-01

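Requirements

  • The project's core modules use ZooKeeper for distributed locking, but nothing monitors the state of the ZooKeeper nodes they register, so that state needs to be monitored.
  • Check the state of each ZooKeeper node; if a node is abnormal (it does not exist), output an error keyword or raise an alert.

Implementation

  • Use a Python ZooKeeper client module (kazoo) to read the state of each ZooKeeper node.
  • Have the Python script write an error keyword to its log so that abnormal node states can be pushed to the network management system as alarms.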
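
The keyword-based alerting step could be sketched roughly as follows. This is only an illustration, assuming the ERROR line format that the script in this post writes when a node is missing; the name `find_missing_nodes` and the regular expression are hypothetical, and the step that pushes the alarm to the network management system is left out because the post does not describe it.

```python
import re

# Matches the ERROR line written for a missing node, for example:
# "... check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在"
MISSING_NODE = re.compile(r" ERROR 从zookeeper获取节点 (\S+) 不存在")


def find_missing_nodes(log_lines):
    """Return the node paths reported as missing, in first-seen order."""
    missing = []
    for line in log_lines:
        match = MISSING_NODE.search(line)
        if match and match.group(1) not in missing:
            missing.append(match.group(1))
    return missing


if __name__ == "__main__":
    sample = [
        "2023-12-20 19:26:35,663 140538316216128 check_zookeeper_node.py:22 "
        "INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常",
        "2023-12-20 19:26:35,664 140538316216128 check_zookeeper_node.py:19 "
        "ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在",
    ]
    print(find_missing_nodes(sample))
```

Deduplicating the paths keeps a node that stays missing across several check rounds from producing one alarm per round.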

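Known defect

  • DataWatch is used only to check node state. If a configured node does not exist and another node (cti1) is then modified, the change notification is logged against a node that does not exist (cti5), as the log below shows. This is consistent with the callback reading the loop variable node_path when it runs rather than when it is defined: by the time the notification arrives the loop has finished and node_path holds the last configured node, so the 19:26:40 entry for cti5 carries the same data that cti1 reports at 19:26:45. In the log, 不存在 means "does not exist" and 正常 means "normal".
# When the node data is modified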
[zk: localhost:2182(CONNECTED) 28] set /ctimanager/bj-cti1  {"agent":2,"test":1,"a1":2ok23121111111}
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:73 INFO get config file  zookeeper node list: ['/ctimanager/bj-cti1', '/ctimanager/bj-cti2', '/ctimanager/bj-cti3', '/ctimanager/bj-cti4', '/ctimanager/bj-cti5']
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:76 INFO zookeeper login host: 127.0.0.1:2182
2023-12-20 19:26:35,660 140538316216128 check_zookeeper_node.py:81 INFO use user login auth conn zookeeper 
2023-12-20 19:26:35,661 140538316216128 check_zookeeper_node.py:83 INFO zookeeper user info: digest admin:123456
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2}' 状态正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:35,663 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:35,664 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:35,665 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:35,666 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti5 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti5 正常
2023-12-20 19:26:45,674 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:45,676 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:45,677 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
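
The misattributed notification can be reproduced without ZooKeeper. The sketch below is only an illustration of Python's late-binding closures, not the author's code: `buggy_callbacks`, `make_callback` and `fixed_callbacks` are hypothetical names, and the callbacks take a single `data` argument for brevity.

```python
def buggy_callbacks(paths):
    """Define one callback per path inside the loop, as the script does."""
    callbacks = []
    for path in paths:
        def on_change(data):
            # `path` is looked up when the callback runs, not when it is defined
            return f"{path}: {data}"
        callbacks.append(on_change)
    return callbacks


def make_callback(path):
    """Give each callback its own enclosing scope so `path` stays fixed."""
    def on_change(data):
        return f"{path}: {data}"
    return on_change


def fixed_callbacks(paths):
    return [make_callback(path) for path in paths]


if __name__ == "__main__":
    paths = ["/ctimanager/bj-cti1", "/ctimanager/bj-cti5"]
    # A late notification for the first node is attributed to the last one
    print(buggy_callbacks(paths)[0]("new"))  # /ctimanager/bj-cti5: new
    print(fixed_callbacks(paths)[0]("new"))  # /ctimanager/bj-cti1: new
```

With kazoo the same idea would presumably mean building a per-node callback with such a factory (taking `(data, stat)`) and passing it to `zk.DataWatch(node_path, func)` instead of decorating a function inside the loop.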

Implementation code

  • Fetching the ZooKeeper node state
#!/usr/bin/python3
# -*- coding:utf-8 -*-
###############################################################
# Alert when a module registered in zookeeper goes down and its node is lost
# 2023-12-20 16:02
###############################################################

from kazoo.client import KazooClient
from kazoo.exceptions import KazooException
from logging.handlers import RotatingFileHandler
import time, configparser, os, sys, logging

def watch_nodes():
    for node_path in node_paths:
        try:
            @zk.DataWatch(node_path)
            def on_data_change(data, stat):  # node_path is looked up when this runs, not when it is defined
                if data is None:
                    logger.error(f"从zookeeper获取节点 {node_path} 不存在")  # "node ... does not exist"
                else:
                    logger.debug(f"从zookeeper节点 {node_path} 的数据获取信息: {data} 状态正常")  # "data read from node ..., state normal"
                    logger.info(f"从zookeeper获取节点 {node_path} 正常")  # "node ... is normal"
        except KazooException as e:
            logger.error(f"从zookeeper监控节点 {node_path} 时发生异常:{e}")  # "exception while watching node ..."

def logger_func(LogLevel):
    # Create the logger
    logger = logging.getLogger("my_logger")
    if LogLevel == "DEBUG":
        logger.setLevel(logging.DEBUG)
    elif LogLevel == "INFO":
        logger.setLevel(logging.INFO)
    elif LogLevel == "WARNING":
        logger.setLevel(logging.WARNING)
    elif LogLevel == "ERROR":
        logger.setLevel(logging.ERROR)
    # Create the RotatingFileHandler (reads log_file, max_log_size, backup_count set in __main__)
    handler = RotatingFileHandler(log_file, maxBytes=max_log_size, backupCount=backup_count)

    # Define the log format
    formatter = logging.Formatter("%(asctime)s %(thread)d %(filename)s:%(lineno)d %(levelname)s %(message)s")
    handler.setFormatter(formatter)

    # Attach the handler to the logger
    logger.addHandler(handler)
    return logger

if __name__ == "__main__":
    # Resolve the config file relative to the current working directory
    dirpath = os.getcwd()
    cfgpath = os.path.join(dirpath, "cfg/config.ini")
    conf = configparser.ConfigParser()
    # Print the resolved path so a wrong working directory is easy to spot
    print("config file ---> ", cfgpath)
    # The config file may contain non-ASCII comments, so read it explicitly as UTF-8
    conf.read(cfgpath, encoding="utf-8")
    GetLogDir = conf.get("Base", "LogDir")
    LogLevel = conf.get("Base", "LogLevel")
    CheckintervalDate = conf.getint("Base", "CheckintervalDate")
    max_log_size = conf.getint("Base", "max_log_size")
    backup_count = conf.getint("Base", "backup_count")
    zk_hosts = conf.get("zookeeper", "zookeeper_host")
    is_auth = conf.getboolean("zookeeper", "is_auth")
    username = conf.get("zookeeper", "zookeeper_user")
    password = conf.get("zookeeper", "zookeeper_passwd")
    zookeeper_node = conf.get("zookeeper", "zookeeper_node")

    log_file = os.path.join(GetLogDir, "watch_zoo.log")
    logger = logger_func(LogLevel)

    # List of node paths to check
    node_paths = [p.strip() for p in zookeeper_node.split(",")]
    logger.info("get config file  zookeeper node list: " + str(node_paths))

    # Connect to ZooKeeper
    logger.info("zookeeper login host: " + zk_hosts)
    zk = KazooClient(hosts=zk_hosts)
    try:
        zk.start()
        if is_auth:
            logger.info("use user login auth conn zookeeper ")
            zk.add_auth(scheme='digest', credential=username + ":" + password)
            logger.info("zookeeper user info: digest " + str(username) + ":" + password)  # NOTE: writes the password to the log in plain text
        else:
            logger.info("not use user auth login zookeeper")

        # Each round registers a new DataWatch per node; the watch calls back
        # right away with the current data, which yields one log line per node
        while True:
            watch_nodes()
            time.sleep(CheckintervalDate)

    except KazooException as e:
        if "Connection refused" in str(e):
            logger.error("conn zookeeper error: " + str(e) + " exit!!!")
            sys.exit(110)  # os has no exit(); sys.exit still lets the finally block run
        else:
            logger.error("conn other zookeeper error: " + str(e) + " exit!!!")
            sys.exit(110)
    finally:
        zk.stop()
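
Because the loop above registers the watches again on every round, the script effectively polls. A plainer way to poll, sketched here under the assumption that only node existence matters, is to call `exists()` once per node. `check_nodes` and `FakeClient` are hypothetical names; `FakeClient` only stands in for a connected `KazooClient` so the sketch runs without a server.

```python
def check_nodes(zk, node_paths):
    """Return the paths for which zk.exists() reports no node."""
    missing = []
    for path in node_paths:
        # KazooClient.exists() returns the node's stat, or None if it is absent
        if zk.exists(path) is None:
            missing.append(path)
    return missing


class FakeClient:
    """Hypothetical in-memory stand-in for a connected client."""

    def __init__(self, existing):
        self._existing = set(existing)

    def exists(self, path):
        return object() if path in self._existing else None


if __name__ == "__main__":
    zk = FakeClient(["/ctimanager/bj-cti1", "/ctimanager/bj-cti2"])
    print(check_nodes(zk, ["/ctimanager/bj-cti1", "/ctimanager/bj-cti3"]))
```

Polling this way registers no watches, so no callback can arrive later from another thread and be logged against the wrong node.
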
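  • Configuration file
[Base]
# Sleep interval between check rounds, in seconds
CheckintervalDate=10
# Log level: DEBUG, INFO, WARNING, ERROR
LogLevel=INFO
# Log directory
LogDir=./log
# Maximum log file size in bytes (100 MB)
max_log_size = 104857600
# Maximum number of rotated log backups to keep
backup_count = 10

[zookeeper]
# ZooKeeper host and port to connect to
zookeeper_host = 127.0.0.1:2182
# Authentication user
zookeeper_user = admin
# Authentication password
zookeeper_passwd = 123456
# Whether to authenticate to ZooKeeper with the user and password above
is_auth = True
# Nodes to check; list one node per CTI instance that registers itself
zookeeper_node = /ctimanager/bj-cti1,/ctimanager/bj-cti2,/ctimanager/bj-cti3,/ctimanager/bj-cti4,/ctimanager/bj-cti5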
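
To show how these options map to typed values, here is a minimal sketch that parses a shortened copy of the file from a string. `load_settings` and the keys of the returned dictionary are hypothetical; the section and option names are the ones from the configuration above.

```python
import configparser

SAMPLE = """
[Base]
CheckintervalDate=10
LogLevel=INFO

[zookeeper]
zookeeper_host = 127.0.0.1:2182
is_auth = True
zookeeper_node = /ctimanager/bj-cti1, /ctimanager/bj-cti2
"""


def load_settings(text):
    """Parse the INI text and return typed values."""
    conf = configparser.ConfigParser()
    conf.read_string(text)
    raw_nodes = conf.get("zookeeper", "zookeeper_node")
    return {
        "interval": conf.getint("Base", "CheckintervalDate"),
        "log_level": conf.get("Base", "LogLevel"),
        "hosts": conf.get("zookeeper", "zookeeper_host"),
        "is_auth": conf.getboolean("zookeeper", "is_auth"),
        # Split the comma-separated list, dropping whitespace and empty items
        "nodes": [p.strip() for p in raw_nodes.split(",") if p.strip()],
    }


if __name__ == "__main__":
    print(load_settings(SAMPLE))
```

Using `getint` and `getboolean` makes a malformed value fail at startup with a `ValueError` rather than later inside the check loop.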

Script output

  • Running the script
[devops@my-dev watch_zookeeper]$ ./check_zookeeper_node.py 
config file --->  /home/devops/Python/ABC/watch_zookeeper/cfg/config.ini
2023-12-20 19:20:18,946 139836511770432 check_zookeeper_node.py:73 INFO get config file  zookeeper node list: ['/ctimanager/bj-cti1', '/ctimanager/bj-cti2', '/ctimanager/bj-cti3', '/ctimanager/bj-cti4', '/ctimanager/bj-cti5']
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:73 INFO get config file  zookeeper node list: ['/ctimanager/bj-cti1', '/ctimanager/bj-cti2', '/ctimanager/bj-cti3', '/ctimanager/bj-cti4', '/ctimanager/bj-cti5']
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:76 INFO zookeeper login host: 127.0.0.1:2182
2023-12-20 19:26:35,660 140538316216128 check_zookeeper_node.py:81 INFO use user login auth conn zookeeper 
2023-12-20 19:26:35,661 140538316216128 check_zookeeper_node.py:83 INFO zookeeper user info: digest admin:123456
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2}' 状态正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:35,663 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:35,664 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:35,665 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:35,666 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti5 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti5 正常
2023-12-20 19:26:45,674 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:45,676 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:45,677 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:45,678 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在
2023-12-20 19:26:55,685 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:55,685 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:55,686 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:55,686 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:55,687 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:55,688 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:55,689 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在