mysql replication检查(一)

By pengyao - Last updated: 星期三, 三月 3, 2010 - Save & Share - Leave a Comment

最近接收一批Mysql数据库,刚开始数量较少,就每天登陆服务器执行”show slave status”来查询mysql replication状态,随着服务器数量越来越多,传统的土方法花费的成本越来越多,看来需要写一个程序来完成这些操作了.

需求如下:
(1) 每小时执行一次
(2) slave服务器信息写入数据库,程序能够直接连接数据库获取需要查询的服务器
(3) 邮件提醒功能,发现replication状态异常,发送邮件提醒
(4) 数据库信息及邮件信息在一个配置文件中指定,调整时不需要程序做任何修改
(5) 人性化结果输出

对应需求实现方法:
语言: python
(1) 使用linux crond服务
(2) 安装MysqlDB模块,直接从库中读取信息
(3) 过滤结果,如果发现结果异常,调用smtplib模块进行邮件提醒
(4) 调用ConfigParser模块实现
(5) 由于不懂cgi,html语法又知之甚少,python直接按照html语法将结果汇总为一个html页面,然后通过浏览器显示

具体实施:
(1) 安装个webserver,刚好手头有一台lighttpd服务器,在上边部署监控脚本,lighttpd htdocs路径为: /usr/local/lighttpd/htdocs/,python将文件生成slave_check.html页面
(2) 连接slave信息存储数据库,创建mon数据库,执行slave_table.sql

CREATE TABLE `slave_check` (
           `id` int NOT NULL auto_increment,
           `name` varchar(32) default NULL,
           `ip` varchar(32) default NULL,
           `port` int(5) default '3306',
           `db_name` varchar(32) default NULL,
           `mail_check` varchar(16) default NULL,
           `desc` varchar(16) default NULL,
           PRIMARY KEY  (`name`)
         ) ENGINE=MyISAM DEFAULT CHARSET=utf8

(3) slave_check添加slave服务器信息及权限

insert into slave_check(name,ip,port,db_name,mail_check) values("test","192.168.0.3",3306,"test","0");
grant select on mon.slave_check to slave_check@"127.0.0.1" identified by "123456"

(4) 在slave服务器上添加授权

grant REPLICATION CLIENT on *.* to repl_check@"192.168.0.2" identified by "123456";

(5) 在lighttpd服务器上创建config.ini文件
[DB]
Host=127.0.0.1
Port=3306
User=slave_check
Pass=123456
Db=mon
Table=slave_check

[MAIL]
Smtp_host=mail.pengyao.org
From=mon@pengyao.org
To=mon@pengyao.org

[SLAVE]
User=repl_check
Pass=123456
(6)脚本部署
安装python,安装MysqlDB,部署slave_check.py脚本

#!/bin/env python

import time
import ConfigParser,MySQLdb,email.MIMEText,smtplib

def _config(file,section,option):
        cf=ConfigParser.ConfigParser()
        cf.read(file)
        return cf.get(section,option)

def _mysql_query(host,port,user,passwd,db,query):
        db_con=MySQLdb.connect(host=host,user=user,passwd=passwd,db=db,port=int(port),use_unicode=True,charset="UTF8")
        cur=db_con.cursor()
        cur.execute(query)
        result=cur.fetchall()
        db_con.close()
        return result

config_file="/root/config.ini"
db_section="DB"
mail_section="MAIL"
slave_section="SLAVE"
html_file="/usr/local/lighttpd/htdocs/slave_check.html"

##config db options
config_db_host=_config(config_file,db_section,"Host")
config_db_port=_config(config_file,db_section,"Port")
config_db_user=_config(config_file,db_section,"User")
config_db_pass=_config(config_file,db_section,"Pass")
config_db_db=_config(config_file,db_section,"Db")
config_db_table=_config(config_file,db_section,"Table")

##config mail options
config_mail_smtphost=_config(config_file,mail_section,"Smtp_host")
config_mail_from=_config(config_file,mail_section,"From")
config_mail_to=_config(config_file,mail_section,"To")

##config slave options
config_slave_user=_config(config_file,slave_section,"User")
config_slave_pass=_config(config_file,slave_section,"Pass")

##query zone info SQL
zone_query="select id,name,ip,port,db_name,mail_check from "+config_db_table

#query slave status info SQL
slave_status="show slave status"

##query all zone info
zone_info=_mysql_query(config_db_host,config_db_port,config_db_user,config_db_pass,config_db_db,zone_query)

##html header
zone_html="""

mysql replaction check

""" for eachZone in zone_info: (slave_id,slave_name,slave_ip,slave_port,slave_db,slave_mail_check)=eachZone current_time=time.strftime("%Y-%m-%d %H:%M:%S") mail_message="" slave_result=_mysql_query(slave_ip,slave_port,config_slave_user,config_slave_pass,"test",slave_status) ((Slave_IO_State,Master_Host,Master_User,Master_Port,Connect_Retry,Master_Log_File,Read_Master_Log_Pos,\ Relay_Log_File,Relay_Log_Pos,Relay_Master_Log_File,Slave_IO_Running,Slave_SQL_Running,Replicate_Do_DB,Replicate_Ignore_DB,\ Replicate_Do_Table,Replicate_Ignore_Table,Replicate_Wild_Do_Table,Replicate_Wild_Ignore_Table,Last_Errno,Last_Error,\ Skip_Counter,Exec_Master_Log_Pos,Relay_Log_Space,Until_Condition,Until_Log_File,Until_Log_Pos,Master_SSL_Allowed Master_SSL_CA_File,Master_SSL_CA_Path,Master_SSL_Cert,Master_SSL_Cipher,Master_SSL_Key,Seconds_Behind_Master),)=slave_result if Slave_IO_Running !="Yes": mail_message+="%s ----------> Slave_IO_Running:%s\n" %(slave_ip,Slave_IO_Running) Slave_IO_Running=""+Slave_IO_Running+"" if Slave_SQL_Running !="Yes": mail_message+="%s ----------> Slave_SQL_Running:%s\n" %(slave_ip,Slave_SQL_Running) Slave_SQL_Running=""+Slave_SQL_Running+"" if Last_Error=="": Last_Error="Null" else: mail_message+="%s ----------> Last_Error:%s\n" %(slave_ip,Last_Error) zone_html+="""
id zone_name master_ip slave_ip io_running sql_running last_error behind_master(s) query_time
%s %s %s %s %s %s %s %s %s
""" %(slave_id,slave_name,Master_Host,slave_ip,Slave_IO_Running,Slave_SQL_Running,Last_Error,Seconds_Behind_Master,current_time) #send mail if mail_message !="": msg=email.MIMEText.MIMEText(mail_message) msg['To']=config_mail_to msg['From']=config_mail_from msg['Subject']=slave_name+" "+slave_ip+" mysql replication failed" smtp_con=smtplib.SMTP() smtp_con.connect(config_mail_smtphost) smtp_con.sendmail(config_mail_from,config_mail_to,msg.as_string()) smtp_con.close() zone_html+="" ##write slave status html slave_html=open(html_file,"w") slave_html.write(zone_html) slave_html.close()

(7) 添加cron

crontab -e
    0 * * * * /root/slave_check.py

完工,一小时扫描下slave服务器replication信息,如果有异常,邮件报警,也可直接在浏览器中输入http://lighttp_server/slave_check.html查询最近一次的查询信息

后记:
已经将程序托管到google code上去了,使用Apache2.0协议发布,可以直接访问: https://code.google.com/p/mysqlreplcheck/ 获取需要的信息

Posted in Gnu/linux, Program • Tags: , , , , Top Of Page

Write a comment