MySQL: a roundup of error logs from MHA setup and switchover

1. masterha_check_repl replication check fails: replicates is not defined in the configuration file!

Details:

# /usr/local/bin/masterha_check_repl --conf=/etc/mha/app1.cnf
thu nov 21 15:33:15 2018 - [warning] global configuration file /etc/masterha_default.cnf not found. skipping.
thu nov 21 15:33:15 2018 - [info] reading application default configuration from /etc/mha/app1.cnf..
thu nov 21 15:33:15 2018 - [info] reading server configuration from /etc/mha/app1.cnf..
thu nov 21 15:33:15 2018 - [info] mha::mastermonitor version 0.56.
thu nov 21 15:33:16 2018- [error][/usr/local/share/perl5/mha/servermanager.pm, ln671] master 179.179.19.179:3306 from which slave 179.179.19.180(179.179.19.180:3306) replicates is not defined in the configuration file!
thu nov 21 15:33:16 2018 - [error][/usr/local/share/perl5/mha/mastermonitor.pm, ln424] error happened on checking configurations. at /usr/local/share/perl5/mha/mastermonitor.pm line 326.
thu nov 21 15:33:16 2018 - [error][/usr/local/share/perl5/mha/mastermonitor.pm, ln523] error happened on monitoring servers.
thu nov 21 15:33:16 2018 - [info] got exit code 1 (not master dead).

mysql replication health is not ok!

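Analysis: after an MHA failover, the old master's entry is no longer present in the configuration file. The file has to be brought back in line with the current topology promptly; otherwise /usr/local/bin/masterha_check_repl --conf=/etc/mha/xxx.cnf reports this error when it checks replication.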
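
One way to catch this before running masterha_check_repl is to confirm that the master a slave replicates from still has a hostname entry in the MHA configuration. The sketch below is only an illustration: mha_conf_has_host is a hypothetical helper, and it assumes each server is declared with a hostname=... line, as is usual in a file such as app1.cnf.

```shell
#!/bin/sh
# Hypothetical helper: succeed if HOST appears as a "hostname=" entry in CONF.
mha_conf_has_host() {
    conf=$1
    host=$2
    # Escape dots so an IP address is matched literally, not as a regex.
    escaped=$(printf '%s' "$host" | sed 's/\./\\./g')
    grep -Eq "^[[:space:]]*hostname[[:space:]]*=[[:space:]]*${escaped}[[:space:]]*\$" "$conf"
}

# Example with the master address from the log above:
# mha_conf_has_host /etc/mha/app1.cnf 179.179.19.179 || echo "master missing from config"
```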

2. masterha_master_switch online switch fails: we should not start online master switch when one of connections are running long updates on the current master

Details:

# /usr/local/bin/masterha_master_switch --master_state=alive --conf=/etc/mha/app1.cnf

it is better to execute flush no_write_to_binlog tables on the master before switching. is it ok to execute on 179.179.19.184(179.179.19.184:3306)? (yes/no): y

tue nov 19 17:19:09 2018 - [info] executing flush no_write_to_binlog tables. this may take long time..
tue nov 19 17:19:09 2018 - [info] ok.
tue nov 19 17:19:09 2018 - [info] checking mha is not monitoring or doing failover..
tue nov 19 17:19:09 2018 - [info] checking replication health on 179.179.19.185..
tue nov 19 17:19:09 2018 - [info] ok.
tue nov 19 17:19:09 2018 - [error][/usr/local/share/perl5/mha/masterrotate.pm, ln161] we should not start online master switch when one of connections are running long updates on the current master(179.179.19.184(179.179.19.184:3306)). currently 1 update thread(s) are running.
details:
{'time' => '12815','db' => undef,'id' => '1','user' => 'event_scheduler','state' => 'waiting on empty queue','command' => 'daemon','info' => undef,'host' => 'localhost'}
tue nov 19 17:19:09 2018 - [error][/usr/local/share/perl5/mha/managerutil.pm, ln177] got error: at /usr/local/bin/masterha_master_switch line 53.

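Analysis: the thread reported in the details is the event_scheduler daemon. Run set global event_scheduler=off; on both the master and the slaves, then retry the switch.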
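
As a rough sketch of applying this to several nodes, the helper below only prints the mysql client commands and does not run them. event_scheduler_off_cmds is a hypothetical name, and the sketch assumes credentials come from the client's own option files.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print one mysql client command per host.
# Nothing is executed here; review the output before running it.
event_scheduler_off_cmds() {
    for host in "$@"; do
        printf "mysql -h %s -e 'SET GLOBAL event_scheduler = OFF;'\n" "$host"
    done
}

# Example with the two hosts from the log above:
# event_scheduler_off_cmds 179.179.19.184 179.179.19.185
```

SET GLOBAL only lasts until the server restarts, so the setting would also need to go into the server's configuration file to persist.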

3. masterha_master_switch online switch fails: got error: dbi ... failed: access denied for user

# /usr/local/bin/masterha_master_switch --master_state=alive --conf=/etc/mha/app1.cnf

starting master switch from 179.179.19.185(179.179.19:3306) to 179.179.19.184(179.179.19.184:3306)? (yes/no): yes

tue nov 19 18:52:04 2018 - [info] checking whether 179.179.19.184(179.179.19.184:3306) is ok for the new master..
tue nov 19 18:52:04 2018 - [info] ok.
tue nov 19 18:52:04 2018 - [info] ** phase 1: configuration check phase completed.
tue nov 19 18:52:04 2018 - [info] 
tue nov 19 18:52:04 2018 - [info] * phase 2: rejecting updates phase..
tue nov 19 18:52:04 2018 - [info] 
tue nov 19 18:52:04 2018 - [info] executing master ip online change script to disable write on the current master:
tue nov 19 18:52:04 2018 - [info]  /usr/local/bin/master_ip_online_change_appuanalysis --command=stop --orig_master_host=179.179.19.185 --orig_master_ip=179.179.19.185 --orig_master_port=3306--orig_master_user='weixinlx391p_xldbmha' --orig_master_password='weixinlx391p_xldbmha\)qlk' --new_master_host=179.179.19.184 --new_master_ip=179.179.19.184 --new_master_port=55988 --new_master_user='us_mha' --new_master_password='weixinlx391p_xldbmha\)qlk' --orig_master_ssh_user=root --new_master_ssh_user=root 
got error: dbi connect(';host=179.179.19.184;port=3306;mysql_connect_timeout=4','weixinlx391p_xldbmha',...) failed: access denied for user 'weixinlx391p_xldbmha'@'179.179.19.166' (using password: yes) at /usr/local/share/perl5/mha/dbhelper.pm line 205.
 at /usr/local/bin/master_ip_online_change_app1 line 119.

tue nov 19 18:52:04 2018 - [error][/usr/local/share/perl5/mha/managerutil.pm, ln177] got error: at /usr/local/bin/masterha_master_switch line 53.

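Analysis: the password contains a character that needs escaping. In app1.cnf, the password set for the user account must not contain characters that require escaping, such as the ')' in this example. The repl_password set for the repl_user account has no such restriction.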
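
A quick pre-check along these lines can flag a risky password before it goes into the configuration file. This is a sketch under assumptions: password_needs_escaping is a hypothetical helper, and its "safe" character set is a deliberately conservative guess, not a list taken from MHA.

```shell
#!/bin/sh
# Hypothetical check: succeed if the password contains any character outside
# a conservative "safe" set (letters, digits, and _ @ % + = : , . / -).
password_needs_escaping() {
    case $1 in
        *[!A-Za-z0-9_@%+=:,./-]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# password_needs_escaping 'abc)qlk' && echo "choose a different password"
```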

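4. If the slave is built with xtrabackup, note that the events are restored on the slave as well, which can lead to inconsistent data and replication failure.

If the master has events, disable them manually on the slave. For example, if the master has an event that archives and deletes data, that event must be disabled on the slave; otherwise replication stops with an error similar to the following: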

could not execute delete_rows event on table ????db.*****table; can't find record in '*****', error_code: 1032; handler error ha_err_key_not_found; the event's master log first, end_log_pos xxxxxxx
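
One possible way to disable the restored events is to generate ALTER EVENT ... DISABLE statements for them. The sketch below assumes the schema and event names arrive on stdin, for instance from a query against information_schema.EVENTS run on the slave with mysql -N -B. gen_disable_event_sql is a hypothetical helper, and it does not handle names that contain whitespace.

```shell
#!/bin/sh
# Hypothetical generator: read "schema event_name" pairs on stdin and print
# one ALTER EVENT ... DISABLE statement per pair. Nothing is executed.
gen_disable_event_sql() {
    while read -r schema name; do
        # Skip blank or incomplete lines.
        [ -n "$schema" ] && [ -n "$name" ] || continue
        printf 'ALTER EVENT `%s`.`%s` DISABLE;\n' "$schema" "$name"
    done
}

# Example (schema and event names are made up):
# printf 'shop archive_orders\n' | gen_disable_event_sql
```

Review the generated statements before running them on the slave.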

5. After converting from GTID mode to traditional replication, the MHA replication check fails.

Command used for the check:

/usr/local/bin/masterha_check_repl --conf=/etc/mha/qqweixinordb.cnf

Main error output:

can't exec "mysqlbinlog": no such file or directory at /usr/local/share/perl5/mha/binlogmanager.pm line 106.
mysqlbinlog version command failed with rc 1:0, please verify path, ld_library_path, and client options
 at /usr/local/bin/apply_diff_relay_logs line 493.
fri aug 28 04:38:22 2019 - [error][/usr/local/share/perl5/mha/mastermonitor.pm, ln205] slaves settings check failed!
fri aug 28 04:38:22 2019 - [error][/usr/local/share/perl5/mha/mastermonitor.pm, ln413] slave configuration failed.
fri aug 28 04:38:22 2019 - [error][/usr/local/share/perl5/mha/mastermonitor.pm, ln424] error happened on checking configurations.  at /usr/local/bin/masterha_check_repl line 48.
fri aug 28 04:38:22 2019 - [error][/usr/local/share/perl5/mha/mastermonitor.pm, ln523] error happened on monitoring servers.
fri aug 28 04:38:22 2019 - [info] got exit code 1 (not master dead).
 
mysql replication health is not ok!

Solution: run the following commands on every DB node:

ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
 
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql

Run the check again; the errors are gone and the check reports ok.
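
The error above says that mysqlbinlog could not be found, so it can help to confirm which client tools resolve on PATH for the account MHA connects as. The sketch below uses a hypothetical helper, missing_tools, built on the standard command -v lookup. Running it through the same SSH user MHA uses is an assumption about how to reproduce MHA's environment.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example, run on each DB node as the SSH user MHA uses:
# missing_tools mysqlbinlog mysql
```

Empty output means every listed tool was found.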

6. root account password expired

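Passwordless SSH login is set up for the root account, but the root password has an expiry limit. Once the password has expired, the MHA SSH check fails: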

/usr/local/bin/masterha_check_ssh --conf=/etc/mha/qqorder.cnf
thu nov 5 10:09:09 2018 - [warning] global configuration file /etc/masterha_default.cnf not found. skipping.
thu nov 5 10:09:09 2018 - [info] reading application default configuration from /etc/mha/pms20epime.cnf..
thu nov 5 10:09:09 2018 - [info] reading server configuration from /etc/mha/pms20epime.cnf..
thu nov 5 10:09:09 2018 - [info] starting ssh connection tests..
thu nov 5 10:09:09 2018 - [error][/usr/local/share/perl5/mha/sshcheck.pm, ln63]
thu nov 5 10:09:09 2018 - [debug] connecting via ssh from root@172.181.191.191(172.181.191.191:22) to root@172.181.191.192(172.181.191.192:22)..

warning: your password has expired.
password change required but no tty available.
thu nov 5 10:09:09 2018 - [error][/usr/local/share/perl5/mha/sshcheck.pm, ln111] ssh connection from root@172.181.191.191(172.181.191.191:22) to root@172.181.191.192(172.181.191.192:22) failed!
thu nov 5 10:09:10 2018 - [error][/usr/local/share/perl5/mha/sshcheck.pm, ln63]
thu nov 5 10:09:09 2018 - [debug] connecting via ssh from root@172.181.191.192(172.181.191.192:22) to root@172.181.191.191(172.181.191.191:22)..

warning: your password has expired.
password change required but no tty available.
thu nov 5 10:09:10 2018 - [error][/usr/local/share/perl5/mha/sshcheck.pm, ln111] ssh connection from root@172.181.191.192(172.181.191.192:22) to root@172.181.191.191(172.181.191.191:22) failed!
ssh configuration check failed!
 at /usr/local/bin/masterha_check_ssh line 44.

Another symptom is an error the second time an account switch is run with sudo su -

Solution: as root, run the following command:

chage -M 99999 root
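
To see which password ageing limit applies to an account, the maximum-age field of its shadow entry can be read directly. The sketch below relies on the standard shadow file layout, where the fifth colon-separated field is the maximum password age in days. shadow_max_days is a hypothetical helper; chage -l root reports the same information in readable form.

```shell
#!/bin/sh
# Hypothetical helper: read one shadow-format line on stdin and print its
# fifth field, the maximum number of days a password stays valid.
shadow_max_days() {
    awk -F: '{ print $5 }'
}

# Example (needs root):
# getent shadow root | shadow_max_days
```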
