If you run MySQL on a cluster, the MySQL resource may fail to start on the other node after a sudden failover.
This article explains why that happens and how to resolve it.
Environment
- Red Hat Enterprise Linux Server 6 (with the High Availability and Resilient Storage Add Ons)
- Red Hat Enterprise Linux Server 7
- Red Hat Enterprise Linux Server 8
- MySQL
Issue
- The ownership of the mount point of the MySQL data directory changes after failover.
- The MySQL resource is unable to start on the other node after failover from the first node.
Sep 2 16:23:57 node A rgmanager[144383]: [fs] mounting /dev/dm-4 on /var/lib/mysql
Sep 2 16:23:57 node A kernel: EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts:
Sep 2 16:23:57 node A rgmanager[144527]: [mysql] Checking Existence Of File /var/run/cluster/mysql/mysql:DB_name.pid [mysql:DB_name] > Failed
Sep 2 16:23:57 node A rgmanager[144549]: [mysql] Monitoring Service mysql:DB_name > Service Is Not Running
Sep 2 16:23:57 node A rgmanager[144571]: [mysql] Starting Service mysql:DB_name
Sep 2 16:24:27 node A rgmanager[145724]: [mysql] Starting Service mysql:DB_name > Failed - Timeout Error
Sep 2 16:24:27 node A rgmanager[4142]: start on mysql "DB_name" returned 1 (generic error)
Resolution
- Change the UID and GID of the MySQL user so that they are the same on all the cluster nodes.
- Before assigning them, check that the chosen UID and GID are not already used by another user or group on any node. If they are, change that user's UID and GID first, and only then change the MySQL credentials.
- After changing the credentials, fail the service over from one node to another. The service should start without any issue.
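The steps above can be sketched as a small dry-run script. It is only an illustration under stated assumptions: the target values 496 and 491 are hypothetical, the account and group are both assumed to be named mysql, and the helper names (uid_owner, gid_owner, print_realign_plan) are made up for this example. The script prints the commands instead of running them, so you can review them before executing them as root with the MySQL service stopped.

```shell
#!/bin/sh
# Hypothetical target IDs; choose values that are free on every cluster node.
TARGET_UID=496
TARGET_GID=491

# Print the name of the account that holds numeric UID $1 (empty if none).
# "|| true" keeps a missing entry from aborting scripts that use "set -e".
uid_owner() { getent passwd "$1" | cut -d: -f1 || true; }

# Print the name of the group that holds numeric GID $1 (empty if none).
gid_owner() { getent group "$1" | cut -d: -f1 || true; }

# Print, without running them, the commands that would realign the
# mysql account to UID $1 and GID $2 and re-own the data directory.
print_realign_plan() {
    echo "groupmod -g $2 mysql"
    echo "usermod -u $1 -g $2 mysql"
    echo "chown -R mysql:mysql /var/lib/mysql"
}

u=$(uid_owner "$TARGET_UID")
g=$(gid_owner "$TARGET_GID")

# Warn when the chosen IDs already belong to something other than mysql.
if [ -n "$u" ] && [ "$u" != "mysql" ]; then
    echo "UID $TARGET_UID is already used by $u; reassign it first" >&2
fi
if [ -n "$g" ] && [ "$g" != "mysql" ]; then
    echo "GID $TARGET_GID is already used by $g; reassign it first" >&2
fi

print_realign_plan "$TARGET_UID" "$TARGET_GID"
```

Run the check on every node, because an ID that is free on one node may be taken on another. The chown step only makes sense on the node where the shared file system is currently mounted.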
Root Cause
- The user and group credentials of MySQL are not the same on all the cluster nodes.
On node A
# ls -ld /var/lib/mysql/
drwxr-xr-x 5 495 490 4096 Sep 11 00:21 /var/lib/mysql/
# grep -i mysql /etc/passwd
mysql:x:496:491:MySQL server:/var/lib/mysql:/bin/bash
On node B
# ls -ld /var/lib/mysql/
drwxr-xr-x 5 496 491 4096 Sep 11 00:22 /var/lib/mysql/
# grep -i mysql /etc/passwd
mysql:x:495:490:MySQL server:/var/lib/mysql:/bin/bash
- As the output above shows, the UID and GID of the MySQL user on the first node are 496 and 491 respectively, while the UID and GID of the MySQL user on the second node of the cluster are 495 and 490 respectively.
- For mysql to access the data in /var/lib/mysql, which is the default home directory, the directory must be owned by mysql:mysql.
Because the UID and GID of the mysql user differ between the two nodes, when the cluster relocates the mysql resource from one node to the other, it mounts the mysql data on the second node with the UID and GID of the first node. These do not match the mysql credentials on the second node, which is why the mysql resource was not able to start there.
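The mismatch described above can be checked on each node with a short script. This is a minimal sketch, assuming GNU stat (as shipped on RHEL) and the default data directory; the function names path_ids, account_ids and owner_matches, and the DATA_DIR and DB_USER variables, are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Assumed defaults; override through the environment if your setup differs.
DATA_DIR=${DATA_DIR:-/var/lib/mysql}
DB_USER=${DB_USER:-mysql}

# Print the numeric owner of path $1 as UID:GID (requires GNU stat).
path_ids() { stat -c '%u:%g' "$1"; }

# Print the UID:GID that account $1 resolves to on this node.
# Returns non-zero when the account does not exist.
account_ids() {
    u=$(id -u "$1") && g=$(id -g "$1") && echo "$u:$g"
}

# Succeed only when path $1 is owned by exactly the IDs of account $2.
# Returns 2 when the path or the account cannot be looked up.
owner_matches() {
    p=$(path_ids "$1") || return 2
    a=$(account_ids "$2") || return 2
    [ "$p" = "$a" ]
}

if owner_matches "$DATA_DIR" "$DB_USER"; then
    echo "OK: $DATA_DIR is owned by the IDs of $DB_USER on this node"
else
    echo "MISMATCH: $DATA_DIR is not owned by the IDs of $DB_USER on this node"
fi
```

Run it on the node where the data directory is currently mounted, then relocate the service and run it again on the other node. Once the UID and GID are aligned, both runs should report OK.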