Esta mañá produciuse un fallo hardware nun dos servidores de laboratorio con OVM Server 3.4. Neste contorno, OVM Manager está despregado dentro do propio clúster OVM, e a incidencia provocou a súa caída. O arrincar a máquina de xeito manual dende o servidor OVM co comando xm create, o SO arrincou correctamente, pero non os compoñentes OVM. O motivo? Corrupción de datos na BD MySQL tras a caída abrupta.
180123 17:53:24 mysqld_safe Starting mysqld daemon with databases from /u01/app/oracle/mysql/data 2018-01-23 17:53:24 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release. 2018-01-23 17:53:24 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). 2018-01-23 17:53:24 0 [Note] /usr/sbin/mysqld (mysqld 5.6.32-enterprise-commercial-advanced) starting as process 3816 ... 2018-01-23 17:53:24 3816 [Note] Plugin 'FEDERATED' is disabled. 2018-01-23 17:53:24 3816 [Note] InnoDB: Using atomics to ref count buffer pool pages 2018-01-23 17:53:24 3816 [Note] InnoDB: The InnoDB memory heap is disabled 2018-01-23 17:53:24 3816 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2018-01-23 17:53:24 3816 [Note] InnoDB: Memory barrier is not used 2018-01-23 17:53:24 3816 [Note] InnoDB: Compressed tables use zlib 1.2.3 2018-01-23 17:53:24 3816 [Note] InnoDB: Using Linux native AIO 2018-01-23 17:53:24 3816 [Note] InnoDB: Using CPU crc32 instructions 2018-01-23 17:53:24 3816 [Note] InnoDB: Initializing buffer pool, size = 1.0G 2018-01-23 17:53:24 3816 [Note] InnoDB: Completed initialization of buffer pool 2018-01-23 17:53:24 3816 [Note] InnoDB: Highest supported file format is Barracuda. 2018-01-23 17:53:24 3816 [Note] InnoDB: The log sequence numbers 1484386085656 and 1484386085656 in ibdata files do not match the log sequence number 1511364890742 in the ib_logfiles! 2018-01-23 17:53:24 3816 [Note] InnoDB: Database was not shutdown normally! 2018-01-23 17:53:24 3816 [Note] InnoDB: Starting crash recovery. 2018-01-23 17:53:24 3816 [Note] InnoDB: Reading tablespace information from the .ibd files... 2018-01-23 17:53:26 3816 [Note] InnoDB: Restoring possible half-written data pages 2018-01-23 17:53:26 3816 [Note] InnoDB: from the doublewrite buffer... 2018-01-23 17:53:26 3816 [ERROR] InnoDB: We scanned the log up to 1511364890624. A checkpoint was at 1511364890742 and the maximum LSN on a database page was 0. It is possible that the database is now corrupt! 2018-01-23 17:53:26 7f451f3e8720 InnoDB: Error: page 5 log sequence number 1511364907354 InnoDB: is in the future! Current system log sequence number 1511364890742. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. (...) InnoDB: 3 transaction(s) which must be rolled back or cleaned up InnoDB: in total 0 row operations to undo InnoDB: Trx id counter is 1018590976 InnoDB: Cleaning up trx with id 1018590873 2018-01-23 17:53:27 7f451f3e8720 InnoDB: Assertion failure in thread 139934853662496 in file trx0trx.cc line 292 InnoDB: Failing assertion: trx->update_undo == NULL InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 16:53:27 UTC - mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=33554432 read_buffer_size=2097152 max_used_connections=0 max_threads=151 thread_count=0 connection_count=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 653196 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0 thread_stack 0x40000 /usr/sbin/mysqld(my_print_stacktrace+0x35)[0x918445] /usr/sbin/mysqld(handle_fatal_signal+0x494)[0x69bdd4] /lib64/libpthread.so.0(+0xf7e0)[0x7f451efc97e0] /lib64/libc.so.6(gsignal+0x35)[0x7f451da695e5] /lib64/libc.so.6(abort+0x175)[0x7f451da6adc5] /usr/sbin/mysqld[0xa48492] /usr/sbin/mysqld[0xa41758] /usr/sbin/mysqld[0x9b4488] /usr/sbin/mysqld[0xa2f975] /usr/sbin/mysqld[0x973e0d] /usr/sbin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x48)[0x5df848] /usr/sbin/mysqld[0x725cb1] /usr/sbin/mysqld(_Z11plugin_initPiPPci+0xbb6)[0x729b16] /usr/sbin/mysqld[0x5d21c8] /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x3e5)[0x5d6de5] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f451da55d1d] /usr/sbin/mysqld[0x5c83a5] The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. 180123 17:53:27 mysqld_safe mysqld from pid file /u01/app/oracle/mysql/data/mysqld.pid ended
Ten mala pinta, pero aínda que a primeira opción pareza ser a de entrar en MySQL para recuperar a situación, dado que non se produciron cambios na configuración deste clúster, optamos pola vía rápida e integrada en OVM, a ferramenta de restauración MySQL de OVMM. Para elo, basta seguir a documentación clara de Oracle que podemos ver aquí. No noso caso, só queremos recuperar a nivel de BD, polo que non precisamos executar o paso número 1.
OVMM xera de xeito automático e diariamente un backup de MySQL na ruta /u01/app/oracle/mysql/backup (OVM 3.4). Vemos o contido no noso contorno:
[oracle@ovmmanager ~]$ ls -latr /u01/app/oracle/mysql/dbbackup/ total 2592 drwx------ 4 oracle oinstall 4096 Apr 1 2015 3.3.1_preUpgradeBackup-20150401_114910 drwx------ 4 oracle oinstall 4096 Mar 10 2016 3.3.2_preUpgradeBackup-20160310_145455 drwx------ 4 oracle oinstall 4096 Mar 28 2016 3.3.3_preUpgradeBackup-20160328_133313 drwx------ 4 oracle oinstall 4096 Dec 29 2016 3.4.1_preUpgradeBackup-20161229_195435 drwx------ 4 oracle oinstall 4096 Sep 10 19:24 AutoFullBackup-20170910_192222 drwx------ 4 oracle oinstall 4096 Sep 11 19:25 AutoFullBackup-20170911_192432 drwx------ 4 oracle oinstall 4096 Sep 12 19:27 AutoFullBackup-20170912_192620 drwx------ 4 oracle oinstall 4096 Sep 13 19:28 AutoFullBackup-20170913_192733 drwxr-xr-x 5 oracle dba 4096 Jan 5 12:18 .. drwx------ 4 oracle oinstall 4096 Jan 6 12:38 AutoFullBackup-20180106_123729 drwx------ 4 oracle oinstall 4096 Jan 7 12:39 AutoFullBackup-20180107_123822 drwx------ 4 oracle oinstall 4096 Jan 8 12:40 AutoFullBackup-20180108_123931 drwx------ 4 oracle oinstall 4096 Jan 9 12:41 AutoFullBackup-20180109_124037 drwx------ 4 oracle oinstall 4096 Jan 10 12:42 AutoFullBackup-20180110_124144 drwx------ 4 oracle oinstall 4096 Jan 11 16:52 AutoFullBackup-20180111_165134 drwx------ 4 oracle oinstall 4096 Jan 12 16:53 AutoFullBackup-20180112_165229 drwx------ 4 oracle oinstall 4096 Jan 13 16:54 AutoFullBackup-20180113_165327 -rw-r----- 1 oracle oinstall 251602 Jan 13 17:21 OVMModelExport-20180113_172134.xml drwx------ 4 oracle oinstall 4096 Jan 14 16:55 AutoFullBackup-20180114_165436 -rw-r----- 1 oracle oinstall 254527 Jan 14 17:21 OVMModelExport-20180114_172136.xml drwx------ 4 oracle oinstall 4096 Jan 15 16:56 AutoFullBackup-20180115_165545 -rw-r----- 1 oracle oinstall 257452 Jan 15 17:21 OVMModelExport-20180115_172138.xml drwx------ 4 oracle oinstall 4096 Jan 16 16:57 AutoFullBackup-20180116_165655 -rw-r----- 1 oracle oinstall 260363 Jan 16 17:21 OVMModelExport-20180116_172141.xml drwx------ 4 oracle oinstall 4096 Jan 17 16:59 AutoFullBackup-20180117_165804 -rw-r----- 1 oracle oinstall 263280 Jan 17 17:21 OVMModelExport-20180117_172144.xml drwx------ 4 oracle oinstall 4096 Jan 18 17:00 AutoFullBackup-20180118_165918 -rw-r----- 1 oracle oinstall 241287 Jan 18 17:21 OVMModelExport-20180118_172147.xml drwx------ 4 oracle oinstall 4096 Jan 19 17:01 AutoFullBackup-20180119_170019 -rw-r----- 1 oracle oinstall 244200 Jan 19 17:21 OVMModelExport-20180119_172149.xml drwx------ 4 oracle oinstall 4096 Jan 20 17:02 AutoFullBackup-20180120_170140 -rw-r----- 1 oracle oinstall 247125 Jan 20 17:21 OVMModelExport-20180120_172151.xml drwx------ 4 oracle oinstall 4096 Jan 21 17:03 AutoFullBackup-20180121_170254 -rw-r----- 1 oracle oinstall 250050 Jan 21 17:21 OVMModelExport-20180121_172154.xml drwx------ 4 oracle oinstall 4096 Jan 22 17:05 AutoFullBackup-20180122_170402 -rw-r----- 1 oracle oinstall 250050 Jan 22 17:21 OVMModelExport-20180122_172156.xml drwxr-xr-x 27 oracle dba 4096 Jan 22 17:21 .
A última copia, de onte, é AutoFullBackup-20180122_170402. Con esta información, en primeiro lugar paramos todo OVMM (Weblogic OVM Server, Oracle VM Command Line Interface and MySQL database):
[root@ovmmanager ~]# service ovmm stop Stopping Oracle VM Manager [ OK ] [root@ovmmanager ~]# service ovmcli stop Stopping Oracle VM Manager CLI [ OK ] [root@ovmmanager ~]# service ovmm_mysql stop Shutting down OVMM MySQL.. SUCCESS!
Agora, como usuario oracle lanzamos o shell script RestoreDatabase.sh especificando o backup dende o que queremos facer a recuperación.
[oracle@ovmmanager ~]$ /u01/app/oracle/ovm-manager-3/ovm_tools/bin/RestoreDatabase.sh AutoFullBackup-20180122_170402 Before the database can be restored, the following database directories/files must be deleted: appfw ibdata1 ib_logfile0 ib_logfile1 mysql ovs performance_schema Are you sure it is safe to delete these directories/files now? [y,n] y Deleting /u01/app/oracle/mysql/data/appfw Deleting /u01/app/oracle/mysql/data/ibdata1 Deleting /u01/app/oracle/mysql/data/ib_logfile0 Deleting /u01/app/oracle/mysql/data/ib_logfile1 Deleting /u01/app/oracle/mysql/data/mysql Deleting /u01/app/oracle/mysql/data/ovs Deleting /u01/app/oracle/mysql/data/performance_schema INFO: Expanding the backup image... INFO: Applying logs to the backup snapshot... INFO: Restoring the backup... INFO: Restoring OVM keystores and certificates INFO: Success - Done! INFO: Log of operations performed is available at: /u01/app/oracle/mysql/dbbackup/AutoFullBackup-20180122_170402/Restore.log IMPORTANT: As 'root', please start the OVM Manager database and application using: service ovmm_mysql start; service ovmm start; service ovmcli start [oracle@ovmmanager ~]$
Arrincamos de novo os servizos:
[root@ovmmanager ~]# service ovmm_mysql start; Starting OVMM MySQL...... SUCCESS! [root@ovmmanager ~]# service ovmm start Starting Oracle VM Manager [ OK ] [root@ovmmanager ~]# service ovmcli start Starting Oracle VM Manager CLI [ OK ]
Todo arrinca OK, incluída a BD. Recuperamos a normalidade!
180123 18:03:50 mysqld_safe Starting mysqld daemon with databases from /u01/app/oracle/mysql/data 2018-01-23 18:03:50 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release. 2018-01-23 18:03:50 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). 2018-01-23 18:03:50 0 [Note] /usr/sbin/mysqld (mysqld 5.6.32-enterprise-commercial-advanced) starting as process 4526 ... 2018-01-23 18:03:50 4526 [Note] Plugin 'FEDERATED' is disabled. 2018-01-23 18:03:51 4526 [Note] InnoDB: Using atomics to ref count buffer pool pages 2018-01-23 18:03:51 4526 [Note] InnoDB: The InnoDB memory heap is disabled 2018-01-23 18:03:51 4526 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2018-01-23 18:03:51 4526 [Note] InnoDB: Memory barrier is not used 2018-01-23 18:03:51 4526 [Note] InnoDB: Compressed tables use zlib 1.2.3 2018-01-23 18:03:51 4526 [Note] InnoDB: Using Linux native AIO 2018-01-23 18:03:51 4526 [Note] InnoDB: Using CPU crc32 instructions 2018-01-23 18:03:51 4526 [Note] InnoDB: Initializing buffer pool, size = 1.0G 2018-01-23 18:03:51 4526 [Note] InnoDB: Completed initialization of buffer pool 2018-01-23 18:03:51 4526 [Note] InnoDB: Highest supported file format is Barracuda. 2018-01-23 18:03:51 4526 [Note] InnoDB: The log file was created by mysqlbackup --apply-log at ibbackup 180123 17:59:17. The following crash recovery is part of a normal restore. 2018-01-23 18:03:51 4526 [Note] InnoDB: The log sequence numbers 1484386085656 and 1484386085656 in ibdata files do not match the log sequence number 1511045530636 in the ib_logfiles! 2018-01-23 18:03:51 4526 [Note] InnoDB: Database was not shutdown normally! 2018-01-23 18:03:51 4526 [Note] InnoDB: Starting crash recovery. 2018-01-23 18:03:51 4526 [Note] InnoDB: Reading tablespace information from the .ibd files... 2018-01-23 18:03:54 4526 [Note] InnoDB: Restoring possible half-written data pages 2018-01-23 18:03:54 4526 [Note] InnoDB: from the doublewrite buffer... 2018-01-23 18:03:55 4526 [Note] InnoDB: 128 rollback segment(s) are active. 2018-01-23 18:03:55 4526 [Note] InnoDB: Waiting for purge to start 2018-01-23 18:03:55 4526 [Note] InnoDB: 5.6.32 started; log sequence number 1511045530636 2018-01-23 18:03:55 4526 [Note] RSA private key file not found: /u01/app/oracle/mysql/data//private_key.pem. Some authentication plugins will not work. 2018-01-23 18:03:55 4526 [Note] RSA public key file not found: /u01/app/oracle/mysql/data//public_key.pem. Some authentication plugins will not work. 2018-01-23 18:03:55 4526 [Note] Server hostname (bind-address): '*'; port: 49500 2018-01-23 18:03:55 4526 [Note] IPv6 is available. 2018-01-23 18:03:55 4526 [Note] - '::' resolves to '::'; 2018-01-23 18:03:55 4526 [Note] Server socket created on IP: '::'. 2018-01-23 18:03:55 4526 [Note] Event Scheduler: Loaded 0 events 2018-01-23 18:03:55 4526 [Note] /usr/sbin/mysqld: ready for connections. Version: '5.6.32-enterprise-commercial-advanced' socket: '/u01/app/oracle/mysql/data/mysqld.sock' port: 49500 MySQL Enterprise Server - Advanced Edition (Commercial)