Esta mañá produciuse un fallo hardware nun dos servidores de laboratorio con OVM Server 3.4. Neste contorno, OVM Manager está despregado dentro do propio clúster OVM, e a incidencia provocou a súa caída. O arrincar a máquina de xeito manual dende o servidor OVM co comando xm create, o SO arrincou correctamente, pero non os compoñentes OVM. O motivo? Corrupción de datos na BD MySQL tras a caída abrupta.

180123 17:53:24 mysqld_safe Starting mysqld daemon with databases from /u01/app/oracle/mysql/data
2018-01-23 17:53:24 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.
2018-01-23 17:53:24 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2018-01-23 17:53:24 0 [Note] /usr/sbin/mysqld (mysqld 5.6.32-enterprise-commercial-advanced) starting as process 3816 ...
2018-01-23 17:53:24 3816 [Note] Plugin 'FEDERATED' is disabled.
2018-01-23 17:53:24 3816 [Note] InnoDB: Using atomics to ref count buffer pool pages
2018-01-23 17:53:24 3816 [Note] InnoDB: The InnoDB memory heap is disabled
2018-01-23 17:53:24 3816 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-01-23 17:53:24 3816 [Note] InnoDB: Memory barrier is not used
2018-01-23 17:53:24 3816 [Note] InnoDB: Compressed tables use zlib 1.2.3
2018-01-23 17:53:24 3816 [Note] InnoDB: Using Linux native AIO
2018-01-23 17:53:24 3816 [Note] InnoDB: Using CPU crc32 instructions
2018-01-23 17:53:24 3816 [Note] InnoDB: Initializing buffer pool, size = 1.0G
2018-01-23 17:53:24 3816 [Note] InnoDB: Completed initialization of buffer pool
2018-01-23 17:53:24 3816 [Note] InnoDB: Highest supported file format is Barracuda.
2018-01-23 17:53:24 3816 [Note] InnoDB: The log sequence numbers 1484386085656 and 1484386085656 in ibdata files do not match the log sequence number 1511364890742 in the ib_logfiles!
2018-01-23 17:53:24 3816 [Note] InnoDB: Database was not shutdown normally!
2018-01-23 17:53:24 3816 [Note] InnoDB: Starting crash recovery.
2018-01-23 17:53:24 3816 [Note] InnoDB: Reading tablespace information from the .ibd files...
2018-01-23 17:53:26 3816 [Note] InnoDB: Restoring possible half-written data pages 
2018-01-23 17:53:26 3816 [Note] InnoDB: from the doublewrite buffer...
2018-01-23 17:53:26 3816 [ERROR] InnoDB: We scanned the log up to 1511364890624. A checkpoint was at 1511364890742 and the maximum LSN on a database page was 0. It is possible that the database is now corrupt!
2018-01-23 17:53:26 7f451f3e8720 InnoDB: Error: page 5 log sequence number 1511364907354
InnoDB: is in the future! Current system log sequence number 1511364890742.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: for more information.
(...)
InnoDB: 3 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 0 row operations to undo
InnoDB: Trx id counter is 1018590976
InnoDB: Cleaning up trx with id 1018590873
2018-01-23 17:53:27 7f451f3e8720 InnoDB: Assertion failure in thread 139934853662496 in file trx0trx.cc line 292
InnoDB: Failing assertion: trx->update_undo == NULL
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
16:53:27 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=33554432
read_buffer_size=2097152
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 653196 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x918445]
/usr/sbin/mysqld(handle_fatal_signal+0x494)[0x69bdd4]
/lib64/libpthread.so.0(+0xf7e0)[0x7f451efc97e0]
/lib64/libc.so.6(gsignal+0x35)[0x7f451da695e5]
/lib64/libc.so.6(abort+0x175)[0x7f451da6adc5]
/usr/sbin/mysqld[0xa48492]
/usr/sbin/mysqld[0xa41758]
/usr/sbin/mysqld[0x9b4488]
/usr/sbin/mysqld[0xa2f975]
/usr/sbin/mysqld[0x973e0d]
/usr/sbin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x48)[0x5df848]
/usr/sbin/mysqld[0x725cb1]
/usr/sbin/mysqld(_Z11plugin_initPiPPci+0xbb6)[0x729b16]
/usr/sbin/mysqld[0x5d21c8]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x3e5)[0x5d6de5]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f451da55d1d]
/usr/sbin/mysqld[0x5c83a5]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
180123 17:53:27 mysqld_safe mysqld from pid file /u01/app/oracle/mysql/data/mysqld.pid ended

Ten mala pinta, pero aínda que a primeira opción pareza ser a de entrar en MySQL para recuperar a situación, dado que non se produciron cambios na configuración deste clúster, optamos pola vía rápida e integrada en OVM, a ferramenta de restauración MySQL de OVMM. Para elo, basta seguir a documentación clara de Oracle que podemos ver aquí. No noso caso, só queremos recuperar a nivel de BD, polo que non precisamos executar o paso número 1.

OVMM xera de xeito automático e diariamente un backup de MySQL na ruta /u01/app/oracle/mysql/backup (OVM 3.4). Vemos o contido no noso contorno:

[oracle@ovmmanager ~]$ ls -latr /u01/app/oracle/mysql/dbbackup/
total 2592
drwx------  4 oracle oinstall   4096 Apr  1  2015 3.3.1_preUpgradeBackup-20150401_114910
drwx------  4 oracle oinstall   4096 Mar 10  2016 3.3.2_preUpgradeBackup-20160310_145455
drwx------  4 oracle oinstall   4096 Mar 28  2016 3.3.3_preUpgradeBackup-20160328_133313
drwx------  4 oracle oinstall   4096 Dec 29  2016 3.4.1_preUpgradeBackup-20161229_195435
drwx------  4 oracle oinstall   4096 Sep 10 19:24 AutoFullBackup-20170910_192222
drwx------  4 oracle oinstall   4096 Sep 11 19:25 AutoFullBackup-20170911_192432
drwx------  4 oracle oinstall   4096 Sep 12 19:27 AutoFullBackup-20170912_192620
drwx------  4 oracle oinstall   4096 Sep 13 19:28 AutoFullBackup-20170913_192733
drwxr-xr-x  5 oracle dba        4096 Jan  5 12:18 ..
drwx------  4 oracle oinstall   4096 Jan  6 12:38 AutoFullBackup-20180106_123729
drwx------  4 oracle oinstall   4096 Jan  7 12:39 AutoFullBackup-20180107_123822
drwx------  4 oracle oinstall   4096 Jan  8 12:40 AutoFullBackup-20180108_123931
drwx------  4 oracle oinstall   4096 Jan  9 12:41 AutoFullBackup-20180109_124037
drwx------  4 oracle oinstall   4096 Jan 10 12:42 AutoFullBackup-20180110_124144
drwx------  4 oracle oinstall   4096 Jan 11 16:52 AutoFullBackup-20180111_165134
drwx------  4 oracle oinstall   4096 Jan 12 16:53 AutoFullBackup-20180112_165229
drwx------  4 oracle oinstall   4096 Jan 13 16:54 AutoFullBackup-20180113_165327
-rw-r-----  1 oracle oinstall 251602 Jan 13 17:21 OVMModelExport-20180113_172134.xml
drwx------  4 oracle oinstall   4096 Jan 14 16:55 AutoFullBackup-20180114_165436
-rw-r-----  1 oracle oinstall 254527 Jan 14 17:21 OVMModelExport-20180114_172136.xml
drwx------  4 oracle oinstall   4096 Jan 15 16:56 AutoFullBackup-20180115_165545
-rw-r-----  1 oracle oinstall 257452 Jan 15 17:21 OVMModelExport-20180115_172138.xml
drwx------  4 oracle oinstall   4096 Jan 16 16:57 AutoFullBackup-20180116_165655
-rw-r-----  1 oracle oinstall 260363 Jan 16 17:21 OVMModelExport-20180116_172141.xml
drwx------  4 oracle oinstall   4096 Jan 17 16:59 AutoFullBackup-20180117_165804
-rw-r-----  1 oracle oinstall 263280 Jan 17 17:21 OVMModelExport-20180117_172144.xml
drwx------  4 oracle oinstall   4096 Jan 18 17:00 AutoFullBackup-20180118_165918
-rw-r-----  1 oracle oinstall 241287 Jan 18 17:21 OVMModelExport-20180118_172147.xml
drwx------  4 oracle oinstall   4096 Jan 19 17:01 AutoFullBackup-20180119_170019
-rw-r-----  1 oracle oinstall 244200 Jan 19 17:21 OVMModelExport-20180119_172149.xml
drwx------  4 oracle oinstall   4096 Jan 20 17:02 AutoFullBackup-20180120_170140
-rw-r-----  1 oracle oinstall 247125 Jan 20 17:21 OVMModelExport-20180120_172151.xml
drwx------  4 oracle oinstall   4096 Jan 21 17:03 AutoFullBackup-20180121_170254
-rw-r-----  1 oracle oinstall 250050 Jan 21 17:21 OVMModelExport-20180121_172154.xml
drwx------  4 oracle oinstall   4096 Jan 22 17:05 AutoFullBackup-20180122_170402
-rw-r-----  1 oracle oinstall 250050 Jan 22 17:21 OVMModelExport-20180122_172156.xml
drwxr-xr-x 27 oracle dba        4096 Jan 22 17:21 .

A última copia, de onte, é AutoFullBackup-20180122_170402. Con esta información, en primeiro lugar paramos todo OVMM (Weblogic OVM Server, Oracle VM Command Line Interface and MySQL database):

[root@ovmmanager ~]# service ovmm stop
Stopping Oracle VM Manager                                 [  OK  ]

[root@ovmmanager ~]# service ovmcli stop
Stopping Oracle VM Manager CLI                             [  OK  ]

[root@ovmmanager ~]# service ovmm_mysql stop
Shutting down OVMM MySQL.. SUCCESS!

Agora, como usuario oracle lanzamos o shell script RestoreDatabase.sh especificando o backup dende o que queremos facer a recuperación.

[oracle@ovmmanager ~]$ /u01/app/oracle/ovm-manager-3/ovm_tools/bin/RestoreDatabase.sh AutoFullBackup-20180122_170402

Before the database can be restored, the following database directories/files must be deleted:
appfw ibdata1 ib_logfile0 ib_logfile1 mysql ovs performance_schema

Are you sure it is safe to delete these directories/files now? [y,n] y
Deleting /u01/app/oracle/mysql/data/appfw
Deleting /u01/app/oracle/mysql/data/ibdata1
Deleting /u01/app/oracle/mysql/data/ib_logfile0
Deleting /u01/app/oracle/mysql/data/ib_logfile1
Deleting /u01/app/oracle/mysql/data/mysql
Deleting /u01/app/oracle/mysql/data/ovs
Deleting /u01/app/oracle/mysql/data/performance_schema
INFO: Expanding the backup image...
INFO: Applying logs to the backup snapshot...
INFO: Restoring the backup...
INFO: Restoring OVM keystores and certificates
INFO: Success - Done!
INFO: Log of operations performed is available at: /u01/app/oracle/mysql/dbbackup/AutoFullBackup-20180122_170402/Restore.log


IMPORTANT:

      As 'root', please start the OVM Manager database and application using:
            service ovmm_mysql start; service ovmm start; service ovmcli start


[oracle@ovmmanager ~]$

Arrincamos de novo os servizos:

[root@ovmmanager ~]# service ovmm_mysql start;
Starting OVMM MySQL...... SUCCESS! 

[root@ovmmanager ~]# service ovmm start
Starting Oracle VM Manager                                 [  OK  ]

[root@ovmmanager ~]# service ovmcli start
Starting Oracle VM Manager CLI                             [  OK  ]

Todo arrinca OK, incluída a BD. Recuperamos a normalidade!

180123 18:03:50 mysqld_safe Starting mysqld daemon with databases from /u01/app/oracle/mysql/data
2018-01-23 18:03:50 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.
2018-01-23 18:03:50 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2018-01-23 18:03:50 0 [Note] /usr/sbin/mysqld (mysqld 5.6.32-enterprise-commercial-advanced) starting as process 4526 ...
2018-01-23 18:03:50 4526 [Note] Plugin 'FEDERATED' is disabled.
2018-01-23 18:03:51 4526 [Note] InnoDB: Using atomics to ref count buffer pool pages
2018-01-23 18:03:51 4526 [Note] InnoDB: The InnoDB memory heap is disabled
2018-01-23 18:03:51 4526 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-01-23 18:03:51 4526 [Note] InnoDB: Memory barrier is not used
2018-01-23 18:03:51 4526 [Note] InnoDB: Compressed tables use zlib 1.2.3
2018-01-23 18:03:51 4526 [Note] InnoDB: Using Linux native AIO
2018-01-23 18:03:51 4526 [Note] InnoDB: Using CPU crc32 instructions
2018-01-23 18:03:51 4526 [Note] InnoDB: Initializing buffer pool, size = 1.0G
2018-01-23 18:03:51 4526 [Note] InnoDB: Completed initialization of buffer pool
2018-01-23 18:03:51 4526 [Note] InnoDB: Highest supported file format is Barracuda.
2018-01-23 18:03:51 4526 [Note] InnoDB: The log file was created by mysqlbackup --apply-log at ibbackup 180123 17:59:17. The following crash recovery is part of a normal restore.
2018-01-23 18:03:51 4526 [Note] InnoDB: The log sequence numbers 1484386085656 and 1484386085656 in ibdata files do not match the log sequence number 1511045530636 in the ib_logfiles!
2018-01-23 18:03:51 4526 [Note] InnoDB: Database was not shutdown normally!
2018-01-23 18:03:51 4526 [Note] InnoDB: Starting crash recovery.
2018-01-23 18:03:51 4526 [Note] InnoDB: Reading tablespace information from the .ibd files...
2018-01-23 18:03:54 4526 [Note] InnoDB: Restoring possible half-written data pages 
2018-01-23 18:03:54 4526 [Note] InnoDB: from the doublewrite buffer...
2018-01-23 18:03:55 4526 [Note] InnoDB: 128 rollback segment(s) are active.
2018-01-23 18:03:55 4526 [Note] InnoDB: Waiting for purge to start
2018-01-23 18:03:55 4526 [Note] InnoDB: 5.6.32 started; log sequence number 1511045530636
2018-01-23 18:03:55 4526 [Note] RSA private key file not found: /u01/app/oracle/mysql/data//private_key.pem. Some authentication plugins will not work.
2018-01-23 18:03:55 4526 [Note] RSA public key file not found: /u01/app/oracle/mysql/data//public_key.pem. Some authentication plugins will not work.
2018-01-23 18:03:55 4526 [Note] Server hostname (bind-address): '*'; port: 49500
2018-01-23 18:03:55 4526 [Note] IPv6 is available.
2018-01-23 18:03:55 4526 [Note]   - '::' resolves to '::';
2018-01-23 18:03:55 4526 [Note] Server socket created on IP: '::'.
2018-01-23 18:03:55 4526 [Note] Event Scheduler: Loaded 0 events
2018-01-23 18:03:55 4526 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.32-enterprise-commercial-advanced'  socket: '/u01/app/oracle/mysql/data/mysqld.sock'  port: 49500  MySQL Enterprise Server - Advanced Edition (Commercial)