randgen team mailing list archive

Re: A question about the RecoveryConsistency Validator

To: randgen@xxxxxxxxxxxxxxxxxxx
From: Philip Stoev <pstoev@xxxxxxxxx>
Date: Fri, 21 May 2010 14:40:51 +0300
Reply-to: Philip Stoev <pstoev@xxxxxxxxx>
Sender: Philip.Stoev@xxxxxxx

Patrick,

I will do my best to expand this area of the documentation in the coming days, but in the meantime, here is some short info.

The Recovery Reporter crashes the server, attempts recovery, and then issues CHECK|ANALYZE|OPTIMIZE|REPAIR TABLE against each table on the server in order to check for any corruption. It also issues SELECTs against each table, using various FORCE INDEX hints so that the table and its indexes are read in various ways.

If CHECK|ANALYZE|OPTIMIZE|REPAIR reports an error, or if the different SELECTs are not consistent with one another, a recovery failure is reported. Those methods work regardless of the structure of the tables, the actual data, or the pre-crash workload; however, they may not catch all issues. Imagine a storage engine for which all of CHECK|ANALYZE|OPTIMIZE|REPAIR are missing and wired to return "Unsupported", and which deletes all of its data on recovery. All SELECTs will report a consistently empty table, so recovery will be reported as successful.
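To make the blind spot concrete, here is a minimal Python sketch of the "are the SELECTs consistent with one another" comparison. It is not the Reporter's actual code; the function name `selects_consistent` and the sample rows are made up for illustration, and the result sets are plain lists standing in for what the same SELECT would return under different FORCE INDEX hints.

```python
from collections import Counter

def selects_consistent(result_sets):
    """Return True if every access path returned the same multiset of rows.

    result_sets maps a (hypothetical) index name to the rows read through it.
    Row order is ignored, since different indexes return rows in different order.
    """
    as_multisets = [Counter(map(tuple, rows)) for rows in result_sets.values()]
    return all(m == as_multisets[0] for m in as_multisets[1:])

# Same rows through both access paths: consistent.
healthy = {
    "PRIMARY": [(1, 10), (2, 20), (3, 30)],
    "int_key": [(3, 30), (1, 10), (2, 20)],
}

# The secondary index lost a row during recovery: inconsistent.
corrupt = {
    "PRIMARY": [(1, 10), (2, 20), (3, 30)],
    "int_key": [(1, 10), (2, 20)],
}

# The engine deleted everything on recovery: empty, but still consistent.
wiped = {"PRIMARY": [], "int_key": []}

for name, sets in (("healthy", healthy), ("corrupt", corrupt), ("wiped", wiped)):
    print(name, selects_consistent(sets))
```

The `wiped` case passes the comparison even though all data is gone, which is exactly the situation this kind of check cannot detect.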

To cover for this eventuality, the RecoveryConsistency Reporter uses a different mechanism. Upon recovery, it performs the following query:


SELECT (SUM(`int_key`)  + SUM(`int`)) / COUNT(*) FROM `$table`

and reports a failure if the result of this query, that is, the combined sum of the int_key and int columns divided by the number of rows, is not 200.
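As a rough illustration of that check, the following Python sketch runs the same query against an in-memory SQLite table. SQLite is only a stand-in for the server under test; the table name `t`, the helper names, and the sample rows are assumptions. The `* 1.0` is added only because SQLite would otherwise truncate the integer division, which MySQL's `/` operator does not do.

```python
import sqlite3

EXPECTED_AVERAGE = 200  # the invariant value stated above

def make_table(rows):
    """Create a hypothetical table `t` holding (int_key, int) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE `t` (`pk` INTEGER PRIMARY KEY, `int_key` INTEGER, `int` INTEGER)"
    )
    conn.executemany("INSERT INTO `t` (`int_key`, `int`) VALUES (?, ?)", rows)
    conn.commit()
    return conn

def average_is_intact(conn, table):
    """Run the invariant query and compare the result with 200."""
    query = (
        f"SELECT (SUM(`int_key`) + SUM(`int`)) * 1.0 / COUNT(*) FROM `{table}`"
    )
    (average,) = conn.execute(query).fetchone()
    # An empty table yields NULL (None here), which also counts as a failure.
    return average is not None and average == EXPECTED_AVERAGE

intact = make_table([(100, 100), (150, 50), (30, 170)])
damaged = make_table([(100, 100), (150, 50), (30, 160)])
empty = make_table([])

for name, conn in (("intact", intact), ("damaged", damaged), ("empty", empty)):
    print(name, average_is_intact(conn, "t"))
```

Note that the `empty` case fails here, so an engine that deletes all of its data on recovery would be caught by this check.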

This requires a grammar that performs various invariant transactions that move data around but maintain the average of the entire table at 200. If a crash happens and recovery does not consistently recover or roll back entire transactions, the average will be off and this will be reported. One grammar that maintains this invariant is transactions/transactions.yy, combined with its respective ZZ file. One thing that could be improved is to issue the SELECT numerous times using FORCE INDEX, in order to make sure that the table remains consistent regardless of how the data is read from it.
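The invariant idea can be sketched in a few lines of Python. This is not what transactions.yy actually generates; the starting values of 100 per column, the `move` helper, and the `crash_midway` flag are assumptions chosen so that the average starts at 200. SQLite again stands in for the real server, and the "crash" is simulated by making only the first half of a transaction durable.

```python
import sqlite3

def make_table(n_rows=4):
    """Create a table whose rows each start at int_key = 100, int = 100."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, int_key INTEGER, `int` INTEGER)")
    conn.executemany(
        "INSERT INTO t (pk, int_key, `int`) VALUES (?, 100, 100)",
        [(pk,) for pk in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn

def move(conn, src, dst, amount, crash_midway=False):
    """Move `amount` from row src's int_key to row dst's int in one transaction."""
    conn.execute("UPDATE t SET int_key = int_key - ? WHERE pk = ?", (amount, src))
    if crash_midway:
        # Simulates broken recovery: only the first half of the transaction survives.
        conn.commit()
        return
    conn.execute("UPDATE t SET `int` = `int` + ? WHERE pk = ?", (amount, dst))
    conn.commit()

def average(conn):
    """The invariant query; * 1.0 avoids SQLite's integer division."""
    query = "SELECT (SUM(int_key) + SUM(`int`)) * 1.0 / COUNT(*) FROM t"
    return conn.execute(query).fetchone()[0]

conn = make_table()
move(conn, src=1, dst=2, amount=30)
print("after a complete transaction:", average(conn))

move(conn, src=3, dst=4, amount=50, crash_midway=True)
print("after a half-applied transaction:", average(conn))
```

A complete transaction only shifts value between rows, so the average is unchanged; a half-applied one removes value without adding it back elsewhere, and the average moves away from 200.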

The two Reporters validate the Durability of ACID to a large extent, protecting against data corruption and incompletely written or recovered transactions. One hole that remains open, however, is a storage engine that loses entire transactions: in this case, the database remains consistent, so entire transactions can be lost undetected. The solution for this would be to record the progress of the test and the committed transactions in some separate storage, and then make sure that the server has all the transactions that have been recorded in that separate storage.
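Such a check does not exist yet, so the following Python sketch only shows the shape it might take. Everything in it is hypothetical: the `committed` table, the transaction ids, and the plain list used as the "separate storage" (a real test would keep it outside the server under test, for example in a file). SQLite stands in for the server.

```python
import sqlite3

def make_db():
    """Create a hypothetical table that records which transactions committed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE committed (txn_id INTEGER PRIMARY KEY)")
    conn.commit()
    return conn

def commit_txn(conn, ledger, txn_id):
    """Commit a transaction, then record it in the separate ledger."""
    conn.execute("INSERT INTO committed (txn_id) VALUES (?)", (txn_id,))
    conn.commit()
    # Recorded only after the commit was acknowledged by the server.
    ledger.append(txn_id)

def missing_after_recovery(conn, ledger):
    """Return the ledger entries that the server no longer has."""
    present = {row[0] for row in conn.execute("SELECT txn_id FROM committed")}
    return sorted(set(ledger) - present)

conn = make_db()
ledger = []
for txn_id in (1, 2, 3):
    commit_txn(conn, ledger, txn_id)
print("before the crash:", missing_after_recovery(conn, ledger))

# Simulate an engine that silently drops the last committed transaction on recovery.
conn.execute("DELETE FROM committed WHERE txn_id = 3")
conn.commit()
print("after the crash:", missing_after_recovery(conn, ledger))
```

Because the lost transaction disappears as a whole, the table stays internally consistent and neither of the two existing Reporters would notice; only the comparison against the ledger does.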


Philip Stoev

----- Original Message -----
From: "Patrick Crews" <gleebix@xxxxxxxxx>
To: <Philip.Stoev@xxxxxxx>
Sent: Friday, May 21, 2010 1:53 AM
Subject: A question about the RecoveryConsistency Validator

Philip,

Hi. If you have some time to spare, could you describe the
RecoveryConsistency Validator? It isn't described here:
http://forge.mysql.com/wiki/RandomQueryGenerator#Reporters
Can / should it work with --threads > 1? Do you have any recommended usage scenarios (i.e., is it like the Recovery Validator)?