← Back to team overview

maria-discuss team mailing list archive

Re: Maria-db refuses to start

 

If you're concerned about database corruption then you need to start off by having multiple copies of the data available in an online state if you want to recover from corrupt data without doing a backup. This is what ZFS does with RAID - detects which block is corrupt and then uses other data blocks, including parity, to rebuild the corrupt block. MariaDB isn't designed for that. I'm not even sure if there is any database that's designed for that, including SQL Server (see below for more.)


If you want to protect against silent data corruption now there are some alternatives:


1) implement record level checksums that you calculate and verify. This would require adding a "checksum" field to all of your tables. Maybe hide them with stored procedures. This is the best way for you to detect silent data corruption to your data. It won't fix data corruption.


2) implement application logic to always store data in 2+ databases and write an application to verify that all copies of the data are the same (think of this as being the application of a ZFS scrub.) E.g. Using Microsoft Witness Server's with mirror'd SQL databases:


https://learn.microsoft.com/en-us/sql/database-engine/database-mirroring/database-mirroring-witness?view=sql-server-ver16



The above (1) and (2) won't prevent metadata that the database uses to hold your data from being corrupted. That'd require changes to the page table structures used by InnoDB, etc. On top of that, putting checksums in the page tables wouldn't necessarily detect corruption to your data that happens before it gets included in the page's checksum calculation. If you want to detect corruption of data that is returned over a TCP connection to a remote database then you need record level checksums - see (1) above.


You might even decide to do both (1) and (2) - add checksums to your tables and develop your application to always store (INSERT) all data in 2 different databases (potentially on different servers) and develop the application to query the other database/server if there's a checksum failure when returning data from the first source.


Having multiple databases on different servers also protects you from problems with upgrades, etc, as long as your procedures validate each database is sound before making them "live".

You have to take ownership of the risk for only keeping one copy of critical data. You have to take ownership of the risk for not keeping current backups of your database.

Data resiliency requires effort. If you don't put in any effort then you won't get much in the way of resiliency from corruption. That btw includes good practice for doing ugprades, shutdowns, etc.

MariaDB has as much resiliency to silent data corruption as you put in effort to mitigate it.


Follow ups

References