
maria-discuss team mailing list archive

Re: Maria-db refuses to start

 

On Thu, Dec 8, 2022 at 9:42 PM Reindl Harald <h.reindl@xxxxxxxxxxxxx> wrote:
>
>
>
> Am 08.12.22 um 18:59 schrieb Gordan Bobic:
> > On Thu, Dec 8, 2022 at 7:28 PM Reindl Harald <h.reindl@xxxxxxxxxxxxx> wrote:
> >> MariaDB does the same as the filesystem
> >> InnoDB in fact is more or less a FS on top of a FS
> >
> > So why do it at both levels?
>
> because the FS layer can't detect MariaDB errors?

What is the net benefit of detecting said error? The way I see it, the
options are:
1) MariaDB detects an error, crashes out
2) MariaDB doesn't detect an error, ingests garbage, crashes out

The only way an error will creep in without the error checking FS
spotting it is:
1) manually corrupting the block by writing garbage over it
2) MariaDB generated a duff block and wrote it out
3) Some other hardware failure corrupted the block in MariaDB memory,
just before writing it to the file system

If any of the latter set happens, the data is toast anyway.
If the former set happens, the data is toast anyway.

Sure it's nice to get an error that confirms that your data is
corrupted, but that won't bring it back.
Shift it down to a layer where at least a subset of the problem space
can be fixed and we have a net gain in at least some cases.

> > And what makes doing it at MariaDB level
> > in any way better than doing it somewhere else?
>
> which magic should do it somewhere else?

If a file system is in control of data mirroring and checksumming
every 16KB block, then if the data is corrupted on disk, file system
will detect it on a read and fetch a good copy from an uncorrupted
mirror.
Application never knows something went wrong, file system replaces the
corrupted block on the other disk and everything carries on
uninterrupted.
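
As a rough illustration of the read path described above, here is a minimal
Python sketch of a two-way mirror with per-block checksums. It is a toy model
and not ZFS code: the MirroredStore class, its in-memory "disks", and the use
of CRC32 are all assumptions made for the example.

```python
import zlib

BLOCK_SIZE = 16 * 1024  # 16KB, the block size mentioned above


class MirroredStore:
    """Toy two-way mirror with one checksum per block (hypothetical)."""

    def __init__(self):
        self.disks = [{}, {}]  # block number -> bytes, one dict per mirror side
        self.checksums = {}    # block number -> CRC32 recorded at write time

    def write(self, block_no, data):
        # Record the checksum once, then store the block on both sides.
        self.checksums[block_no] = zlib.crc32(data)
        for disk in self.disks:
            disk[block_no] = data

    def read(self, block_no):
        expected = self.checksums[block_no]
        bad = []
        for i, disk in enumerate(self.disks):
            data = disk[block_no]
            if zlib.crc32(data) == expected:
                # Rewrite every copy that failed verification before this one.
                for j in bad:
                    self.disks[j][block_no] = data
                return data
            bad.append(i)
        raise IOError(f"block {block_no}: all copies failed checksum")


store = MirroredStore()
page = b"x" * BLOCK_SIZE
store.write(7, page)
store.disks[0][7] = b"garbage" + page[7:]  # corrupt one copy "on disk"
result = store.read(7)                     # caller still gets the good data
```

After the read, store.disks[0][7] holds the good data again, and the caller
never saw the damaged copy. The read only fails when every copy fails
verification.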

> >> "Some of us run MariaDB on file systems that do their own block
> >> checksumming, and thus run innodb_checksum_algorithm=none" makes you
> >> look like a fool - period
> >>
> >> are you dumb or why don't you understand that the filesystem is a
> >> completely different layer and has no clue about the data itself?
> >
> > Are you too dumb to understand that if a block is corrupted at InnoDB
> > level MariaDB can't do anything to fix it, but if a block is corrupted
> > at lower level, ZFS can fix it from redundantly stored data and
> > MariaDB never gets to ingest a corrupted block in the first place?
>
> it can at least fail early instead of working with corrupted data
>
> > If you disagree, please describe a scenario in which an InnoDB page
> > checksum does anything useful if the file system it is on has built in
> > block checksumming and data redundancy.
> INNODB CHECKSUMS DETECT DATA CORRUPTION WITHIN MARIADB NOT CAUSED BY ANY
> FILESYSTEM ISSUE AT ALL
>
> the filesystem can't do that with its block checksumming and data
> redundancy because there is *nothing wrong* from the view of the FS layer
>
> one is for consistency of the database
> one is for consistency of the underlying filesystem
>
> two worlds and i simply don't get why people who don't understand such
> basics work in IT and, to top that, talk nonsense on mailing-lists
>
> AGAIN: THE FILESYSTEM CAN'T DO ANYTHING BECAUSE IT'S NOT AFFECTED

So your argument is that the page checksum is there to tell you that
your data is corrupted because of either:
1) data was corrupted by the database itself, or
2) a superuser overwriting the block on the disk

In either case, you are not getting the data back; either way the
database will stop working.
So what is the point? I'd rather have the error fixable than have the
knowledge that I have an error I can do nothing about.
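
To make case 1 concrete, here is a small Python sketch under stated
assumptions: a made-up page layout (a 4-byte CRC32 header followed by the
payload, which is not InnoDB's real format) and a file system that checksums
whatever bytes it is handed. The helper names make_page and page_ok are
hypothetical.

```python
import zlib


def make_page(payload: bytes) -> bytes:
    """Hypothetical page layout: 4-byte CRC32 header, then the payload."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def page_ok(page: bytes) -> bool:
    """Application-level check: does the embedded checksum match the payload?"""
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])


# The database builds a valid page, then one payload byte is damaged in
# memory before the page is written out (case 1 above).
page = bytearray(make_page(b"row data" * 100))
page[10] ^= 0xFF

# The file system checksums the bytes it receives, so the stored checksum
# and both mirror copies all agree on the already-damaged page.
fs_checksum = zlib.crc32(page)
mirror = [bytes(page), bytes(page)]

fs_sees_problem = any(zlib.crc32(copy) != fs_checksum for copy in mirror)
app_sees_problem = not page_ok(mirror[0])
good_copy_exists = any(page_ok(copy) for copy in mirror)
```

In this toy model the file system sees nothing wrong, the page checksum does
flag the damage, and no intact copy exists on either mirror side, so the
detection cannot be turned into a repair.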

