maria-developers team mailing list archive
Message #12919
Re: 4b164f176e6: MDEV-25114 Crash: WSREP: invalid state ROLLED_BACK (FATAL)
Hi, Jan!
On Oct 10, Jan Lindström wrote:
> Hi Sergei,
> >
> > > if (victim_trx) {
> > >   const trx_id_t victim_trx_id= victim_trx->id;
> > >   const longlong victim_thread= thd_get_thread_id(victim_thd);
> > >   /* This is necessary as correct mutexing order is
> > >   lock_sys -> trx -> THD::LOCK_thd_data and below
> > >   function assumes we have lock_sys and trx locked
> > >   and takes THD::LOCK_thd_data for THD state check. */
> > >   wsrep_thd_UNLOCK(victim_thd);
> > >   // GAP where thd or trx is not protected
> > >   lock_mutex_enter();
> > >   if (trx_t* victim= trx_rw_is_active(victim_trx_id, NULL, true)) {
> >
> > trx_rw_is_active needs to be modified to do that, right?
>
> No, this is the current behaviour; I did not change anything in
> trx_rw_is_active.
In xtradb, trx_rw_is_active returns bool.
I think xtradb is still the default innodb in 10.2.
In innobase it does indeed return trx_t*. I didn't notice that at first,
which is why I was confused.
> > > // As trx is now referenced it can't go away
> >
> > Hmm. What happens if the thd that owns this transaction is killed or
> > the user disconnects? THD gets freed. What happens to the referenced
> > trx?
>
> In my understanding you can't just free THD before it is aborted or
> committed, right?
> As we have lock_sys, no trx can commit or abort inside InnoDB, and
> after this function this trx can't be deleted.
okay, good point.
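For readers following the thread, the pattern discussed above can be sketched roughly as follows: copy the transaction id while the session lock is still held, drop that lock to respect the lock order, take the registry lock, and then look the transaction up again by id instead of trusting the old pointer. This is only an illustrative sketch, not the actual patch. `Registry`, `Session`, `Trx`, `find_and_reference` and `abort_victim` are hypothetical stand-ins for lock_sys/trx_sys, THD, trx_t, trx_rw_is_active and the BF-abort path; none of them are real InnoDB or wsrep APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-in for trx_t.
struct Trx {
  std::uint64_t id;
  int n_ref;             // reference count, protected by Registry::mutex
  bool abort_requested;  // set by abort_victim() under Registry::mutex
};

// Hypothetical stand-in for lock_sys plus the active transaction list.
struct Registry {
  std::mutex mutex;
  std::map<std::uint64_t, Trx> active;

  // Caller must hold `mutex`. Returns a referenced Trx, or nullptr if the
  // transaction with this id is no longer active.
  Trx* find_and_reference(std::uint64_t id) {
    auto it = active.find(id);
    if (it == active.end()) return nullptr;
    ++it->second.n_ref;
    return &it->second;
  }

  // Caller must hold `mutex`.
  void release(Trx* trx) {
    assert(trx->n_ref > 0);
    --trx->n_ref;
  }
};

// Hypothetical stand-in for THD.
struct Session {
  std::mutex data_mutex;     // plays the role of THD::LOCK_thd_data
  std::uint64_t trx_id = 0;
};

// Must be called with session.data_mutex held; it is released inside.
// Returns true if the victim transaction was still active and was marked.
bool abort_victim(Registry& reg, Session& session) {
  // Copy the id while the session is still locked.
  const std::uint64_t victim_id = session.trx_id;

  // Lock order is registry -> session, so the session lock has to go first.
  session.data_mutex.unlock();

  // GAP: neither the session nor the transaction is protected here.

  std::lock_guard<std::mutex> guard(reg.mutex);
  Trx* victim = reg.find_and_reference(victim_id);
  if (victim == nullptr) return false;  // committed or rolled back in the gap

  // The transaction is referenced and the registry is locked, so it
  // cannot go away while we work on it.
  victim->abort_requested = true;
  reg.release(victim);
  return true;
}
```

The point of the lookup by id is that a transaction which finished during the gap is simply not found, so the stale pointer is never dereferenced.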
> > What I mean is, what if KILL ignored a WSREP_TO_ISOLATION_BEGIN
> > failure and just proceeded with the kill? Perhaps if
> > WSREP_TO_ISOLATION_BEGIN fails it means that there can be no BF aborts
> > anyway? Could you try to find out?
>
> User KILL can happen only after the node has moved to READY state, so
> at startup you can't use it before the cluster is ready to serve. We
> could just ignore the TOI error here, but what is the point? There are
> bigger problems in the cluster if TOI fails. TOI can fail only in this
> node as all other nodes in the cluster will ignore the KILL command
> (after parsing it).
Okay then
Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx