
maria-discuss team mailing list archive

Re: Maxscale not changing to other viable masters

 

Just some extra info about this issue: I updated MaxScale to version
2.4.8 but the issue I described remains unchanged and unsolved.

I don't have nor want automatic failover, i.e., I don't want or need
MaxScale to change which server is a slave of which server. I just
want MaxScale to start using another server as its own master when
the previous one becomes inaccessible.
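
For reference, here is a minimal sketch of the relevant part of my
setup (server/service names, addresses and credentials are
placeholders, not my actual config; the parameter values are just the
documented defaults I rely on):

  # Minimal sketch with placeholder names; S2 and S3 are defined
  # the same way as S1.
  [S1]
  type=server
  address=s1.example.com
  port=3306
  protocol=MariaDBBackend

  [Replication-Monitor]
  type=monitor
  module=mariadbmon
  servers=S1,S2,S3
  user=maxscale_monitor
  password=...
  monitor_interval=2000         # milliseconds (default)
  backend_connect_timeout=3     # seconds (default)
  failcount=5                   # default
  # auto_failover and auto_rejoin stay at their default (false):
  # I don't want MaxScale to rewire replication, only to pick a
  # new master for routing.

With these defaults I understand the monitor should stop treating S1
as valid after roughly (2 s + 3 s) * 5 = 25 s, which is the
expectation in my original message quoted below.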


Regards,

Rodrigo


On Wed, Apr 15, 2020 at 14:43, Rodrigo Severo - Fábrica
<rodrigo@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
>
> I'm not sure this is the right mailing list to talk about MaxScale.
> Please let me know if there is a better place.
>
> My problem is that MaxScale is not switching to another server as its
> master when the previous master becomes inaccessible. More details:
>
> I'm running MaxScale 2.4.5 and my MariaDB servers are a mixture of
> 10.3.21 and 10.3.22.
>
> I have three MariaDB servers in a star topology, GTID-based
> replication setup: each server is a slave of the other two.
>
> Let's call them S1, S2, and S3.
>
> When starting the whole cluster, MaxScale chooses S1 as its master
> (the one to which MaxScale sends the data-changing
> INSERT/UPDATE/DELETE statements), as S1 is the first one listed in
> MaxScale's config.
>
> When I stop S1 I expect MaxScale to choose another server as its new
> master, but instead I get:
>
>   28271 2020-04-15 11:54:32   error  : Monitor was unable to connect
> to server S1 : 'Can't connect to MySQL server on 'S1' (115)'
>   28272 2020-04-15 11:54:32   warning: [mariadbmon] 'S2' is a better
> master candidate than the current master 'S1'. Master will change when
> 'S1' is no longer a valid master.
>
> but S2 is never promoted to MaxScale's master, even after
> ((monitor_interval + backend_connect_timeout) * failcount) seconds as
> mentioned in https://mariadb.com/kb/en/mariadb-maxscale-24-mariadb-monitor/#failcount
>
> As I have default values for monitor_interval, backend_connect_timeout
> and failcount - 2 s, 3 s and 5 retries respectively - I would expect a
> new master to be selected within 25 seconds but, unfortunately,
> MaxScale stays stuck in the above situation until I restart the S1
> server, at which point S1 resumes its role as MaxScale's master.
>
> I wonder if my problem is related to item 2 in
> https://mariadb.com/kb/en/mariadb-maxscale-24-mariadb-monitor/#master-selection
> which reads:
>
> 2. It has been down for more than failcount monitor passes and has no
> running slaves. Running slaves behind a downed relay count.
>
> Does the fact that the slaves of S1 (S2 and S3) are running make
> MaxScale consider S1 still valid despite not being accessible at all?
>
> More important: how can I make MaxScale change its chosen master in a
> star topology replication setup when the current MaxScale master stops?
>
>
> Regards,
>
> Rodrigo Severo

