maria-developers team mailing list archive

Re: [All] How are we going to handle incompatible changes in MariaDB?

To: "Philip Stoev" <pstoev@xxxxxxxxxxxx>
From: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
Date: Fri, 18 Mar 2011 09:52:52 +0100
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <5582DA242EED4D949E64F25D9AD41A31@Philips> (Philip Stoev's message of "Thu, 17 Mar 2011 12:54:41 +0200")
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)
"Philip Stoev" <pstoev@xxxxxxxxxxxx> writes:

> We should discuss how do we handle new developments in MariaDB that cause 
> our behavior to diverge from the one in MySQL. Since we claim to be a 

Yes, agree. Thank you for bringing it up!

Note that this is a general MariaDB discussion, not a Monty Program specific
one, so I am moving the thread to maria-developers@.

So far, we have generally been ok at handling incompatible changes. The basic
approach is to shape new features to be backwards compatible as much as
possible, and to add any extra needed code to handle interoperability.

For example, if an option is changed due to a feature, we should keep the old
option around and emulate the old behaviour, eventually deprecating it in a
future version.
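
To make the option example concrete, here is a minimal sketch of keeping an
old option name alive as a deprecated alias. It is not MariaDB code: the option
names and the normalise_options() helper are hypothetical and only illustrate
the idea of emulating the old option while warning about it.

```python
import warnings

# Hypothetical option names, used only for illustration.
DEPRECATED_ALIASES = {"old_cache_size": "new_cache_size"}


def normalise_options(options):
    """Return a copy of `options` with deprecated names mapped to new ones."""
    result = {}
    for name, value in options.items():
        new_name = DEPRECATED_ALIASES.get(name)
        if new_name is None:
            # Not a deprecated option: keep it unchanged.
            result[name] = value
            continue
        warnings.warn(
            f"option '{name}' is deprecated; use '{new_name}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # If the new option was also given explicitly, it wins over the alias.
        if new_name not in options:
            result[new_name] = value
    return result
```

With this approach an old configuration keeps working, the user is told what
to change, and the alias can be dropped in a later major version.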

Another example: with replication, we check the version of the master or slave
at the other end, and take care to only send what the other end understands,
and likewise interpret what it sends correctly based on its version.
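
The replication case can be sketched in the same spirit. The event names and
version thresholds below are invented for illustration and do not describe the
real binlog format; the point is only that what gets sent is filtered by the
version the other end reports.

```python
# Hypothetical event types, each with the first version assumed to understand it.
MIN_VERSION_FOR_EVENT = {
    "event_a": (5, 0, 0),
    "event_b": (5, 1, 0),
    "event_c": (5, 3, 0),
}


def parse_version(text):
    """Turn a version string such as '5.2.5-MariaDB-log' into (5, 2, 5)."""
    numeric = text.split("-", 1)[0]
    return tuple(int(part) for part in numeric.split("."))


def events_peer_understands(events, peer_version_text):
    """Keep only the events that the peer's version is able to interpret."""
    peer = parse_version(peer_version_text)
    return [e for e in events if MIN_VERSION_FOR_EVENT[e] <= peer]
```

For instance, a peer reporting "5.2.5-MariaDB-log" would be sent event_a and
event_b but not event_c under these assumed thresholds.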

However, this has been mostly up to the individual developer (or often
reviewer), and if we can get some more structure/process to this to avoid it
being forgotten, then that would be very good.

> The particular situation we are facing now is the microsecond precision 
> patch from Serg. As part of introducing precision date and time datatypes, 
> he made some refactoring around the Item_* tree.
>
> As a result, various corner cases, e.g. invalid dates, partially valid 
> dates, invalid conversions between date and time, etc. are now handled 
> differently, with either different warnings being printed (or no warnings at 
> all), or with the result of the expression being different, or with a NULL 
> or 0000-00-00 returned.
>
> For example, mysql reports that DATE (  SEC_TO_TIME(  8  )  ) is equal to 
> "2000-00-08" whereas serg's patch returns "0000-00-00" . In other words, 
> MySQL chose to place the "8" literal into the days portion of the date 
> value, whereas Serg's patch filtered the literal somewhere along the way. 
> There are numerous other such examples.
>
> While in most cases Serg's new behavior is more intuitive and a program that 
> is written to conform to postgresql-level type safety will not be affected 
> either way, it is entirely possible that a program written under mysql's 
> relaxed type safety rules, and with MySQL's behavior in mind, would start 
> behaving unexpectedly.

It sounds like in this case, the changes in behaviour could be considered
improvements, or even bug fixes, when seen in isolation. And the behaviour
they change is not something that an application would deliberately be written
to rely on. But since it is a change in behaviour, it has the potential to
break applications that by accident happen to depend on such behaviour.

My opinion on this is that we should do the change in a major version
(eg. 5.2->5.3), but not in a minor version (eg. 5.2.5->5.2.6). And document it
in release notes as an incompatible change.

Basically the same way we did in MySQL (and still do, I suppose).

It is ok to change behaviour in a major version, if it is necessary, and
documented. I believe users anticipate such changes, and consequently are more
careful about going eg. 5.1->5.5 than 5.1.x->5.1.y.
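
The major/minor rule described above can be written down as a small check. This
is only a sketch under the assumption that the first two version components
identify the major version (as in 5.2 versus 5.3); the function names are
hypothetical.

```python
def version_tuple(text):
    """Turn '5.2.5' into (5, 2, 5)."""
    return tuple(int(part) for part in text.split("."))


def is_major_upgrade(old, new):
    """True when the first two components differ, e.g. 5.2.x -> 5.3.x."""
    return version_tuple(old)[:2] != version_tuple(new)[:2]


def may_ship(change_is_incompatible, old, new):
    """Compatible changes may ship anywhere; incompatible ones only in a major version."""
    return (not change_is_incompatible) or is_major_upgrade(old, new)
```

Under this rule an incompatible change is rejected for 5.2.5 -> 5.2.6 but
accepted for 5.2 -> 5.3, where it would also need an entry in the release notes.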

We should not needlessly change things. Every behaviour change should be
carefully considered. If it has the potential to seriously break major
applications, we should not make it (or we should use deprecation, a
compatibility mode, etc). If the change is not necessary or useful, the
behaviour should probably be left alone.

But if the change makes sense and the new behaviour is much more reasonable,
then we should be ready to make the change. We cannot afford to be stuck
forever on old poor decisions.

Another example is optimiser changes. In my opinion, we cannot entirely prevent
improvements to the optimiser from making some corner-case queries slower. The
optimiser has only limited knowledge of the data and application. If, based on
this knowledge, execution plan A is better than execution plan B, then we
should be prepared to change the code (in a major version) to select A over B.
This will invariably cause a slowdown in some cases where the knowledge is
incomplete, but it is important to be able to make improvements. Again, we
need to consider each case carefully. We should not do things that are
obviously stupid (eg. do not select a table scan over a range scan unless
_really_ sure it will be better), and we should avoid breaking major
applications or usage patterns, even if the change theoretically makes sense.

So, to summarise my opinion: We should be very careful about incompatible
changes. We should generally avoid them unless there is a very good reason for
them. We should create backward compatibility where possible and where it
makes sense. And if we do change behaviour, we should only do it in a major
version and document it clearly in an "incompatible changes" list. And being
able to continually improve the server long-term is important.

> So, how we should handle this particular case, and more importantly, how do 
> we handle future cases like that in a stable and predictable manner? I 
> understand that in the early days of MySQL it would have been a no-brainer 
> to change everything as needed, we are by now dealing with a huge installed 
> base and powerful competitors, so we can not simply apply the happy hacking 
> approach that was used in the 1990s.

Agree. No "happy hacking" approach to incompatible changes!

 - Kristian.