maria-developers team mailing list archive

Re: Sprint 10.0: MDEV-8205 timediff returns null when comparing decimal time to time string value

To: Sergei Golubchik <serg@xxxxxxxxxxx>
From: Alexander Barkov <bar@xxxxxxxxxxx>
Date: Tue, 16 Jun 2015 19:20:23 +0400
Cc: maria-developers <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20150615084916.GA6876@meddwl.fritz.box>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

Hi Sergei,

I created this task based on our discussion:

MDEV-8322 Distinguish between time and date strings more carefully

Please also see comments below:

On 06/15/2015 12:49 PM, Sergei Golubchik wrote:

Hi, Alexander!

On Jun 15, Alexander Barkov wrote:

(preference matters, as it tells how to parse ambiguous strings like
"10:10:10").


I'd say '10:10:10' should be unambiguously treated as time.
Colon is never used to delimit date parts. Is it?


Yes, I believe delimiters are pretty much ignored in our code, so any
delimiter can be used anywhere.


Right, in MariaDB they are ignored.

But I mean in real life nobody uses colons to separate date parts.

Date parts are usually delimited as follows:
'01-01-01'
'01.01.01'
'01/01/01'

But this is a kind of separate issue. Would you like me to create a task
for this?


The way it works now - after your patch - there's not much need for a
"preference" flag. The only issue I've uncovered in testing was related
to parsing strings with time preference. Like in

   WHERE time_column > '2010-12-11'

The code is

   my_bool str_to_time(const char *str, uint length, MYSQL_TIME *l_time,
                       ulonglong fuzzydate, MYSQL_TIME_STATUS *status)
   {
   ...
     /* Check first if this is a full TIMESTAMP */
     if (length >= 12)
     {                                             /* Probably full timestamp */
       (void) str_to_datetime(str, length, l_time,
                              (fuzzydate & ~TIME_TIME_ONLY) | TIME_DATETIME_ONLY,
                               status);

Which is very stupid: it decides solely on the string length. That is,
'2010-12-11' is parsed as a time (when there's time preference), but
'10:11:12.123456' is parsed as a date (but fails and falls back to
time).
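
To make the length problem concrete, here is a minimal standalone sketch
of the decision described above. Only the "length >= 12" test is taken
from the quoted code; the enum and the function name length_heuristic are
hypothetical and are not part of the MariaDB sources.

```c
#include <string.h>

/* Hypothetical names, for illustration only. */
enum first_attempt
{
  TRY_TIME_FIRST,      /* string goes straight to the time parser      */
  TRY_DATETIME_FIRST   /* string is first tried as a full datetime     */
};

/*
  Mirrors the "length >= 12" check from the quoted str_to_time() code:
  the content of the string is never inspected, only its length.
*/
static enum first_attempt length_heuristic(const char *str)
{
  return strlen(str) >= 12 ? TRY_DATETIME_FIRST : TRY_TIME_FIRST;
}
```

With this check, '2010-12-11' (10 characters) takes the time path, while
'10:11:12.123456' (15 characters) takes the datetime path first, which is
exactly the mismatch described above.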

I would suggest getting rid of this ad hoc detection code (check the
length, try and fall back, etc.) and using a systematic approach based on
patterns. Like

    patterns[]=
    {
      { "yyyy-mm-dd", parse_date },
      { "hh:mm:ss.uuuuuu", parse_time },
      ...
    }

Note, I wrote "like". I do not mean literally these patterns, or string
patterns at all. I'd prefer something much faster. Maybe some
compact "signature" number that describes the format, or maybe a
decision tree (where the string is parsed into an array of ints and then
analyzed: like, three numbers? first is 4 digits? etc.).
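
One possible shape for the decision-tree idea, as a rough standalone
sketch rather than a proposal for the actual server code. All names here
(str_kind, field, split_fields, classify) are hypothetical. The rules are
assumptions drawn from this thread: a colon after the first number means
time, and '-', '.' or '/' between the first three numbers means date.
Unlike the current code, this sketch does not ignore delimiters.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum str_kind { KIND_UNKNOWN, KIND_DATE, KIND_TIME, KIND_DATETIME };

#define MAX_FIELDS 8

struct field
{
  unsigned value;   /* numeric value of the digit run                  */
  int      digits;  /* number of digits (for "is it 4 digits?" checks) */
  char     delim;   /* character following the run, '\0' at the end    */
};

/* Split "n<delim>n<delim>n..." into fields; returns 0 on malformed input. */
static size_t split_fields(const char *s, struct field *f, size_t max)
{
  size_t n= 0;
  while (*s && n < max)
  {
    if (!isdigit((unsigned char) *s))
      return 0;                         /* every field must start with a digit */
    f[n].value= 0;
    f[n].digits= 0;
    while (isdigit((unsigned char) *s))
    {
      f[n].value= f[n].value * 10 + (unsigned) (*s - '0');
      f[n].digits++;
      s++;
    }
    f[n].delim= *s;
    if (*s)
      s++;                              /* step over the delimiter */
    n++;
  }
  return *s ? 0 : n;                    /* leftover input: too many fields */
}

static int is_date_delim(char c)
{
  return c == '-' || c == '.' || c == '/';
}

/* A tiny decision tree over delimiters; value and digits are not used yet. */
static enum str_kind classify(const char *s)
{
  struct field f[MAX_FIELDS];
  size_t n= split_fields(s, f, MAX_FIELDS);

  if (n < 2)
    return KIND_UNKNOWN;
  if (f[0].delim == ':')
    return KIND_TIME;                   /* hh:mm, hh:mm:ss, hh:mm:ss.uuuuuu */
  if (n >= 3 && is_date_delim(f[0].delim) && is_date_delim(f[1].delim))
  {
    if (n == 3 && f[2].delim == '\0')
      return KIND_DATE;                 /* three numbers, nothing after them */
    if (n > 3 && (f[2].delim == ' ' || f[2].delim == 'T') &&
        f[3].delim == ':')
      return KIND_DATETIME;             /* date part followed by a time part */
  }
  return KIND_UNKNOWN;
}
```

Under these assumed rules, classify("2010-12-11") gives KIND_DATE and
classify("10:11:12.123456") gives KIND_TIME regardless of length. A fuller
tree could also use the recorded digit counts, for example to tell a
4-digit year from a 2-digit one.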

Regards,
Sergei

References

Sprint 10.0: MDEV-8205 timediff returns null when comparing decimal time to time string value
From: Alexander Barkov, 2015-06-11
Re: Sprint 10.0: MDEV-8205 timediff returns null when comparing decimal time to time string value
From: Sergei Golubchik, 2015-06-12
Re: Sprint 10.0: MDEV-8205 timediff returns null when comparing decimal time to time string value
From: Alexander Barkov, 2015-06-15
Re: Sprint 10.0: MDEV-8205 timediff returns null when comparing decimal time to time string value
From: Sergei Golubchik, 2015-06-15