maria-developers team mailing list archive

Re: Review of microsecond patch

Hi!

>>>>> "Sergei" == Sergei Golubchik <serg@xxxxxxxxxxxx> writes:

<cut>

>> > === modified file 'mysql-test/r/date_formats.result'
>> > --- mysql-test/r/date_formats.result	2010-11-12 10:12:15 +0000
>> > +++ mysql-test/r/date_formats.result	2011-03-08 18:41:58 +0000
>> > @@ -196,16 +196,16 @@ date	format	datetime
>> >  0003-01-02 8:11:2.123456	%Y-%m-%d %H:%i:%S.%#	0003-01-02 08:11:02
>> >  03-01-02 8:11:2.123456	%Y-%m-%d %H:%i:%S.%#	2003-01-02 08:11:02
>> >  2003-01-02 10:11:12 PM	%Y-%m-%d %h:%i:%S %p	2003-01-02 22:11:12
>> > -2003-01-02 01:11:12.12345AM	%Y-%m-%d %h:%i:%S.%f%p	2003-01-02 01:11:12.123450
>> > -2003-01-02 02:11:12.12345AM	%Y-%m-%d %h:%i:%S.%f %p	2003-01-02 02:11:12.123450
>> > -2003-01-02 12:11:12.12345 am	%Y-%m-%d %h:%i:%S.%f%p	2003-01-02 00:11:12.123450
>> > +2003-01-02 01:11:12.12345AM	%Y-%m-%d %h:%i:%S.%f%p	2003-01-02 01:11:12
>> > +2003-01-02 02:11:12.12345AM	%Y-%m-%d %h:%i:%S.%f %p	2003-01-02 02:11:12
>> > +2003-01-02 12:11:12.12345 am	%Y-%m-%d %h:%i:%S.%f%p	2003-01-02 00:11:12
>> 
>> Why have microseconds been removed from the output?
>> 
>> It looks like 'cast(str_to_date(date, format) as datetime)' removes
>> all microseconds, which is an incompatible change.  In the past,
>> cast(..., datetime) was the same as cast(..., datetime(6))
>> 
>> <cut>
>> 
>> I think that it's more important that cast() produces the same value
>> as MySQL did than that create ... select cast() produces the same
>> type as before.
>> 
>> With the first issue, you will get lost data (without any warnings)
>> for old applications if they do things like:
>> 
>> select * from t1 where cast(char_column as time) = "12:23:24.123456";
>> 
>> which works with MySQL but not with this patch.

Sergei> Looking at how Oracle implements this feature I can see that cast()
Sergei> there works exactly as in my patch. If I do what you want, MariaDB won't
Sergei> be compatible with MySQL in the future.

It's clear that MySQL removes microseconds because they don't have
real support for microseconds.

As we now can have microseconds in datetime, we need to keep them.
I don't think that anyone would accept that

cast(cast(datetime as string) as datetime)

should give a loss of precision.

Compare to:
cast(cast(5.5 as char(5)) as double) -> 5.5  (not 5)

In this case I prefer to be compatible with older versions of MySQL
than with newer versions that are not yet out.

>> > === modified file 'mysql-test/r/func_time.result'
>> > --- mysql-test/r/func_time.result	2010-10-31 16:04:38 +0000
>> > +++ mysql-test/r/func_time.result	2011-03-19 14:49:36 +0000
>> 
>> As part of this, I noticed that from_unixtime() and unix_timestamp()
>> are not enhanced to support microseconds.
>> 
>> This needs to work:
>> 
>> create table t2 (a int, b timestamp(6));
>> insert into t2 (a) values(1);
>> select * from t2;
>> I got '1 | 2011-03-29 15:38:53.395516'
>> 
>> select unix_timestamp(b) from t2; -> 1301402333 (no decimals)
>> 
>> This needs to return sub-seconds!

Sergei> If I do that, unix_timestamp() will no longer return INT_RESULT.
Sergei> Which means it cannot be used as a partitioning function anymore.
Sergei> Few tests use it that way. I've fixed the tests by wrapping
Sergei> unix_timestamp() in cast(... as unsigned). Which failed, because
Sergei> cast cannot be used as a partitioning function either. I fixed that too.

Sergei> Now tests pass. But this is an incompatible change, it may break
Sergei> existing applications. And fixing SQL will make it backward
Sergei> incompatible - partitioning by unix_timestamp() will not work in newer
Sergei> MariaDB, partitioning by cast(unix_timestamp(...) as unsigned) will
Sergei> work, but it will not work in MySQL and older MariaDB.

Sergei> At the moment I've shelved this change and I'm not committing it, until
Sergei> we decide what to do.

How about then adding a unix_high_precision_timestamp(b) that returns
a decimal?

I would be fine with that.

By the way, good that you fixed cast() so it can be used as a
partitioning function!

>> > === modified file 'mysql-test/r/ps.result'
>> > --- mysql-test/r/ps.result	2010-10-04 08:51:26 +0000
>> > +++ mysql-test/r/ps.result	2011-03-01 12:24:36 +0000
>> > @@ -3040,3 +3040,18 @@ id	select_type	table	type	possible_keys
>> >  DEALLOCATE PREPARE stmt;
>> >  DROP TABLE t1;
>> >  End of 5.1 tests.
>> > +prepare stmt from "select date('2010-10-10') between '2010-09-09' and ?";
>> > +set @a='2010-11-11';
>> > +execute stmt using @a;
>> > +date('2010-10-10') between '2010-09-09' and ?
>> > +1
>> > +execute stmt using @a;
>> > +date('2010-10-10') between '2010-09-09' and ?
>> > +1
>> > +set @a='2010-08-08';
>> > +execute stmt using @a;
>> > +date('2010-10-10') between '2010-09-09' and ?
>> > +0
>> > +execute stmt using @a;
>> > +date('2010-10-10') between '2010-09-09' and ?
>> > +0
>> 
>> Please add a test that uses sub seconds (for completeness)
>> 
>> set @a='2010-08-08 02:03:04.567890';
>> execute stmt using @a;

Sergei> it's not completeness, but bloat. First, this is a test for dates.
Sergei> second - the comment says "restoring of the Item tree in BETWEEN with
Sergei> dates".  this code path does not check for second_part > 0, it is not
Sergei> affected by microsecond presence at all.

No, this test file is for prepared statements, not for dates.

I would like to be sure that we also do some tests with microseconds
and prepared statements, and this is the test file in which to do them.

I don't remember any tests where you tested prepare stmt and
microseconds. Did you add such a test somewhere else?

<cut>

>> > +++ mysql-test/suite/funcs_1/r/is_columns_innodb.result	2011-03-24 14:55:52 +0000
>> > @@ -382,333 +382,333 @@ LOAD DATA INFILE '<MYSQLTEST_VARDIR>/std
>> >  SELECT * FROM information_schema.columns
>> >  WHERE table_schema LIKE 'test%'
>> >  ORDER BY table_schema, table_name, column_name;
>> > -TABLE_CATALOG	TABLE_SCHEMA	TABLE_NAME	COLUMN_NAME	ORDINAL_POSITION	COLUMN_DEFAULT	IS_NULLABLE	DATA_TYPE	CHARACTER_MAXIMUM_LENGTH	CHARACTER_OCTET_LENGTH	NUMERIC_PRECISION	NUMERIC_SCALE	CHARACTER_SET_NAME	COLLATION_NAME	COLUMN_TYPE	COLUMN_KEY	EXTRA	PRIVILEGES	COLUMN_COMMENT
>> > -NULL	test	t1	f1	1	NULL	YES	char	20	20	NULL	NULL	latin1	latin1_swedish_ci	char(20)			select,insert,update,references	
>> ... lots of changes
>> 
>> How about changing the SELECT * FROM information_schema.columns to
>> just list the old columns in suite/funcs_1/datadict/columns.inc?
>> 
>> This gives a smaller change list and will not cause a diff if we add
>> another column...
>> In particular, this change caused a really huge diff and will likely
>> cause merge conflicts sooner or later...

Sergei> Changing will also cause merge conflicts.

This will keep the code similar to MySQL and thus give us fewer
conflicts for anything they change!

Sergei> And I think it's good to see all columns here.

Then make a separate test for that, one that is not run by ten
different engines and that returns LOTS of rows.

<cut>

>> > === modified file 'sql/event_queue.cc'
>> > --- sql/event_queue.cc	2008-05-09 07:43:02 +0000
>> > +++ sql/event_queue.cc	2011-03-01 12:24:36 +0000
>> > @@ -513,9 +513,10 @@ Event_queue::empty_queue()
>> >  */
>> >  
>> >  void
>> > +Event_queue::dbug_dump_queue(my_time_t when)
>> >  {
>> >  #ifndef DBUG_OFF
>> > +  my_time_t now= when;
>> >    Event_queue_element *et;
>> >    uint i;
>> >    DBUG_ENTER("Event_queue::dbug_dump_queue");
>> > @@ -592,14 +593,13 @@ Event_queue::get_top_for_execution_if_ti
>> >      thd->set_current_time(); /* Get current time */
>> >  
>> >      next_activation_at= top->execute_at;
>> > +    if (next_activation_at >thd->query_start())
>> >      {
>> >        /*
>> >          Not yet time for top event, wait on condition with
>> >          time or until signaled. Release LOCK_queue while waiting.
>> >        */
>> > -      struct timespec top_time;
>> > -      set_timespec(top_time, next_activation_at - thd->query_start());
>> > +      struct timespec top_time= { next_activation_at, 0 };
>> 
>> Not sure if the above change is portable. That is why we have the
>> set_timespec() defines.
>> 
>> Any particular reason why you made the change?

<cut>

Sergei> I'll fix it when merging in 5.3, as there we have different set of
Sergei> set_timespec* macros, and there I can set the wakeup time directly.

In other words, I have to do it (as I am the one to merge to 5.3) ;)

>> > === modified file 'sql/field.cc'
>> 
>> > @@ -3151,13 +3141,9 @@ int Field_short::store(const char *from,
>> >    
>> >    error= get_int(cs, from, len, &rnd, UINT_MAX16, INT_MIN16, INT_MAX16);
>> >    store_tmp= unsigned_flag ? (int) (ulonglong) rnd : (int) rnd;
>> > -#ifdef WORDS_BIGENDIAN
>> > -  if (table->s->db_low_byte_first)
>> > -  {
>> > +  if (ARCH_BIGENDIAN && table->s->db_low_byte_first)
>> >      int2store(ptr, store_tmp);
>> > -  }
>> 
<cut>

Sergei> As we agreed on IRC, I won't revert this change, but instead remove
Sergei> ARCH_BIGENDIAN and db_low_byte_first in a separate patch. Will push both
Sergei> together.

Good!

>> > @@ -4822,136 +4682,136 @@ timestamp_auto_set_type Field_timestamp:
>> >    }
>> >  }
>> >  
>> > +long Field_timestamp::get_timestamp(ulong *sec_part) const
>> > +{
>> > +  ASSERT_COLUMN_MARKED_FOR_READ;
>> > +  *sec_part= 0;
>> > +  if (ARCH_BIGENDIAN && table && table->s->db_low_byte_first)
>> > +    return sint4korr(ptr);
>> > +  long tmp;
>> > +  longget(tmp,ptr);
>> > +  return tmp;
>> > +}
>> >  
>> Shouldn't this be of type 'my_time_t' for completeness?

Sergei> I think 'long' is kind of safer, as these types use longget() anyway,
Sergei> and being explicit about the type would help to avoid unreasonable
Sergei> assumptions and expectations.

Using long will tempt callers of this function to use long as the
type of the variable that stores the value, instead of the my_time_t
they should use.

I don't think that any caller of get_timestamp() should know how the
timestamp is stored.  If we ever change the representation, there will
be fewer changes if things are right from the start.

<cut>

>> > -void Field_timestamp::set_time()
>> > +int Field_timestamp::set_time()
>> >  {
>> >    THD *thd= table ? table->in_use : current_thd;
>> > -  long tmp= (long) thd->query_start();
>> >    set_notnull();
>> > -  store_timestamp(tmp);
>> > +  store_TIME(thd->query_start(), 0);
>> > +  return 0;
>> >  }
>> 
>> The set_time() function could be a void as it can never fail.
>> (Yes, I agree that it means we have to test the type of field in the
>> event code, but that's something we should do at event startup anyway...)

Sergei> I may do it after a merge - I've seen that Alik added this test to
Sergei> MySQL.

In other words, I have to do this when I do the merge...

>> > +static void store_native(ulonglong num, uchar *to, uint bytes)
>> >  {
>> > +  switch(bytes) {
>> > +  case 1: *to= (uchar)num;              break;
>> > +  case 2: shortstore(to, (ushort)num);  break;
>> > +  case 3: int3store(to, num); /* Sic!*/ break;
>> > +  case 4: longstore(to, (ulong)num);    break;
>> > +  case 8: longlongstore(to, num);       break;
>> > +  default: DBUG_ASSERT(0);
>> > +  }
>> > +}
>> 
>> What would be even better is if each field had pointers to C
>> functions with the right read and write functions; that would avoid a
>> switch for every call...

Sergei> Mats Kindahl benchmarked and blogged about it.
Sergei> switch is about 20% slower than function pointers if the function is
Sergei> big. But it is several times faster than function pointers if the
Sergei> function (with the switch) is small and can be inlined.

If 'bytes' were a constant, I would agree with you.
Here it can't be efficiently inlined, as the callers don't use a
constant for bytes.

Inlining any function (especially one that has jumps) that generates
more than 60 bytes of code in many places will make the code slower,
not faster (it takes more memory in the CPU caches, more memory for
branch prediction, etc.).

Having one function that is called by many threads uses less CPU
memory and is because of this usually faster than inlining (when
inlining can't remove code through the use of constants).



>> > +  if (!tmp)
>> > +    return fuzzydate & TIME_NO_ZERO_DATE;
>> 
>> should be:
>> 
>> > +    return (fuzzydate & TIME_NO_ZERO_DATE) != 0;
>> 
>> as the function is a bool function

Sergei> no need to because the function is a bool function. If it were my_bool,
Sergei> then it would be needed.

Please add it, as otherwise we may get compiler warnings about loss of
precision when converting to bool.

<cut>

>> > @@ -7048,6 +7101,8 @@ Item_cache* Item_cache::get_cache(const
>> >      return new Item_cache_str(item);
>> >    case ROW_RESULT:
>> >      return new Item_cache_row();
>> > +  case TIME_RESULT:
>> > +    return new Item_cache_int(MYSQL_TYPE_DATETIME);
>> 
>> Shouldn't this be a double or decimal as the time can contain microseconds?
>> 
>> Looks like a potential bug!

Sergei> No :)
Sergei> It's an integer that stores packed datetime value.
Sergei> This is the value we pass around and compare with.

Then I assume you added a comment about this...

>> > === modified file 'sql/item.h'

<cut>

>> I also noticed that you have a Field->cmp_type() but there is no
>> Item_field->cmp_type() mapped to Field->cmp_type().
>> As I assume that for all fields the following should hold:
>> 
>> Item_field->cmp_type() == Item_field->field->cmp_type()

Sergei> Yes.

Then I assume you have now added Item_field->cmp_type() as above.


>> > @@ -3031,7 +3022,7 @@ class Item_cache_int: public Item_cache
>> >  protected:
>> >    longlong value;
>> >  public:
>> > -  Item_cache_int(): Item_cache(),
>> > +  Item_cache_int(): Item_cache(MYSQL_TYPE_LONGLONG),
>> >      value(0) {}
>> >    Item_cache_int(enum_field_types field_type_arg):
>> >      Item_cache(field_type_arg), value(0) {}
>> > @@ -3043,8 +3034,13 @@ class Item_cache_int: public Item_cache
>> >    String* val_str(String *str);
>> >    my_decimal *val_decimal(my_decimal *);
>> >    enum Item_result result_type() const { return INT_RESULT; }
>> > -  bool result_as_longlong() { return TRUE; }
>> >    bool cache_value();
>> > +  Item *clone_item()
>> > +  {
>> > +    Item_cache_int *item= new Item_cache_int(cached_field_type);
>> > +    item->store(this, value);
>> > +    return item;
>> > +  }
>> >  };
>> 
>> I don't understand why Item_cache_int() needs a clone_item() when
>> Item_cache_real() or Item_cache_decimal() doesn't have one.
>> 
>> If this was needed, please check if the other Item_cache_xxx()
>> functions needs it.

Sergei> It only needed it because of this special usage, as a holder of packed
Sergei> datetime value. get_datetime_value() creates it to cache converted
Sergei> constant datetime. The existence of clone() method tells the server that
Sergei> this item is a proper constant and there's no need to optimize it
Sergei> further. Other Item_cache_* objects are never used like that.

Thanks for the explanation. Please add it as a comment to clone_item().

>> > === modified file 'sql/item_cmpfunc.cc'
>> <cut>
>> 
>> > @@ -1026,9 +761,14 @@ Item** Arg_comparator::cache_converted_c
>> >                                                  Item **cache_item,
>> >                                                  Item_result type)
>> >  {
>> > -  /* Don't need cache if doing context analysis only. */
>> > +  /*
>> > +    Don't need cache if doing context analysis only.
>> > +    Also, get_datetime_value creates Item_cache internally.
>> > +    Unless fixed, we should not do it here.
>> > +  */
>> >    if (!thd_arg->is_context_analysis_only() &&
>> > -      (*value)->const_item() && type != (*value)->result_type())
>> > +      (*value)->const_item() && type != (*value)->result_type() &&
>> > +      type != TIME_RESULT)
>> >    {
>> >      Item_cache *cache= Item_cache::get_cache(*value, type);
>> >      cache->setup(*value);
>> 
>> Why don't we cache it there instead of doing it in get_datetime_value()?
>> This would save us one 'if' and 3 tests for every call to
>> get_datetime_value().
>> 
>> You could here call get_datetime_value() and use this value in the
>> cache.  If this doesn't easily work (as indicated on IRC), at least
>> expand the comment to explain the issue so that we can later try to
>> fix it in a better way.
>> 
>> > +      if (args[i]->cmp_type() == TIME_RESULT)
>> >        {
>> > -        datetime_found= TRUE;
>> > +        if (args[i]->field_type() != MYSQL_TYPE_TIME ||
>> > +            (args[i]->field_type() == MYSQL_TYPE_TIME && compare_as_dates==0))
>> > +            compare_as_dates= args[i];
>> >          continue;
>> >        }
>> >      }
>> >    }
>> 
>> I find it strange that you set compare_as_dates to the last
>> non-MYSQL_TYPE_TIME column, as this is used to give warnings and the
>> warning may in theory come from some other field.
>> I tried to make a test case to show the problem but didn't succeed :(

Sergei> Yes, the warning may come from some other field. But 'compare_as_dates'
Sergei> field affects how we will interpret that other field.  For example, if
Sergei> you compare a string and a time - the string will be parsed as a time
Sergei> value, and the warning will say "incorrect TIME value".
Sergei> If you compare a string and a datetime - the warning will say "incorrect
Sergei> DATETIME value". If you compare a string, a time, and a datetime (this
Sergei> is BETWEEN, remember? Three operands), the string will be parsed as
Sergei> DATETIME.

Do we have test cases for all this?
If not, it would be good to have all combinations covered!

>> > === modified file 'sql/item_func.cc'
>> > --- sql/item_func.cc	2010-11-23 13:08:10 +0000
>> > +++ sql/item_func.cc	2011-03-19 13:54:46 +0000
>> <cut>
>> > +bool Item_func_min_max::get_date(MYSQL_TIME *ltime, uint fuzzy_date)
>> >  {
>> >    longlong UNINIT_VAR(min_max);
>> > +  DBUG_ASSERT(fixed == 1);
>> > +  if (!compare_as_dates)
>> > +    return Item_func::get_date(ltime, fuzzy_date);
>> 
>> Add a comment in which case the above may not be true.

Sergei> But it's a normal situation.
Sergei> For all val_* we consider all cases.
Sergei> You don't want me to add a similar comment for case INT_RESULT in
Sergei> val_str(), for example, do you? And for all other combinations.

Yes, but normally we do things based on result_type, not on a local
variable that is set in some unknown manner.

I needed it for the above because I did not understand the code, even
after looking around for a short time. For any such code, I want a comment.

>> > === modified file 'sql/item_timefunc.cc'

>> > @@ -2050,42 +1845,14 @@ bool Item_func_from_unixtime::get_date(M
>> >  void Item_func_convert_tz::fix_length_and_dec()
>> >  {
>> >    collation.set(&my_charset_bin);
>> > -  decimals= 0;
>> > -  max_length= MAX_DATETIME_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
>> > +  max_length= MAX_DATETIME_WIDTH;
>> 
>> Should probably be:
>> max_length= MAX_DATETIME_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
>> 
>> This is how all the other functions are defined.

Sergei> MY_CHARSET_BIN_MB_MAXLEN is always 1. It is assumed and hardcoded in so
Sergei> many places in the code (like buffer sizes, memory allocations, using strlen,
Sergei> byte-per-byte string processing and so on) that it's practically
Sergei> impossible to change. And doing it makes no sense either.

Sergei> I've asked Bar, he said that this define (MY_CHARSET_BIN_MB_MAXLEN)
Sergei> was removed from MySQL sources some time ago. And that it's meaningless
Sergei> and should've never been added in the first place.

It's still in the current MariaDB source and I would prefer to have
everything done the same way.  How things will be done in the future
is not an argument for current code (if we are not making the change
in all relevant places).

Anyway, not critical to change.

<cut>

>> For some examples where this now goes strange:
>> 
>> dayofyear(timediff('0000-00-01 00:00:05', '0000-00-01 00:00:05'))  -> NULL
>> dayofyear(timediff('0000-00-01 00:00:05', '0000-00-01 00:00:04'))  -> 0
>> dayofyear(cast('0000-00-01 00:00:05' as datetime))                 -> NULL

Sergei> Which means that dayofyear() needs to require both TIME_NO_ZERO_DATE and
Sergei> TIME_NO_ZERO_IN_DATE. I'll fix that.

Sergei> Alternatively, we can say that TIME_NO_ZERO_DATE implicitly includes
Sergei> TIME_NO_ZERO_IN_DATE.

The latter is probably ok.

>> > === modified file 'sql/log_event.cc'
>> 
>> > @@ -1000,16 +1004,27 @@ class Log_event
>> >    { return 0; }
>> >    virtual bool write_data_body(IO_CACHE* file __attribute__((unused)))
>> >    { return 0; }
>> > -  inline time_t get_time()
>> > +  inline my_time_t get_time()
>> >    {
>> >      THD *tmp_thd;
>> >      if (when)
>> >        return when;
>> >      if (thd)
>> > +    {
>> > +      when= thd->start_time;
>> > +      when_sec_part= thd->start_time_sec_part;
>> > +      return when;
>> > +    }
>> 
>> thd should always be set.  Can you at some point in time check when
>> this is not the case and, if needed, set thd for the class!
>> 
>> >      if ((tmp_thd= current_thd))
>> 
>> current_thd() should never fail. If it can, there should be a comment
>> in the code about this!

Sergei> This is your code :)
Sergei> http://bazaar.launchpad.net/~maria-captains/maria/5.1/revision/2476.293.2
Sergei> And there was a comment that disappeared in that change.
Sergei> It used to be

Sergei>      THD *thd= current_thd;
Sergei>      /* thd will only be 0 here at time of log creation */
Sergei>      now= thd ? thd->start_time : my_time(0);

And in my code there was a comment, which was what I requested ;)

I assume you added it back.

>> > === modified file 'sql/mysql_priv.h'

<cut>

>> > @@ -3045,10 +3045,10 @@ void sys_var_microseconds::set_default(T
>> >  uchar *sys_var_microseconds::value_ptr(THD *thd, enum_var_type type,
>> >                                            LEX_STRING *base)
>> >  {
>> > -  thd->tmp_double_value= (double) ((type == OPT_GLOBAL) ?
>> > +  thd->sys_var_tmp.double_value= (double) ((type == OPT_GLOBAL) ?
>> >                                     global_system_variables.*offset :
>> >                                     thd->variables.*offset) / 1000000.0;
>> 
>> Change 1000000.0 to the same defined constant as 1e6

Sergei> No. I have a constant for "precision of my_hrtime_t", it's 1000000,
Sergei> which means "microseconds", but it does not have to be. We can have
Sergei> my_hrtime_t storing, for example, hundreds of nanoseconds.

Sergei> I have a constant for "precision of second_part", it's also 1000000,
Sergei> which means second_part is microseconds, but it doesn't have to be
Sergei> either.

Sergei> But I won't introduce a constant for "number of microseconds in a
Sergei> second".

And is there any reason why the above function returns microseconds?
Shouldn't this depend on the precision that we can provide?  In other
words, if we change my_hrtime_t to use nanoseconds, isn't it likely
that sys_var_microseconds needs to be renamed, and the above constant
too?

>> > === modified file 'sql/sql_select.cc'
>> 
>> > @@ -9455,7 +9445,7 @@ test_if_equality_guarantees_uniqueness(I
>> >  {
>> >    return r->const_item() &&
>> >      /* elements must be compared as dates */
>> > -     (Arg_comparator::can_compare_as_dates(l, r, 0) ||
>> > +     (l->cmp_type() == TIME_RESULT ||
>> >        /* or of the same result type */
>> >        (r->result_type() == l->result_type() &&
>> >         /* and must have the same collation if compared as strings */
>> 
>> Add a comment:
>> 
>> /*
>> It's enough to test 'l' for TIME_RESULT as only 'l' can be a field
>> in this context.
>> */

Sergei> No, this was not why I changed it.
Sergei> It was, in fact, a bug. Cases where 'r' had TIME_RESULT, but 'l' did not
Sergei> have it, may cause test_if_equality_guarantees_uniqueness() to return
Sergei> false positives. Like:

Sergei>   create table t1 (a varchar(100));
Sergei>   insert t1 values ('2010-01-02'), ('2010-1-2'), ('20100102');
Sergei>   select distinct t1 from t1 where a=date('2010-01-02');

Sergei> This fails in 5.1 and works correctly in 5.1-micro.

Still, the code needs a comment on why we only test 'l' and not 'r'.

Regards,
Monty


