maria-developers team mailing list archive

Thread
Date
Re: 52f489ebccb: MDEV-29069 follow-up: support partially suitable keys

To: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Tue, 18 Oct 2022 00:00:57 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <CAOByAQUnbHiXkOiOiV73aCixVn-8i+gYk4FOpKkhMyBTAKK2OQ@mail.gmail.com>
Hi, Nikita,

On Oct 17, Nikita Malyavin wrote:
> Hello Sergei!
> 
> On Tue, 4 Oct 2022 at 21:02, Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
> 
> > Hi, Nikita,
> >
> > This review applies to the combined diff e2f8dff^..52f489e
> >
> > On Oct 04, Nikita Malyavin wrote:
> >
> > > diff --git a/sql/log_event.h b/sql/log_event.h
> > > index 91a9eeee10f..8452e665c98 100644
> > > --- a/sql/log_event.h
> > > +++ b/sql/log_event.h
> > > @@ -5275,8 +5278,11 @@ class Rows_log_event : public Log_event
> > >    uchar    *m_key;      /* Buffer to keep key value during searches */
> > >    KEY      *m_key_info; /* Pointer to KEY info for m_key_nr */
> > >    uint      m_key_nr;   /* Key number */
> > > +  uint      m_key_parts_suit; /* A number of key_parts suited to lookup
> > */
> >
> > that's very confusing name. Took me a while to understand what you meant.
> > Better call it m_usable_key_parts
> >
> > >    bool master_had_triggers;     /* set after tables opening */
> > >
> > > +  uint key_parts_suit_event(const KEY *key) const;
> >
> > and this could be "get_usable_key_parts()"
> 
> Okay, no problem. Both sound suitable, or usable, for me:)

and is similar to KEY::usable_key_parts

> > > diff --git a/sql/log_event_server.cc b/sql/log_event_server.cc
> > > index 422d496d5c3..25705d13641 100644
> > > --- a/sql/log_event_server.cc
> > > +++ b/sql/log_event_server.cc
> > > @@ -6062,7 +6065,7 @@ int Rows_log_event::do_apply_event(rpl_group_info *rgi)
> > >      */
> > >      sql_mode_t saved_sql_mode= thd->variables.sql_mode;
> > >      if (!is_auto_inc_in_extra_columns())
> > > -      thd->variables.sql_mode= (rpl_data.is_online_alter() ? saved_sql_mode :0)
> > > +      thd->variables.sql_mode= (m_online_alter ? saved_sql_mode :0)
> > >                                 | MODE_NO_AUTO_VALUE_ON_ZERO;
> >
> > wouldn't it be more correct to use saved_sql_mode also in normal
> > replication?
> 
> using MODE_NO_AUTO_VALUE_ON_ZERO was a fix to Bug #56662  Assertion failed:
> next_insert_id == 0, file .\handler.cc.
> See commit d5bf6b8aa8bb

No, I didn't mean to ask "why MODE_NO_AUTO_VALUE_ON_ZERO", I meant to
ask why not

  thd->variables.sql_mode= saved_sql_mode | MODE_NO_AUTO_VALUE_ON_ZERO;

that is why not "to use saved_sql_mode also in normal replication"

> > > +  uint p= 0;
> > > +  for (;p < key->ext_key_parts; p++)
> > > +  {
> > > +    Field *f= key->key_part[p].field;
> > > +    uint non_deterministic_default= f->default_value &&
> > > +                     f->default_value->flags | VCOL_NOT_STRICTLY_DETERMINISTIC;
> > > +
> > > +    int next_number_field= f->table->next_number_field ?
> > > +                           f->table->next_number_field->field_index : -1;
> >
> > eh?
> > 1. why it's *in the loop* ?
> > 2. why not simply
> >
> >       || f == f->table->next_number_field
> >
> >    or, to be extra safe
> >
> >       || f->table->field[f->field_index] == f->table->next_number_field
> >
> Thanks!
> Turns out there's no need in extra safety, it just works.

by "extra safety" I meant the case when

  f != f->table->field[f->field_index]

(here f= key->key_part[p].field). This is the case for varchar/blob
prefix keys. But as they cannot be auto-increment anyway, the direct
comparison should work, I suppose.

> > > +
> > > +    if (f->field_index >= master_columns
> >
> > I don't think this condition ^^^ is needed. Same as elsewhere,
> > if f->field_index < master_columns but !bitmap_is_set,
> > does it mean the keyseg is ok? I don't think so.
> >
> It *is* needed.
> 
> An extra field with default can be used in the key lookup. A usual
> field should have an explicit value. If it's not, it means it's not
> sent (MINIMAL/NOBLOB) and we don't know what real value is

I don't understand how this answers my question. Let me rephrase.
What will break if you remove the condition

  f->field_index >= master_columns &&

?

> > > +        && !bitmap_is_set(&m_table->has_value_set, f->field_index)
> > > +        && (non_deterministic_default || next_number_field == f->field_index))
> > > +      break;
> > > +  }
> > > +  return p;
> > > +}
> > >
> > >  /**
> > >    Find the best key to use when locating the row in @c find_row().
> > > @@ -8024,13 +8077,26 @@ static bool record_compare(TABLE *table)
> > >  */
> > >  int Rows_log_event::find_key()
> > >  {
> > > -  uint i, best_key_nr, last_part;
> > > +  uint i, best_key_nr;
> > >    KEY *key, *UNINIT_VAR(best_key);
> > >    ulong UNINIT_VAR(best_rec_per_key), tmp;
> > >    DBUG_ENTER("Rows_log_event::find_key");
> > >    DBUG_ASSERT(m_table);
> > >
> > > +  #ifdef DBUG_TRACE
> >
> > #ifndef DBUG_OFF
> >
> Pity we have a double negation in similar places, I tried to avoid it and
> came across that macro

Yes, `#ifndef DBUG_OFF` is historical, and unfortunately this is how it
should be checked. DBUG_TRACE is something Marko added recently and it's
when you want to enable DBUG_PRINT, as far as I remember, but disable
the rest of DBUG_* macros. You have DBUG_EXECUTE_IF, so it shouldn't be
enabled by DBUG_TRACE.

> > > +  auto _ = make_scope_exit([this]() {
> > > +    DBUG_EXECUTE_IF("rpl_report_chosen_key",
> > > +                    push_warning_printf(m_table->in_use,
> > > +                                        Sql_condition::WARN_LEVEL_NOTE,
> > > +                                        ER_UNKNOWN_ERROR, "Key chosen: %d",
> > > +                                        m_key_nr == MAX_KEY ?
> > > +                                        -1 : m_key_nr););
> > > +  });
> > > +  #endif
> > > +
> > > +  m_key_nr= MAX_KEY;
> > >    best_key_nr= MAX_KEY;
> > > +  uint parts_suit= 0;
> > >
> > > @@ -8041,33 +8107,38 @@ int Rows_log_event::find_key()
> > >    {
> > >      if (!m_table->s->keys_in_use.is_set(i))
> > >        continue;
> > > +    parts_suit= key_parts_suit_event(key);
> > > +    if (!parts_suit)
> > > +      continue;
> > >      /*
> > >        We cannot use a unique key with NULL-able columns to uniquely identify
> > >        a row (but we can still select it for range scan below if nothing better
> > >        is available).
> > >      */
> > > -    if ((key->flags & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME)
> > > +    if ((key->flags & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME &&
> > > +        parts_suit == key->user_defined_key_parts)
> >
> > this doesn't make use of ext key parts. Should be something like
> >
> >   if (parts_suit == key->ext_key_parts ||
> >       (parts_suit >= key->user_defined_key_parts && (key->flags & ....)))
>
> why not just use ext_key_parts (which I'm using now btw)?

Now, you mean, in the next patch? I'll check it out.

The logic of my if() above is "either it uses the extended key (which is
always truly unique) - or it uses only user specified key parts and has
HA_NOSAME but no HA_NULL_PART_KEY".

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx
Follow ups

Re: 52f489ebccb: MDEV-29069 follow-up: support partially suitable keys
From: Nikita Malyavin, 2022-10-17
References

Re: 52f489ebccb: MDEV-29069 follow-up: support partially suitable keys
From: Sergei Golubchik, 2022-10-04
Re: 52f489ebccb: MDEV-29069 follow-up: support partially suitable keys
From: Nikita Malyavin, 2022-10-17