maria-developers team mailing list archive

Thread
Date
Re: 0f47171: MDEV-14005 Remove need for Partition Key to be part of Primary Key.

To: maria-developers@xxxxxxxxxxxxxxxxxxx
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Wed, 6 Dec 2017 11:04:20 +0100
Cc: holyfoot@xxxxxxxxxxx
In-reply-to: <20171204134647.24F39140D94@nebo.localdomain>
User-agent: Mutt/1.7.2 (2016-11-26)
Hi, Alexey!

On Dec 04, Alexey Botchkov wrote:
> revision-id: 0f4717118e135c178fc1b12065e33ac8a5e21d0b (mariadb-10.3.2-82-g0f47171)
> parent(s): ddac2d7a1ef725361c7e83b819cb0bec72db024a
> committer: Alexey Botchkov
> timestamp: 2017-12-04 17:45:13 +0400
> message:
> 
> MDEV-14005 Remove need for Partition Key to be part of Primary Key.
> 
>         Now we check all partitions for the unique keys if necessary
>         in ha_partition::write_row and ha_partition::update_row.
> 
> diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> index 6bebf5d..12a6287 100644
> --- a/sql/ha_partition.cc
> +++ b/sql/ha_partition.cc
> @@ -372,6 +372,7 @@ void ha_partition::init_handler_variables()
>    part_share= NULL;
>    m_new_partitions_share_refs.empty();
>    m_part_ids_sorted_by_num_of_records= NULL;
> +  m_cu_errkey= (uint) -1;
>  
>  #ifdef DONT_HAVE_TO_BE_INITALIZED
>    m_start_key.flag= 0;
> @@ -4088,6 +4089,79 @@ void ha_partition::try_semi_consistent_read(bool yes)
>  }
>  
>  
> +int ha_partition::check_files_for_key(uchar *key, int n_key,
> +                                      int partition_to_skip, int *res)
> +{
> +//  uint num_subparts= part_info->num_subparts;
> +  for (uint n=0; n < m_part_info->num_parts; n++)
> +  {
> +    handler *f= m_file[n];
> +    int ires;
> +
> +    if ((int) n == partition_to_skip)
> +      continue;
> +
> +    f->active_index= n_key;
> +    if ((*res= f->extra(HA_EXTRA_KEYREAD)) ||
> +        (*res= f->ha_index_init(n_key, FALSE)))
> +      return 1;
> +
> +    *res= f->index_read_map(m_part_info->table->record[0],
> +        key, HA_WHOLE_KEY, HA_READ_KEY_EXACT);

use index_read_idx_map here

> +    ires= f->ha_index_end();
> +
> +    if (*res == HA_ERR_KEY_NOT_FOUND)
> +      continue;
> +
> +    if (*res == 0)
> +      *res= ires ? ires : HA_ERR_FOUND_DUPP_KEY;
> +
> +    m_last_part= n;
> +    m_cu_errkey= n_key;
> +    return 1;
> +  }
> +
> +  *res= 0;
> +  return 0;
> +}
> +
> +
> +int ha_partition::check_uniques_insert(uchar *buf,
> +                                       int partition_to_skip, int *res)
> +{
> +  uint n_key;
> +
> +  for (n_key=0; n_key < m_part_info->n_uniques_to_check; n_key++)
> +  {

can you use key_copy() here?

> +    uchar *cbuf= m_part_info->unique_key_buf[0];
> +    KEY *ckey= m_part_info->uniques_to_check[n_key];
> +    uint n;
> +
> +    for (n=0; n < ckey->user_defined_key_parts; n++)
> +    {
> +      const KEY_PART_INFO *cpart= ckey->key_part + n;
> +      uint maybe_null= MY_TEST(cpart->null_bit);
> +      Field *f= cpart->field;
> +      my_ptrdiff_t ofs= buf-table->record[0];
> +
> +      f->move_field_offset(ofs);
> +      if (maybe_null)
> +        cbuf[0]= f->is_null();
> +      f->get_key_image(cbuf+maybe_null, cpart->length, Field::itRAW);
> +      cbuf+= cpart->store_length;
> +      f->move_field_offset(-ofs);
> +    }
> +
> +    if (check_files_for_key(m_part_info->unique_key_buf[0], n_key,
> +          partition_to_skip, res))
> +      return 1;
> +  }
> +
> +  *res= 0;
> +  return 0;
> +}
> +
> +
>  /****************************************************************************
>                  MODULE change record
>  ****************************************************************************/
> @@ -4199,6 +4273,11 @@ int ha_partition::write_row(uchar * buf)
>      error= HA_ERR_NOT_IN_LOCK_PARTITIONS;
>      goto exit;
>    }
> +
> +  if (table->part_info->n_uniques_to_check &&
> +      check_uniques_insert(buf, part_id, &error))
> +    goto exit;

This would _probably_ introduce the race condition I've mentioned in a
comment to MDEV-14005.

Say, you have a table with two partitions and a unique key.

Now, two threads try to insert rows, the first thread inserts 1 into the
first partition (there are more columns in a table and a partitioning
key on those columns, but I omit these details for simplicity). And a
second thread inserts 1 into the second partition.

So, the first thread checks for 1 in the second partition, finds
nothing. Second thread checks for 1 in the first partition, finds
nothing. Then the first thread inserts 1 into the first partition,
second thread inserts 1 into the second partition.

Monty's idea was to insert optimistically - first you insert, then check
for duplicates. Not ideal, in the above example you can have both
threads getting a duplicate key error.

There could be other, e.g. locking, solutions too.

>    m_last_part= part_id;
>    DBUG_PRINT("info", ("Insert in partition %d", part_id));
>    start_part_bulk_insert(thd, part_id);

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx