← Back to team overview

maria-developers team mailing list archive

Re: [GSoC] self-tuning optimizer

 

Hi, Anshu!

On May 25, Anshu Avinash wrote:
> Hi all,
> 
> You can find my this week's blog entry at
> http://igniting.in/gsoc2014/2014/05/25/coding-things-up/ . I have created a
> branch on launchpad for my work:
> http://bazaar.launchpad.net/~igniting/maria/maria/revision/4211 . You can
> give your suggestions/reviews either on this thread or as a comment on the
> blog itself.

I've got used to launchpad, where one can subscribe to a branch and get
all pushes by email - so that I could simply reply to these emails.

It doesn't seem to work this way on guthub :(

Anyway...

> diff --git a/mysql-test/t/costmodel.test b/mysql-test/t/costmodel.test
> --- /dev/null
> +++ b/mysql-test/t/costmodel.test
> @@ -0,0 +1,9 @@
> +--disable_warnings
> +DROP TABLE IF EXISTS t1;
> +--enable_warnings
> +
> +CREATE TABLE t1 (a INT);
> +INSERT INTO t1 VALUES (1);
> +SELECT * FROM t1;

This test doesn't seem to do anything with your changes in the code.
it doesn't test new constants, that you've added.

> +
> +DROP TABLE t1;
> diff --git a/scripts/mysql_system_tables.sql b/scripts/mysql_system_tables.sql
> --- a/scripts/mysql_system_tables.sql
> +++ b/scripts/mysql_system_tables.sql
> @@ -229,3 +229,10 @@ CREATE TABLE IF NOT EXISTS index_stats (db_name varchar(64) NOT NULL, table_name
>  -- we avoid mixed-engine transactions.
>  set storage_engine=@orig_storage_engine;
>  CREATE TABLE IF NOT EXISTS gtid_slave_pos (domain_id INT UNSIGNED NOT NULL, sub_id BIGINT UNSIGNED NOT NULL, server_id INT UNSIGNED NOT NULL, seq_no BIGINT UNSIGNED NOT NULL, PRIMARY KEY (domain_id, sub_id)) comment='Replication slave GTID position';
> +
> +-- Tables for Self Tuning Cost Optimizer
> +
> +CREATE TABLE IF NOT EXISTS all_constants (const_name varchar(64) NOT NULL, const_value double NOT NULL, PRIMARY KEY (const_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Constants for optimizer';

Uhm... I don't particularly like the "all_constants" name,
we don't have other tables or data structures that are called like that.

What about "optimizer_cost_factors" ?
Same applies to the code changes below
(although, in the code it could be simply "cost_factors", for brevity)

> +
> +-- Remember for later if all_constants table already existed
> +set @had_all_constants_table= @@warning_count != 0;
> diff --git a/sql/opt_costmodel.h b/sql/opt_costmodel.h
> --- /dev/null
> +++ b/sql/opt_costmodel.h
> @@ -0,0 +1,15 @@
> +/* Interface to get constants */
> +
> +#ifndef _opt_costmodel_h

SQL_OPT_COSTMODEL_INCLUDED please, not _opt_costmodel_h

> +#define _opt_costmodel_h
> +
> +enum enum_all_constants_col
> +{
> +  ALL_CONSTANTS_CONST_NAME,
> +  ALL_CONSTANTS_CONST_VALUE
> +};

You can move the enum into the opt_costmodel.cc

This opt_costmodel.h file is the interface - other source files includes
it to get access to your public API. But you don't want them to read
your table directly, so they don't need to know how the fields are mapped.

I mean, this enum is part of the implementation, your internal stuff,
not the public API that others need to see or care about.

> +
> +double get_read_time_factor(THD *thd);
> +double get_scan_time_factor(THD *thd);

On the other hand, these two are API functions all right.

> +
> +#endif /* _opt_costmodel_h */
> diff --git a/sql/opt_costmodel.cc b/sql/opt_costmodel.cc
> --- /dev/null
> +++ b/sql/opt_costmodel.cc
> @@ -0,0 +1,178 @@
> +#include "sql_base.h"
> +#include "key.h"
> +#include "opt_costmodel.h"
...
> +static
> +inline int open_table(THD *thd, TABLE_LIST *table,
> +                      Open_tables_backup *backup,
> +                      bool for_write)
> +{
> +  init_table_list(table, for_write);
> +  init_mdl_requests(table);

better use table->init_one_table() instead

> +  return open_system_tables_for_read(thd, table, backup);
> +}
> +
> +class All_constants
> +{
> +private:
> +  /* Handler used for retrieval of all_constants table */
> +  handler *all_constants_file;
> +  /* Table to read constants from or to update/delete */
> +  TABLE *all_constants_table;
> +  /* Length of the key to access all_constants table */
> +  uint key_length;
> +  /* Number of the keys to access all_constants table */
> +  uint key_idx;
> +  /* Structure for the index to access all_constants table */
> +  KEY *key_info;
> +  /* Record buffers used to access/update all_constants table */
> +  uchar *record[2];

record buffers are in the TABLE already, no need to duplicate them here

> +
> +  LEX_STRING const_name;
> +
> +  Field *const_name_field;
> +  Field *const_value_field;
> +
> +  double const_value;
> +
> +public:
> +  All_constants(TABLE *tab, LEX_STRING name)
> +    :all_constants_table(tab), const_name(name)
> +  {
> +    all_constants_file= all_constants_table->file;
> +    /* all_constants table has only one key */
> +    key_idx= 0;
> +    key_info= &all_constants_table->key_info[key_idx];
> +    key_length= key_info->key_length;
> +    record[0]= all_constants_table->record[0];
> +    record[1]= all_constants_table->record[1];
> +    const_name_field= all_constants_table->field[ALL_CONSTANTS_CONST_NAME];
> +    const_value_field= all_constants_table->field[ALL_CONSTANTS_CONST_VALUE];

Why are you copying all this from the TABLE into your class?

> +    const_value= 1.0;
> +  }
> +
> +  /** 
> +    @brief
> +    Set the key fields for the all_constants table
> +
> +    @details
> +    The function sets the value of the field const_name
> +    in the record buffer for the table all_constants.
> +    This field is the primary key for the table.
> +
> +    @note
> +    The function is supposed to be called before any use of the  
> +    method find_const for an object of the All_constants class. 
> +  */
> +
> +  void set_key_fields()
> +  {
> +    const_name_field->store(const_name.str, const_name.length, system_charset_info);
> +  }
> +
> +  /**
> +    @brief
> +    Find a record in the all_constants table by a primary key
> +
> +    @details
> +    The function looks for a record in all_constants by its primary key.
> +    It assumes that the key fields have been already stored in the record 
> +    buffer of all_constants table.
> +
> +    @retval
> +    FALSE     the record is not found
> +    @retval
> +    TRUE      the record is found
> +  */
> +
> +  bool find_const()
> +  {
> +    uchar key[MAX_KEY_LENGTH];
> +    key_copy(key, record[0], key_info, key_length);
> +    return !all_constants_file->ha_index_read_idx_map(record[0], key_idx, key,
> +                                                      HA_WHOLE_KEY, HA_READ_KEY_EXACT);
> +  }
> +
> +  void read_const_value()
> +  {
> +    if (find_const())
> +    {
> +      Field *const_field= all_constants_table->field[ALL_CONSTANTS_CONST_VALUE];
> +      if(!const_field->is_null())
> +      {
> +        const_value= const_field->val_real();
> +      }
> +    }
> +  }
> +
> +  double get_const_value()
> +  {
> +    return const_value;
> +  }
> +
> +  ~All_constants() {}
> +};
> +
> +static double read_constant_from_table(THD *thd, const LEX_STRING const_name)
> +{
> +  TABLE_LIST table_list;
> +  Open_tables_backup open_tables_backup;
> +
> +  if (open_table(thd, &table_list, &open_tables_backup, FALSE))
> +  {
> +    thd->clear_error();
> +    return 1.0;
> +  }
> +
> +  All_constants all_constants(table_list.table, const_name);
> +  all_constants.set_key_fields();
> +  all_constants.read_const_value();
> +
> +  close_system_tables(thd, &open_tables_backup);
> +
> +  return all_constants.get_const_value();
> +}
> +
> +double get_read_time_factor(THD *thd)
> +{
> +  return read_constant_from_table(thd, read_time_factor);
> +}
> +
> +double get_scan_time_factor(THD *thd)
> +{
> +  return read_constant_from_table(thd, scan_time_factor);
> +}

No-no-no. *Every* time when an optimizer needs a constant you
open the table, search in the index, and close the table!
That's many times per query. Per *every* query!

What you need to do is add some kind of an initialization function, for
example

   init_cost_factors() or
   Cost_factors::init()  (if everything is in the Cost_factors class)

in this function/method you open the table, read everything in memory,
close the table. And then you never touch the table.

You only access constants from memory, like

  inline double Cost_factors::scan_time() { return scan_time; }
  inline double Cost_factors::read_time() { return read_time; }

At least until you implement the code that collects statistics and updates
cost factors accordingly.

Regards,
Sergei


Follow ups

References