← Back to team overview

maria-developers team mailing list archive

Re: open_binary_frm() decomposed an methods of TABLE_SHARE

 

Hi, Sanja!

On Oct 04, sanja@xxxxxxxxxxxx wrote:
> ------------------------------------------------------------
> revno: 3202
> revision-id: sanja@xxxxxxxxxxxx-20111004111459-1i44vpkdm1j3lisl
> parent: psergey@xxxxxxxxxxxx-20110929130743-yj0zl4hdvoydnmda
> committer: sanja@xxxxxxxxxxxx
> branch nick: work-maria-5.3-frm2
> timestamp: Tue 2011-10-04 14:14:59 +0300
> message:
>   open_binary_frm() decomposed an methods of TABLE_SHARE to allow
>   engine to create TABLE_SHARE.
>   
>   utype and geometry_type moved out of Field.

Thanks!
Here're my thoughts.

The summary is:

1. I'd like to have it simpler. It's going to be an external API for
   engines to use, after all.
2. Don't try to support all optimizations that open_binary_frm() uses,
   and old frm formats - keep that out of the API, open_binary_frm()
   can modify TABLE_SHARE directly to achieve the necessary effect.
3. Try to create a usage example - it will show the unnecessary
   complexity

See all the comments below:

> === modified file 'sql/handler.h'
> --- a/sql/handler.h	2011-07-14 13:44:37 +0000
> +++ b/sql/handler.h	2011-10-04 11:14:59 +0000
> @@ -298,6 +298,31 @@ enum legacy_db_type
>    DB_TYPE_FIRST_DYNAMIC=42,
>    DB_TYPE_DEFAULT=127 // Must be last
>  };
> +
> +/*
> +  We use three additional unireg types for TIMESTAMP to overcome limitation
> +  of current binary format of .frm file. We'd like to be able to support
> +  NOW() as default and on update value for such fields but unable to hold
> +  this info anywhere except unireg_check field. This issue will be resolved
> +  in more clean way with transition to new text based .frm format.
> +  See also comment for Field_timestamp::Field_timestamp().
> +*/
> +enum utype
> +{
> +  UT_NONE, UT_DATE, UT_SHIELD, UT_NOEMPTY, UT_CASEUP, UT_PNR,
> +  UT_BGNR, UT_PGNR, UT_YES, UT_NO, UT_REL,
> +  UT_CHECK, UT_EMPTY, UT_UNKNOWN_FIELD, UT_CASEDN,
> +  UT_NEXT_NUMBER, UT_INTERVAL_FIELD,
> +  UT_BIT_FIELD, UT_TIMESTAMP_OLD_FIELD, UT_CAPITALIZE,
> +  UT_BLOB_FIELD, UT_TIMESTAMP_DN_FIELD, UT_TIMESTAMP_UN_FIELD,
> +  UT_TIMESTAMP_DNUN_FIELD
> +};

Why wouldn't you remove all unused values from utype enum?

> +enum geometry_type
> +{
> +  GEOM_GEOMETRY = 0, GEOM_POINT = 1, GEOM_LINESTRING = 2, GEOM_POLYGON = 3,
> +  GEOM_MULTIPOINT = 4, GEOM_MULTILINESTRING = 5, GEOM_MULTIPOLYGON = 6,
> +  GEOM_GEOMETRYCOLLECTION = 7
> +};
>  /*
>    Better name for DB_TYPE_UNKNOWN. Should be used for engines that do not have
>    a hard-coded type value here.
> === modified file 'sql/table.cc'
> --- a/sql/table.cc	2011-09-02 12:10:10 +0000
> +++ b/sql/table.cc	2011-10-04 11:14:59 +0000
> @@ -675,6 +684,797 @@ err_not_open:
>  }
>  
>  
> +/**
> +  Set some general table parameters (should be called first after the
> +  TABLE SHARE initialization)
> +
> +  @param frm_ver         Version of the frm (FRM_VER_TRUE_VARCHAR)

no need to pass it as a parameter, set frm version always to the latest one.
if needed - overwrite it later from open_binary_frm().

don't add support for historical features into the API.

> +  @param db_create_opt   Flags of CREATE options
> +  @param avg_row_len     Average row length
> +  @param transact        CREATE option TRANSACTION
> +  @param page_chcksum    CREATE option PAGE_CHECK_SUM
> +  @param rtype           CREATE option ROW_TYPE (used by aria only)
> +  @param charset         default charset for the table.
> +  @param null_fld_first  Should be TRUE for modern frm

same as frm_version

> +  @param mi_rows         CREATE option MIN_ROWS
> +  @param ma_rows         CREATE option MAX_ROWS
> +  @param reclen          Record buffer length
> +  @param extra_rec_buf_len Extra record buffer length (???)
> +  @param key_blck_size   CREATE option KEY_BLOCK_SIZE
> +  @param new_fld_pck_flg Should be 2 for modern frm

same as frm_version

> +  @param error           Error code assigned here in case of error
> +
> +  @retval FALSE OK
> +  @retval TRUE  Error (error assigned)
> +*/
> +
> +bool TABLE_SHARE::set_init_parameters(uchar frm_ver, ulong mysql_ver,
> +                                      uint db_create_opt,
> +                                      ulong avg_row_len,
> +                                      enum ha_choice transact,
> +                                      enum ha_choice page_chcksum,
> +                                      enum row_type rtype,
> +                                      CHARSET_INFO *charset,
> +                                      bool null_fld_first,
> +                                      ha_rows mi_rows,
> +                                      ha_rows ma_rows,
> +                                      ulong reclen,
> +                                      uint extra_rec_buf_len,
> +                                      uint key_blck_size,
> +                                      uint new_fld_pck_flg,
> +                                      int &error)

use a pointer, not a reference. Or, better, just return the error code

> +{
> +  frm_version= frm_ver;
> +  mysql_version= mysql_ver;
> +  db_options_in_use= db_create_options= db_create_opt;
> +  if (db_create_options & HA_OPTION_LONG_BLOB_PTR)
> +    blob_ptr_size= portable_sizeof_char_ptr;
> +  avg_row_length= avg_row_len;
> +  transactional= transact;
> +  page_checksum= page_chcksum;
> +  row_type= rtype;
> +  table_charset= charset;
> +  null_field_first= null_fld_first;
> +  db_record_offset= 1;
> +  min_rows= mi_rows;
> +  max_rows= ma_rows;
> +  stored_rec_length= reclength= reclen;
> +  key_block_size= key_blck_size;
> +  new_field_pack_flag= new_fld_pck_flg;
> +  rec_buff_length= ALIGN_SIZE(reclength + 1 + extra_rec_buf_len);
> +  if (!(default_values= (uchar *) alloc_root(&mem_root,
> +                                             rec_buff_length)))
> +  {
> +    error= 4;
> +    return TRUE;
> +  }
> +  /* Default values */
> +  system= 0;
> +  crypted= 0;
> +
> +  return FALSE;
> +}
> +
> +
> +/**
> +  Allocate space for indices and its parts
> +
> +  @param nkeys           Number of keys
> +  @param nkey_parts      Number of key parts of all keys
> +  @param names_len       Sum of lengths of all indices names (string ends
> +                         are not counted)
> +  @param error           Error code assigned here in case of error\

return the error code instead

> +
> +  @retval FALSE OK
> +  @retval TRUE  Error (error assigned)
> +*/
> +
> +bool TABLE_SHARE::allocate_indices_space(uint nkeys, uint nkey_parts,
> +                                         uint names_len, int &error)
> +{
> +  uint n_length= (nkeys * sizeof(KEY) +
> +                  nkey_parts * sizeof(KEY_PART_INFO));
> +
> +  DBUG_ASSERT(key_info == NULL); // prevent reallocation
> +  DBUG_ASSERT(nkeys <= nkey_parts);
> +
> +  keys= nkeys;
> +  key_parts= nkey_parts;
> +  keys_for_keyread.init(0);
> +  keys_in_use.init(keys);
> +
> +
> +  if (!(rec_per_key= (ulong *) alloc_root(&mem_root,
> +                                          sizeof(ulong) * key_parts +
> +                                          n_length +
> +                                          (names_len + nkeys + 1))))
> +  {
> +    error= 4;
> +    return TRUE;                                   /* purecov: inspected */
> +  }
> +  key= key_info=
> +    my_reinterpret_cast(KEY*) (rec_per_key + key_parts);
> +  bzero((char*) key_info, n_length);
> +  first_key_part= key_part=
> +    my_reinterpret_cast(KEY_PART_INFO*) (key_info + keys);
> +  first_keyname= keyname=
> +    my_reinterpret_cast(char *) (first_key_part + key_parts);
> +  is_keys_allocated= TRUE;
> +  if (key_parts == 0)
> +  {
> +    DBUG_ASSERT(keys == 0);
> +    is_keynames_ready= is_keyparts_ready= TRUE;
> +    if (is_fieldnames_ready)
> +      parse_names();
> +    if (is_fields_ready &&
> +        connect_indexes_and_fields(error))
> +      return TRUE;
> +  }
> +  if (is_fields_allocated && final_fields_keys_allocation())
> +  {
> +    error= 4;
> +    return TRUE;
> +  }
> +  return FALSE;
> +}
> +
> +
> +/**
> +  Add index header
> +
> +  @param name            Name of the index (could be NULL if it is
> +                         already copied by add_index_names_as_onestring())
> +  @param name_len        Length of the index name (used only if name passed)
> +  @param flags           Key flags
> +  @param length          Length of the key (in its buffer)
> +  @param parts           Number of parts in the index
> +  @param alg             Index algorithm
> +  @param blk_size        Index block size
> +
> +  @retval pointer on the allocated index descriptor
> +*/
> +
> +KEY *TABLE_SHARE::add_index(char *name, uint name_len,
> +                            uint flags, uint length, uint parts,
> +                            enum ha_key_alg alg, uint blk_size)
> +{
> +  DBUG_ASSERT(key < (key_info + keys));
> +  DBUG_ASSERT(key->key_length == 0); // prevent redefinition
> +  DBUG_ASSERT(is_keys_allocated);
> +  key->flags= flags;
> +  key->key_length= length;

isn't key_length equal to the sum of keypart lengths?

> +  key->key_parts=  parts;
> +  key->algorithm=  alg;
> +  key->block_size= blk_size;
> +  if (name)
> +  {
> +    /*
> +      TODO: should be check bounds of the allocated string?
> +
> +      Check for illegal separating characters
> +    */
> +    DBUG_ASSERT(memchr(name, '\0', name_len) == NULL);
> +    DBUG_ASSERT(memchr(name, NAMES_SEP_CHAR, name_len) == NULL);
> +
> +    if (key == key_info)
> +      *(keyname++)= NAMES_SEP_CHAR;

why is that?

> +
> +    memcpy(keyname, name, name_len);
> +    keyname+= name_len;
> +    *(keyname++)= NAMES_SEP_CHAR;

please, don't do that. Don't ask the caller for the total length of all
key names, and don't allocate them in one alloc_root.
just set the pointer here: key->name= name.

the caller can allocate them in one array, or independently, or whatever -
don't bother enforcing anything specific here.

> +
> +    if (key == key_info + (keys - 1))
> +    {
> +      is_keynames_ready= TRUE;
> +      if (is_fieldnames_ready)
> +        parse_names();
> +    }
> +
> +    /* names should be all or none */
> +    DBUG_ASSERT(key == key_info || (key - 1)->name != NULL);
> +  }
> +  else
> +  {
> +    /* names should be all or none */
> +    DBUG_ASSERT(key == key_info || (key - 1)->name == NULL);
> +  }
> +  return key++;
> +}
> +
> +
> +/**
> +  Add part description to the index
> +
> +  @param keyinfo         The index where to add part
> +  @param fieldnr         Number of the field used in the index
> +  @param offset          Offset of the part in the key biffer

do you need that? The keypart starts where the previous one ends, doesn't it?

> +  @param keytype         Key type
> +  @param length          Length of the part
> +  @param flags           Flags of the part
> +  @param error           Error code assigned here in case of error

return the error instead

> +
> +  @retval FALSE OK
> +  @retval TRUE  Error (error assigned)
> +*/
> +
> +bool TABLE_SHARE::add_index_part(KEY *keyinfo, uint16 fieldnr,
> +                                 uint offset, uint16 keytype,
> +                                 uint16 length, uint16 flags,
> +                                 int &error)
> +{
> +  if (!keyinfo->key_part)
> +  {
> +    // first key part for the key
> +    keyinfo->key_part=	 key_part;
> +    keyinfo->rec_per_key= rec_per_key;

this can be done as
  key[i].rec_per_key=key[i-1].rec_per_key+key[i-1].key_parts
not in add_index_part() but later, in something like
finalize_share() - function that is called at the end. where you fix
all typelibs and stuff.

> +  }
> +  *rec_per_key++= 0;
> +  key_part->fieldnr= fieldnr;
> +  key_part->offset= offset;
> +  key_part->key_type= keytype;
> +  key_part->length= length;
> +  key_part->key_part_flag= flags;
> +  key_part->store_length= key_part->length; // will be fixed
> +  key_part++;
> +
> +  if (key_part == first_key_part + key_parts)
> +  {
> +    is_keyparts_ready= TRUE;
> +    if (is_fields_ready &&
> +        connect_indexes_and_fields(error))
> +      return TRUE;
> +  }

you don't need any is_fields_ready, etc
move all that code to finalize_share()
and let the caller to call it at the end, when all fields and keys
are parsed. No need to try to detect this moment automatically
with various cursors.

> +  return FALSE;
> +}
> +
> +/**
> +  Allocate space for field information
> +
> +  @param fields_count    Number of the fields
> +  @param null_flds       Number of the fields which could be NULL
> +  @param enum_set_count  Number of fields which are ENUM or SET
> +  @param enum_set_values_count Sum of number of values of all fields which
> +                               are ENUM or SET
> +  @param field_names_length Sum length of all fields names (string ends are not
> +                            counted)

don't ask for this (see above)

> +  @param enum_set_names_length Sum length of all values of all fields which
> +                               are ENUM or SET (string ends are not counted)
> +  @param com_length       Sum ength of the all field comments

nor for this, nor enum_set_count, nor enum_set_values_count, nor
enum_set_names_length

> +  @param handler_file     handler - used to check fields/indices features
> +  @param tbl_com_length   Length of the table comment (if present) or 0
> +  @param tbl_com          reference on the table comment (if present)

move table comment related parameters somewhere
out of "allocate_fields_space" method.

> +
> +  @retval FALSE OK
> +  @retval TRUE  Error (error assigned)
> +*/
> +
> +bool TABLE_SHARE::allocate_fields_space(uint fields_count,
> +                                        uint null_flds,
> +                                        uint enum_set_count,
> +                                        uint enum_set_values_count,
> +                                        uint field_names_length,
> +                                        uint enum_set_names_length,
> +                                        uint com_length,
> +                                        handler *handler_file,
> +                                        size_t tbl_com_length,
> +                                        char *tbl_com)
> +{
> +  stored_fields= fields= fields_count;
> +  vfields= 0;
> +  tmp_handler_file= handler_file;
> +  null_fields= null_flds;
> +  enum_set_values= enum_set_values_count;
> +  enums_sets= enum_set_count;
> +  field_names_length+= fields_count + 1; //add space for separators
> +  comments_length= com_length;
> +  /* add space for separators between names and intervals */
> +  enum_set_names_length+= enum_set_values_count + enum_set_count * 2;
> +  if (!(field_ptr= field = (Field **)
> +	alloc_root(&mem_root,
> +		   (uint) ((fields_count + 1) * sizeof(Field*) +
> +			   enum_set_count * sizeof(TYPELIB) +
> +                           field_names_length + enum_set_names_length +
> +                           comments_length))))
> +    return TRUE;                           /* purecov: inspected */
> +
> +  field_ptr[fields]= 0;  // End marker
> +  curr_interval= intervals= (TYPELIB*) (field + fields + 1);
> +  names= (char *) (intervals + enum_set_count);
> +  curr_interval_names= names + field_names_length;
> +  first_comment= comment_pos=
> +    (char *) (names + field_names_length + enum_set_names_length);
> +  if (!enum_set_count)
> +    intervals= 0;	 // For better debugging
> +  is_fields_allocated= TRUE;
> +  if (is_keys_allocated && final_fields_keys_allocation())
> +    return TRUE;
> +  comment.str= strmake_root(&mem_root, tbl_com,
> +                            (comment.length= tbl_com_length));
> +  /* init counters */
> +  null_bits_are_used= (null_fields != 0);
> +  if (null_field_first)
> +  {
> +    null_flags= null_pos= (uchar*) default_values;
> +    null_bit_pos= (db_create_options & HA_OPTION_PACK_RECORD) ? 0 : 1;
> +    /*
> +      null_bytes below is only correct under the condition that
> +      there are no bit fields.  Correct values is set below after the
> +      table struct is initialized
> +    */
> +    null_bytes= (null_fields + null_bit_pos + 7) / 8;
> +  }
> +#ifndef WE_WANT_TO_SUPPORT_VERY_OLD_FRM_FILES
> +  else
> +  {
> +    null_bytes= (null_fields + 7) / 8;
> +    null_flags= null_pos= (uchar*) (default_values + reclength - null_bytes);
> +    null_bit_pos= 0;
> +  }
> +#endif

we don't want to support anything but the latest frm version here.
all compatibility hacks belong to open_binary_frm directly.

> +  use_hash= fields >= MAX_FIELDS_BEFORE_HASH;
> +  if (use_hash)
> +    use_hash= !hash_init(&name_hash,
> +			 system_charset_info,
> +			 fields, 0, 0,
> +			 (hash_get_key) get_field_name, 0, 0);
> +
> +  return FALSE;
> +}
> +
> +bool TABLE_SHARE::final_fields_keys_allocation()

see above - move this and parse_names and the likes into 
one function, that the caller will call at the end.
don't try to detect when "the end" is.

> +{
> +  DBUG_ASSERT(is_fields_allocated && is_keys_allocated);
> +  if (!(interval_array=
> +        (const char **) alloc_root(&mem_root,
> +                                   (uint) ((fields + enum_set_values +
> +                                            keys + 3) * sizeof(char *)))))
> +    return TRUE;
> +  curr_interval_array= interval_array + fields + 1;
> +  return FALSE;
> +}
> +
> +/**
> +  Internal function triggeren when all names
> +  (indices, fields, enum and set) are allocated and copied
> +
> +
> +  @retval FALSE OK
> +  @retval TRUE  Error
> +*/
> +
> +bool TABLE_SHARE::parse_names()
> +{
> +  TYPELIB *interval;
> +  DBUG_ASSERT(is_keys_allocated && is_fields_allocated &&
> +              is_keynames_ready);
> +  fix_type_pointers(&interval_array, &fieldnames, 1, &names);
> +  if (is_fieldnames_ready)
> +    fix_type_pointers(&interval_array, intervals, enums_sets, &names);
> +  else
> +  {
> +    /* Intervals were passed to the add_field without preparation */
> +    names= curr_interval_names;
> +    interval_array= curr_interval_array;
> +    is_fieldnames_ready= TRUE;
> +  }
> +  if ((keyname= first_keyname))
> +    fix_type_pointers(&interval_array, &keynames, 1, &keyname);
> +  if (fieldnames.count != fields)
> +    return TRUE;
> +
> +  /* Set ENUM and SET lengths */
> +  for (interval= intervals;
> +       interval < intervals + enums_sets;
> +       interval++)
> +  {
> +    uint count= (uint) (interval->count + 1) * sizeof(uint);
> +    if (!(interval->type_lengths= (uint *) alloc_root(&mem_root, count)))
> +      return TRUE;
> +    for (count= 0; count < interval->count; count++)
> +    {
> +      char *val= (char*) interval->type_names[count];
> +      interval->type_lengths[count]= strlen(val);
> +    }
> +    interval->type_lengths[count]= 0;
> +  }
> +
> +  return FALSE;
> +}
> +
> +/**
> +  Add field description to the TABLE_SHARE
> +
> +  @param name            Field name
> +  @param pos             Field position in the record buffer

isn't it equal the end of the previous field?
(with special treatment for BIT)

> +  @param field_length    Length of the field
> +  @param pack_flag       Flags of the field

what's that?
I mean, I understand that it's some set of cryptic bits, but how would an
engine use that?

You know what, I think we're missing a usage example.
could you add it to the example engine (ha_example) support for discovery?
Something like that - a table with a fixed name, say, "example_discovered_table"
is created by discovery() on the example engine. That is, a fixed table name,
keep it as simple as possible. Fixed table structure.
Few fields, say, an int, primary key, and enum (Y,N) - or whatever.
because you haven't yet done the step 2 in our plan (extracting that part
of ALTER TABLE that converts TABLE_SHARE into Create_field), this
example discovery won't create frm files - it'll be completely in memory.
which is fine, for now.

> +  @param field_type      Field type
> +  @param charset         Caracter set infoemation of the field
> +  @param geom_type       Geometry data type (for geometry fields)
> +  @param auto_increment  Is the field auto increment
> +  @param enum_set        Definition of ENUM or SET (for appropriate type)

a complete typelib? probably it's an overkill. better to pass an array
of names, and you can build a typelib internally.

> +  @param comm_pos        Reference on the field commentary
> +  @param comment_length  Length of the field commentary
> +  @param error           Error code assigned here in case of error

as always, you can return the error code

> +
> +  @retval FALSE OK
> +  @retval TRUE  Error (error assigned)
> +*/
> +
> +Field **TABLE_SHARE::add_field(const char *name,
> +                               uchar *pos,
> +                               uint32 field_length,
> +                               uint pack_flag,
> +                               enum_field_types field_type,
> +                               CHARSET_INFO *charset,
> +                               geometry_type geom_type,
> +                               bool auto_increment,
> +                               TYPELIB* enum_set,
> +                               char *comm_pos,
> +                               uint comment_length,
> +                               int &error)
> +{
> +  DBUG_ASSERT(is_fields_allocated);
> +  utype unireg_type= get_unireg_type(auto_increment,
> +                                     !f_no_default(pack_flag),
> +                                     FALSE,
> +                                     FALSE, f_maybe_null(pack_flag),
> +                                     field_type);
> +
> +  return add_field_int(name, pos, field_length,
> +                       pack_flag, field_type,
> +                       charset, geom_type, unireg_type,
> +                       enum_set, comm_pos, comment_length,
> +                       NULL, TRUE, error);
> +}
> +
> +
> +/**
> +  Add field description to the TABLE_SHARE (internal)
> +
> +  @param name            Field name
> +  @param pos             Field position in the record buffer
> +  @param field_length    Length of the field
> +  @param pack_flag       Flags of the field
> +  @param field_type      Field type
> +  @param charset         Caracter set infoemation of the field
> +  @param geom_type       Geometry data type (for geometry fields)
> +  @param unireg_type     Unireg type (or check flags)
> +  @param auto_increment  Is the field auto increment
> +  @param enum_set        Definition of ENUM or SET (for appropriate type)
> +  @param comm_pos        Reference on the field commentary
> +  @param comment_length  Length of the field commentary
> +  @param vcol_info       Virtual column information
> +  @param fld_stored_in_db TRUE for normal fields and persistent virtual fields
> +  @param error           Error code assigned here in case of error
> +
> +  @retval FALSE OK
> +  @retval TRUE  Error (error assigned)
> +*/
> +
> +Field **TABLE_SHARE::add_field_int(const char *name,
> +                                   uchar *pos,
> +                                   uint32 field_length,
> +                                   uint pack_flag,
> +                                   enum_field_types field_type,
> +                                   CHARSET_INFO *charset,
> +                                   geometry_type geom_type,
> +                                   utype unireg_type,
> +                                   TYPELIB* enum_set,
> +                                   char *comm_pos,
> +                                   uint comment_length,
> +                                   Virtual_column_info *vcol_info,
> +                                   bool fld_stored_in_db,
> +                                   int &error)
> +{
> +  Field* reg_field;
> +  DBUG_ASSERT(is_fields_allocated);
> +  if (!is_fieldnames_ready)
> +  {
> +    if (enum_set)
> +    {
> +      copy_typelib(curr_interval,
> +                   &curr_interval_array, enum_set,
> +                   &curr_interval_names, NAMES_SEP_CHAR);

See? it's better to get an array of names from the caller and build the
typelib here - as you won't use the typelib that caller provided anyway

> +      enum_set= curr_interval++;
> +    }
> +    {
> +      uchar *tmp= (uchar*) strmov((char*) *curr_field_names, name);
> +      *tmp++= (uchar)NAMES_SEP_CHAR;
> +      *tmp= 0;
> +      name= curr_field_names;
> +      curr_field_names= (char*)tmp;
> +    }

just like with other names - let the caller handle the allocations,
you simply use the pointer that the caller provides.

> +  }
> +  reg_field = *field_ptr=
> +    make_field(this, pos,
> +               (uint32) field_length,
> +               null_pos, null_bit_pos,
> +               pack_flag,
> +               field_type,
> +               charset,
> +               geom_type,
> +               (utype) MTYP_TYPENR(unireg_type),

add the cast to MTYP_TYPENR definition and remove it here (and elsewhere)

> +               enum_set,
> +               name);
> +
> +  if (!reg_field)     // Not supported field type
> +  {
> +    error= 4;
> +    return NULL;
> +  }
> +  reg_field->field_index= (field_ptr - field);
> +  reg_field->vcol_info= vcol_info;
> +  reg_field->stored_in_db= fld_stored_in_db;
> +  DBUG_ASSERT(reg_field->field_index < fields);
> +  if (comm_pos && comment_length)
> +  {
> +    memcpy(comment_pos, comm_pos, comment_length);

same here and everywhere. I'll stop repeating "just use the pointer,
let the caller handle the allocation"

alternatively (here and everywhere) do alloc_root() - that is
the caller gives you a pointer, and you call alloc_root() anc copy
the data. Yes, it means that every string will be allocated in its
own chunk. And in the _int version of the function you use the pointers.
I mean something like that:

TABLE_SHARE::add_field(char *name, ...)
{
   ...
   return add_field_int(strdup_root(name), ... );

so, if one uses add_field, he does not need to think about memory
allocations, and does not need to calculate the sum of lengths,
and stuff like that.
while open_binary_frm() will use add_field_int() can will pass the pointers
directly into the big allocated chunk, as always.

The above applies to all strings, comments, enum/set, everything.

> +    DBUG_ASSERT(!is_comments_ready);
> +  }
> +  else
> +    DBUG_ASSERT(is_comments_ready || comment_length == 0);
> +  if (comment_length)
> +  {
> +    reg_field->comment.str= comment_pos;
> +    reg_field->comment.length= comment_length;
> +    comment_pos+= comment_length;

right, that's what I mean by "use the pointer directly, let the caller
to handle the allocations".

> +  }
> +  else
> +  {
> +    reg_field->comment.str= (char *)"";
> +    reg_field->comment.length= 0;

very minor stilistic details:
in these cases I prefer to use

   reg_field->comment= empty_lex_string;

it was introduced for exactly this purpose.

> +  }
> +  if (f_no_default(pack_flag))
> +    reg_field->flags|= NO_DEFAULT_VALUE_FLAG;
> +
> +  if (reg_field->unireg_check == UT_NEXT_NUMBER)
> +      found_next_number_field= field_ptr;
> +  if (timestamp_field == reg_field)
> +      timestamp_field_offset= reg_field->field_index;
> +
> +  if (use_hash)
> +  {
> +    if (my_hash_insert(&name_hash,
> +                       (uchar*) field_ptr))
> +    {
> +      /*
> +        Set return code 8 here to indicate that an error has
> +        occurred but that the error message already has been
> +        sent (OOM).
> +      */
> +      error= 8;
> +      return NULL;
> +    }
> +  }
> +
> +  /* move null bit/byte pointers */
> +  if (field_type == MYSQL_TYPE_BIT && !f_bit_as_char(pack_flag))
> +  {
> +    null_bits_are_used= 1;
> +    if ((null_bit_pos+= field_length & 7) > 7)
> +    {
> +      null_pos++;
> +      null_bit_pos-= 8;
> +    }
> +  }
> +  if (!(reg_field->flags & NOT_NULL_FLAG))
> +  {
> +    if (!(null_bit_pos= (null_bit_pos + 1) & 7))
> +      null_pos++;
> +  }
> +
> +  if (reg_field->field_index == fields - 1)
> +  {
> +    if (!is_fieldnames_ready)
> +
> +    is_fields_ready= TRUE;
> +    if (is_keyparts_ready &&
> +        connect_indexes_and_fields(error))
> +      return NULL;
> +  }
> +  return field_ptr++;
> +}
> +
> +/**
> +  Internal function which connect fields and indices (its parts) triggered
> +  when fields and index parts are ready
> +
> +  @param error           Error code assigned here in case of error
> +
> +  @retval FALSE OK
> +  @retval TRUE  Error (error assigned)
> +*/
> +
> +bool TABLE_SHARE::connect_indexes_and_fields(int &error)
> +{
> +  KEY *keyinfo;
> +  uint i;
> +  /* Fix key->name and key_part->field */
> +  if (key_parts)
> +  {
> +    KEY_PART_INFO *key_part=  first_key_part;
> +    uint prm_key=(uint) (find_type((char*) primary_key_name,
> +                                   &keynames, 3) - 1);
> +    longlong ha_option= tmp_handler_file->ha_table_flags();
> +    keyinfo= key_info;
> +
> +    for (uint key=0 ; key < keys ; key++, keyinfo++)
> +    {
> +      uint usable_parts= 0;
> +      keyinfo->name=(char*) keynames.type_names[key];
> +      keyinfo->name_length= strlen(keyinfo->name);
> +      keyinfo->cache_name=
> +        (uchar*) alloc_root(&mem_root,
> +                            table_cache_key.length+
> +                            keyinfo->name_length + 1);
> +      if (keyinfo->cache_name)           // If not out of memory
> +      {
> +        uchar *pos= keyinfo->cache_name;
> +        memcpy(pos, table_cache_key.str, table_cache_key.length);
> +        memcpy(pos + table_cache_key.length, keyinfo->name,
> +               keyinfo->name_length+1);
> +      }
> +
> +      /* Fix fulltext keys for old .frm files */
> +      if (key_info[key].flags & HA_FULLTEXT)
> +	key_info[key].algorithm= HA_KEY_ALG_FULLTEXT;
> +
> +      if (prm_key >= MAX_KEY && (keyinfo->flags & HA_NOSAME))
> +      {
> +	/*
> +	  If the UNIQUE key doesn't have NULL columns and is not a part key
> +	  declare this as a primary key.
> +	*/
> +	prm_key= key;
> +	for (i= 0 ; i < keyinfo->key_parts; i++)
> +	{
> +	  uint fieldnr= key_part[i].fieldnr;
> +	  if (!fieldnr ||
> +	      field[fieldnr-1]->null_ptr ||
> +	      field[fieldnr-1]->key_length() !=
> +	      key_part[i].length)
> +	  {
> +	    prm_key=MAX_KEY;		// Can't be used
> +	    break;
> +	  }
> +	}
> +      }
> +
> +      for (i=0 ; i < keyinfo->key_parts ; key_part++,i++)
> +      {
> +        Field *fld;
> +	if (new_field_pack_flag <= 1)
> +	  key_part->fieldnr= (uint16) find_field(field,
> +                                                 default_values,
> +                                                 (uint) key_part->offset,
> +                                                 (uint) key_part->length);
> +	if (!key_part->fieldnr)
> +        {
> +          error= 4;                             // Wrong file
> +          return TRUE;
> +        }
> +        fld= key_part->field= field[key_part->fieldnr-1];
> +        key_part->type= fld->key_type();
> +        if (fld->null_ptr)
> +        {
> +          key_part->null_offset=(uint) ((uchar*) fld->null_ptr -
> +                                        default_values);
> +          key_part->null_bit= fld->null_bit;
> +          key_part->store_length+=HA_KEY_NULL_LENGTH;
> +          keyinfo->flags|=HA_NULL_PART_KEY;
> +          keyinfo->key_length+= HA_KEY_NULL_LENGTH;
> +        }
> +        if (fld->type() == MYSQL_TYPE_BLOB ||
> +            fld->real_type() == MYSQL_TYPE_VARCHAR ||
> +            fld->type() == MYSQL_TYPE_GEOMETRY)
> +        {
> +          if (fld->type() == MYSQL_TYPE_BLOB ||
> +              fld->type() == MYSQL_TYPE_GEOMETRY)
> +            key_part->key_part_flag|= HA_BLOB_PART;
> +          else
> +            key_part->key_part_flag|= HA_VAR_LENGTH_PART;
> +          key_part->store_length+=HA_KEY_BLOB_LENGTH;
> +          keyinfo->key_length+= HA_KEY_BLOB_LENGTH;
> +        }
> +        if (fld->type() == MYSQL_TYPE_BIT)
> +          key_part->key_part_flag|= HA_BIT_PART;
> +
> +        if (i == 0 && key != prm_key)
> +          fld->flags |= (((keyinfo->flags & HA_NOSAME) &&
> +                           (keyinfo->key_parts == 1)) ?
> +                           UNIQUE_KEY_FLAG : MULTIPLE_KEY_FLAG);
> +        if (i == 0)
> +          fld->key_start.set_bit(key);
> +        if (fld->key_length() == key_part->length &&
> +            !(fld->flags & BLOB_FLAG))
> +        {
> +          if (tmp_handler_file->index_flags(key, i, 0) & HA_KEYREAD_ONLY)
> +          {
> +            keys_for_keyread.set_bit(key);
> +            fld->part_of_key.set_bit(key);
> +            fld->part_of_key_not_clustered.set_bit(key);
> +          }
> +          if (tmp_handler_file->index_flags(key, i, 1) & HA_READ_ORDER)
> +            fld->part_of_sortkey.set_bit(key);
> +        }
> +        if (!(key_part->key_part_flag & HA_REVERSE_SORT) &&
> +            usable_parts == i)
> +          usable_parts++;			// For FILESORT
> +        fld->flags|= PART_KEY_FLAG;
> +        if (key == prm_key)
> +        {
> +          fld->flags|= PRI_KEY_FLAG;
> +          /*
> +            If this field is part of the primary key and all keys contains
> +            the primary key, then we can use any key to find this column
> +          */
> +          if (ha_option & HA_PRIMARY_KEY_IN_READ_INDEX)
> +          {
> +            if (fld->key_length() == key_part->length &&
> +                !(fld->flags & BLOB_FLAG))
> +              fld->part_of_key= keys_in_use;
> +            if (fld->part_of_sortkey.is_set(key))
> +              fld->part_of_sortkey= keys_in_use;
> +          }
> +        }
> +        if (fld->key_length() != key_part->length)
> +        {
> +          key_part->key_part_flag|= HA_PART_KEY_SEG;
> +        }
> +        if (fld->real_maybe_null())
> +          key_part->key_part_flag|= HA_NULL_PART;
> +        /*
> +          Sometimes we can compare key parts for equality with memcmp.
> +          But not always.
> +        */
> +        if (!(key_part->key_part_flag & (HA_BLOB_PART | HA_VAR_LENGTH_PART |
> +                                         HA_BIT_PART)) &&
> +            key_part->type != HA_KEYTYPE_FLOAT &&
> +            key_part->type == HA_KEYTYPE_DOUBLE)
> +          key_part->key_part_flag|= HA_CAN_MEMCMP;
> +      }
> +      keyinfo->usable_key_parts= usable_parts; // Filesort
> +
> +      set_if_bigger(max_key_length,keyinfo->key_length+
> +                    keyinfo->key_parts);
> +      total_key_length+= keyinfo->key_length;
> +      /*
> +        MERGE tables do not have unique indexes. But every key could be
> +        an unique index on the underlying MyISAM table. (Bug #10400)
> +      */
> +      if ((keyinfo->flags & HA_NOSAME) ||
> +          (ha_option & HA_ANY_INDEX_MAY_BE_UNIQUE))
> +        set_if_bigger(max_unique_length,keyinfo->key_length);
> +    }
> +    if (prm_key < MAX_KEY &&
> +	(keys_in_use.is_set(prm_key)))
> +    {
> +      primary_key= prm_key;
> +      /*
> +	If we are using an integer as the primary key then allow the user to
> +	refer to it as '_rowid'
> +      */
> +      if (key_info[prm_key].key_parts == 1)
> +      {
> +	Field *fld= key_info[prm_key].key_part[0].field;
> +	if (fld && fld->result_type() == INT_RESULT)
> +        {
> +          /* note that fieldnr here (and rowid_field_offset) starts from 1 */
> +	  rowid_field_offset= (key_info[prm_key].key_part[0].
> +                               fieldnr);
> +        }
> +      }
> +    }
> +    else
> +      primary_key = MAX_KEY; // we do not have a primary key
> +  }
> +  else
> +    primary_key= MAX_KEY;
> +
> +  return FALSE;
> +}
> +
>  /*
>    Read data from a binary .frm file from MySQL 3.23 - 5.0 into TABLE_SHARE
>  */
> @@ -747,38 +1544,53 @@ static int open_binary_frm(THD *thd, TAB
>        legacy_db_type < DB_TYPE_FIRST_DYNAMIC)
>      share->db_plugin= ha_lock_engine(NULL, 
>                                       ha_checktype(thd, legacy_db_type, 0, 0));
> -  share->db_create_options= db_create_options= uint2korr(head+30);
> -  share->db_options_in_use= share->db_create_options;
> -  share->mysql_version= uint4korr(head+51);
> -  share->null_field_first= 0;
> -  if (!head[32])				// New frm file in 3.23
> -  {
> -    share->avg_row_length= uint4korr(head+34);
> -    share->transactional= (ha_choice) (head[39] & 3);
> -    share->page_checksum= (ha_choice) ((head[39] >> 2) & 3);
> -    share->row_type= (row_type) head[40];
> -    share->table_charset= get_charset((uint) head[38],MYF(0));
> -    share->null_field_first= 1;
> -  }
> -  if (!share->table_charset)
> -  {
> -    /* unknown charset in head[38] or pre-3.23 frm */
> -    if (use_mb(default_charset_info))
> -    {
> -      /* Warn that we may be changing the size of character columns */
> -      sql_print_warning("'%s' had no or invalid character set, "
> -                        "and default character set is multi-byte, "
> -                        "so character column sizes may have changed",
> -                        share->path.str);
> -    }
> -    share->table_charset= default_charset_info;
> +
> +
> +  {
> +    enum ha_choice transact= HA_CHOICE_UNDEF;
> +    enum ha_choice page_chcksum= HA_CHOICE_UNDEF;
> +    enum row_type rtype= ROW_TYPE_DEFAULT;
> +    CHARSET_INFO *charset= NULL;
> +    bool null_fld_first= 0;
> +    ulong avg_row_len= 0;
> +    if (!head[32])				// New frm file in 3.23
> +    {
> +      avg_row_len= uint4korr(head+34);
> +      transact= (ha_choice) (head[39] & 3);
> +      page_chcksum= (ha_choice) ((head[39] >> 2) & 3);
> +      rtype= (row_type) head[40];
> +      charset= get_charset((uint) head[38],MYF(0));
> +      null_fld_first= 1;
> +    }
> +    if (!charset)
> +    {
> +      /* unknown charset in head[38] or pre-3.23 frm */
> +      if (use_mb(default_charset_info))
> +      {
> +        /* Warn that we may be changing the size of character columns */
> +        sql_print_warning("'%s' had no or invalid character set, "
> +                          "and default character set is multi-byte, "
> +                          "so character column sizes may have changed",
> +                          share->path.str);
> +      }
> +      charset= default_charset_info;
> +    }
> +    if (share->
> +        set_init_parameters(frm_ver,
> +                            uint4korr(head + 51), /*MySQL Version of the frm*/
> +                            uint2korr(head + 30), /*Create options (old)*/
> +                            avg_row_len, transact, page_chcksum,
> +                            rtype, charset, null_fld_first,
> +                            uint4korr(head+22),   /*MIN_ROWS*/
> +                            uint4korr(head+18),   /*MAX_ROWS*/
> +                            uint2korr(head+16),   /*record length*/
> +                            uint2korr(head+59),   /*extra_rec_buf_length*/
> +                            uint2korr(head+62),   /*key block size*/
> +                            new_field_pack_flag,
> +                            error))

I'd say you can remove this method completely - it's just a set of
assignments, the caller can do them directly.
better try to provide reasonable defaults (in share->init()),
so that the caller would not need to assign most of those.

> +      goto free_and_err;
>    }
> -  share->db_record_offset= 1;
> -  if (db_create_options & HA_OPTION_LONG_BLOB_PTR)
> -    share->blob_ptr_size= portable_sizeof_char_ptr;
>    error=4;
> -  share->max_rows= uint4korr(head+18);
> -  share->min_rows= uint4korr(head+22);
>  
>    /* Read keyinformation */
>    key_info_length= (uint) uint2korr(head+28);
> @@ -787,84 +1599,85 @@ static int open_binary_frm(THD *thd, TAB
>      goto err;                                   /* purecov: inspected */
>    if (disk_buff[0] & 0x80)
>    {
> -    share->keys=      keys=      (disk_buff[1] << 7) | (disk_buff[0] & 0x7f);
> -    share->key_parts= key_parts= uint2korr(disk_buff+2);
> +    keys=      (disk_buff[1] << 7) | (disk_buff[0] & 0x7f);
> +    key_parts= uint2korr(disk_buff+2);
>    }
>    else
>    {
> -    share->keys=      keys=      disk_buff[0];
> -    share->key_parts= key_parts= disk_buff[1];
> +    keys=      disk_buff[0];
> +    key_parts= disk_buff[1];
>    }
> -  share->keys_for_keyread.init(0);
> -  share->keys_in_use.init(keys);
> -
> -  n_length=keys*sizeof(KEY)+key_parts*sizeof(KEY_PART_INFO);
> -  if (!(keyinfo = (KEY*) alloc_root(&share->mem_root,
> -				    n_length + uint2korr(disk_buff+4))))
> -    goto err;                                   /* purecov: inspected */
> -  bzero((char*) keyinfo,n_length);
> -  share->key_info= keyinfo;
> -  key_part= my_reinterpret_cast(KEY_PART_INFO*) (keyinfo+keys);
> -  strpos=disk_buff+6;
> -
> -  if (!(rec_per_key= (ulong*) alloc_root(&share->mem_root,
> -                                         sizeof(ulong)*key_parts)))
> +  if (share->allocate_indices_space(keys, key_parts,
> +                                    uint2korr(disk_buff + 4) - keys - 1,
> +                                    error))
>      goto err;
> +  strpos=disk_buff+6;
>  
> -  for (i=0 ; i < keys ; i++, keyinfo++)
> +  for (i=0 ; i < keys ; i++)
>    {
>      if (new_frm_ver >= 3)
>      {
> -      keyinfo->flags=	   (uint) uint2korr(strpos) ^ HA_NOSAME;
> -      keyinfo->key_length= (uint) uint2korr(strpos+2);
> -      keyinfo->key_parts=  (uint) strpos[4];
> -      keyinfo->algorithm=  (enum ha_key_alg) strpos[5];
> -      keyinfo->block_size= uint2korr(strpos+6);
> +      keyinfo= share->add_index(NULL, 0,
> +                                (uint) uint2korr(strpos) ^ HA_NOSAME,
> +                                (uint) uint2korr(strpos + 2),
> +                                (uint) strpos[4], (enum ha_key_alg) strpos[5],
> +                                uint2korr(strpos + 6));

make block_size an optional parameter, or let the caller set it directly

>        strpos+=8;
>      }
>      else
>      {
> -      keyinfo->flags=	 ((uint) strpos[0]) ^ HA_NOSAME;
> -      keyinfo->key_length= (uint) uint2korr(strpos+1);
> -      keyinfo->key_parts=  (uint) strpos[3];
> -      keyinfo->algorithm= HA_KEY_ALG_UNDEF;
> +      keyinfo= share->add_index(NULL, 0,
> +                                ((uint) strpos[0]) ^ HA_NOSAME,
> +                                (uint) uint2korr(strpos + 1),
> +                                (uint) strpos[3],
> +                                HA_KEY_ALG_UNDEF, 0);
>        strpos+=4;
>      }
>  
> -    keyinfo->key_part=	 key_part;
> -    keyinfo->rec_per_key= rec_per_key;
> -    for (j=keyinfo->key_parts ; j-- ; key_part++)
> +    for (j=keyinfo->key_parts ; j-- ; )
>      {
> -      *rec_per_key++=0;
> -      key_part->fieldnr=	(uint16) (uint2korr(strpos) & FIELD_NR_MASK);
> -      key_part->offset= (uint) uint2korr(strpos+2)-1;
> -      key_part->key_type=	(uint) uint2korr(strpos+5);
> -      // key_part->field=	(Field*) 0;	// Will be fixed later
>        if (new_frm_ver >= 1)
>        {
> -	key_part->key_part_flag= *(strpos+4);
> -	key_part->length=	(uint) uint2korr(strpos+7);
> +        if (share->add_index_part(keyinfo,
> +                                  (uint16)
> +                                  (uint2korr(strpos) & FIELD_NR_MASK),
> +                                  (uint) uint2korr(strpos + 2) - 1,
> +                                  (uint) uint2korr(strpos + 5),
> +                                  (uint) uint2korr(strpos + 7),
> +                                  *(strpos + 4),
> +                                  error))
> +        {
> +          DBUG_ASSERT(0); // it is not possible if keys defined first
> +        }
>  	strpos+=9;
>        }
>        else
>        {
> -	key_part->length=	*(strpos+4);
> -	key_part->key_part_flag=0;
> -	if (key_part->length > 128)
> +        uint16 length= *(strpos+4);
> +        uint16 flags= 0;
> +	if (length > 128)
>  	{
> -	  key_part->length&=127;		/* purecov: inspected */
> -	  key_part->key_part_flag=HA_REVERSE_SORT; /* purecov: inspected */
> +	  length&= 127;	          /* purecov: inspected */
> +	  flags= HA_REVERSE_SORT; /* purecov: inspected */
>  	}
> +        if (share->add_index_part(keyinfo,
> +                                  (uint16)
> +                                  (uint2korr(strpos) & FIELD_NR_MASK),
> +                                  (uint) uint2korr(strpos+2)-1,
> +                                  (uint) uint2korr(strpos+5),
> +                                  length, flags, error))
> +        {
> +          DBUG_ASSERT(0); // it is not possible if keys defined first
> +        }
>  	strpos+=7;
>        }
> -      key_part->store_length=key_part->length;
>      }
>    }
> -  keynames=(char*) key_part;
> -  strpos+= (strmov(keynames, (char *) strpos) - keynames)+1;
>  
> -  share->reclength = uint2korr((head+16));
> -  share->stored_rec_length= share->reclength;
> +  if (share->add_index_names_as_onestring(strpos, i))
> +    goto err;

No, please remove this method, let the caller do strmov as before.
it's a weird thing that open_binary_frm is doing, there's no need to
create an api for that, just keep it the old way.

To repeat: we want to have a reasonable and sane API for storage engines to
use, not an API that allows to do everything that frm format was
historically forcing us to do in openfrm().

> +  strpos+= i;
> +
>    if (*(head+26) == 1)
>      share->system= 1;				/* one-record-database */
>  #ifdef HAVE_CRYPTED_FRM
> @@ -1034,17 +1847,10 @@ static int open_binary_frm(THD *thd, TAB
>      }
>      DBUG_ASSERT(next_chunk <= buff_end);
>    }
> -  share->key_block_size= uint2korr(head+62);
>  
>    error=4;
> -  extra_rec_buf_length= uint2korr(head+59);
> -  rec_buff_length= ALIGN_SIZE(share->reclength + 1 + extra_rec_buf_length);
> -  share->rec_buff_length= rec_buff_length;
> -  if (!(record= (uchar *) alloc_root(&share->mem_root,
> -                                     rec_buff_length)))
> -    goto free_and_err;                          /* purecov: inspected */
> -  share->default_values= record;
> -  if (my_pread(file, record, (size_t) share->reclength,
> +  /* TODO: make copy function for handler */

what does this comment mean?

> +  if (my_pread(file, share->default_values, (size_t) share->reclength,
>                 record_offset, MYF(MY_NABP)))
>      goto free_and_err;                          /* purecov: inspected */
>  
> @@ -2785,6 +3271,27 @@ fix_type_pointers(const char ***array, T
>  } /* fix_type_pointers */
>  
>  
> +static void
> +copy_typelib(TYPELIB *type_copy,
> +             const char ***array, TYPELIB *point_to_type,
> +             char **names, char chr)
> +{
> +  *((*names)++)= chr; // full compatibility, just to be safe
> +  type_copy->name= 0;
> +  type_copy->type_names= *array;
> +  type_copy->count= point_to_type->count;
> +  for (uint i= 0; i < point_to_type->count; i++)
> +  {
> +    *((*array)++) = *names;
> +    uchar *tmp= (uchar*) strmov((char*) *names, type_copy->type_names[i]);
> +    *tmp++= (uchar)0;
> +    *names= (char *)tmp;
> +  }
> +  *((*array)++)= NullS;		/* End of type */
> +  *((*names)++)= 0;
> +}
> +
> +
>  TYPELIB *typelib(MEM_ROOT *mem_root, List<String> &strings)
>  {
>    TYPELIB *result= (TYPELIB*) alloc_root(mem_root, sizeof(TYPELIB));

I don't understand how you handle our new table/field/index options,
those that can be defined per storage engine

> === modified file 'sql/table.h'
> --- a/sql/table.h	2011-08-17 05:48:35 +0000
> +++ b/sql/table.h	2011-10-04 11:14:59 +0000
> @@ -484,6 +485,39 @@ typedef struct st_table_share
>    void *ha_data;
>    void (*ha_data_destroy)(void *); /* An optional destructor for ha_data. */
>  
> +  /*
> +    Variables of building the TABLE_SHARE.
> +
> +    TODO: it might have sense to make separate structure for this variables
> +    and deallocate it after building
> +  */

Agree!

It would be better to remove all this added functionality (all new methods
and variables) from the TABLE_SHARE and create a separate class
TABLE_SHARE_builder (or whatever), and have it allocated on the stack, like

int open_binary_frm( ... )
{
   TABLE_SHARE_builder builder(share);
   ...
      builder->add_field(...);
   ...
   ... etc ...
}

this looks like a cleaner approach - the builder and all its memory is
destroyed when the function returns. And the TABLE_SHARE itself
does not have the "building" methods - it always represents a complete
and frozen table structure.

> +  KEY_PART_INFO *first_key_part; // pointer on parts array
> +  KEY_PART_INFO *key_part; // current position during key parts adding
> +  KEY *key; // current position during keys adding
> +  ulong *rec_per_key;
> +  const char **interval_array;
> +  char *first_keyname; // pointer on the first index name string
> +  char *keyname; // pointer on the current index name string during adding
> +  char *names; // array of pointers to all names used in the definition
> +  char *curr_interval_names;
> +  TYPELIB *curr_interval;
> +  const char **curr_interval_array;
> +  char *first_comment, *comment_pos;
> +  char *curr_field_names;
> +  uchar *null_flags, *null_pos; //pointers to the NULL bits during field adding
> +  /* Needed for fixing indices according to handler abilities and features */
> +  handler *tmp_handler_file;
> +  Field **field_ptr;
> +  uint enum_set_values, enums_sets, comments_length;
> +  uint new_field_pack_flag;
> +  uchar null_bit_pos;
> +  bool use_hash; /* use hash for field names */
> +  bool null_bits_are_used;
> +  /* Flags of the building progress (also used by triggers) */
> +  bool is_keys_allocated, is_fields_allocated,
> +       is_keynames_ready, is_fieldnames_ready, is_keyparts_ready,
> +       is_comments_ready, is_fields_ready;
> +  /* end of declaration of variables for building TABLE_SHARE*/
>  
>    /*
>      Set share's table cache key and update its db and table name appropriately.
> @@ -652,6 +686,96 @@ typedef struct st_table_share
>      return (tmp_table == SYSTEM_TMP_TABLE || is_view) ? 0 : table_map_id;
>    }
>  
> +   /* Table_share building functions */
> +   void init_metadata()
> +   {
> +     is_keys_allocated= is_fields_allocated=
> +       is_keynames_ready= is_fieldnames_ready= is_comments_ready=
> +       is_keyparts_ready= is_fields_ready= FALSE;
> +   }
> +   bool set_init_parameters(uchar frm_ver, ulong mysql_ver, uint db_create_opt,
> +                            ulong avg_row_len, enum ha_choice transact,
> +                            enum ha_choice page_chcksum, enum row_type rtype,
> +                            CHARSET_INFO *charset, bool null_fld_first,
> +                            ha_rows mi_rows, ha_rows ma_rows,
> +                            ulong reclen, uint extra_rec_buf_len,
> +                            uint key_blck_size, uint new_fld_pck_flg,
> +                            int &error);
> +   bool allocate_indices_space(uint keys, uint parts, uint names_len,
> +                               int &error);
> +   KEY *add_index(char *name, uint name_len,
> +                  uint flags, uint length, uint parts,
> +                  enum ha_key_alg alg, uint blk_size);
> +   bool add_index_part(KEY *keyinfo, uint16 fieldnr,
> +                       uint offset, uint16 keytype,
> +                       uint16 length, uint16 flags, int &error);
> +   inline bool add_index_names_as_onestring(uchar *pos, uint &len)
> +   {
> +     DBUG_ASSERT(is_keys_allocated);
> +     DBUG_ASSERT(keyname == first_keyname); // add_key did not used names
> +     len= (strmov(first_keyname, (char *) pos) - first_keyname) + 1;
> +     is_keynames_ready= TRUE;
> +     if (is_fieldnames_ready && parse_names())
> +       return TRUE;
> +     return FALSE;
> +   }
> +   inline bool add_field_enum_set_names_as_onestring(uchar *pos, uint len)
> +   {
> +     DBUG_ASSERT(is_fields_allocated);
> +     memcpy((char*) names, pos, len);
> +     is_fieldnames_ready= TRUE;
> +     if (is_keynames_ready && parse_names())
> +       return TRUE;
> +     return FALSE;
> +   }
> +   inline void add_comments_as_one_string(const uchar *pos)
> +   {
> +     DBUG_ASSERT(is_fields_allocated);
> +     memcpy(first_comment, pos, comments_length);
> +     is_comments_ready= TRUE;
> +   }

See above, I don't think that these *_onestring functions are necessary.
if you use the pointers are provided by the caller,
than open_binary_frm can parse these "one strings" as it wants
and pass pointers down. Thus keeping this trickery out of the api.

> +
> +   bool allocate_fields_space(uint fields_count,
> +                              uint null_flds,
> +                              uint enum_set_count,
> +                              uint enum_set_values_count,
> +                              uint field_names_length,
> +                              uint enum_set_names_length,
> +                              uint com_length,
> +                              handler *handler_file,
> +                              size_t tbl_com_length= 0,
> +                              char *tbl_com= NULL);
> +   Field **add_field(const char *name,
> +                     uchar *pos,
> +                     uint32 field_length,
> +                     uint pack_flag,
> +                     enum_field_types field_type,
> +                     CHARSET_INFO *charset,
> +                     geometry_type geom_type,
> +                     bool auto_increment,
> +                     TYPELIB* enum_set,
> +                     char *comm_pos,
> +                     uint comment_length,
> +                     int &error);
> +   Field **add_field_int(const char *name,
> +                         uchar *pos,
> +                         uint32 field_length,
> +                         uint pack_flag,
> +                         enum_field_types field_type,
> +                         CHARSET_INFO *charset,
> +                         geometry_type geom_type,
> +                         utype unireg_type,
> +                         TYPELIB* enum_set,
> +                         char *comment_pos,
> +                         uint comment_length,
> +                         Virtual_column_info *vcol_info,
> +                         bool fld_stored_in_db,
> +                         int &error);
> +private:
> +   bool final_fields_keys_allocation();
> +   bool parse_names();
> +   bool connect_indexes_and_fields(int &error);
> +
>  } TABLE_SHARE;
>  

Regards,
Sergei