maria-developers team mailing list archive

Thread
Date

Re: Archive Storage Engine

To: Varun Gupta <varungupta1803@xxxxxxxxx>
From: Sergey Petrunia <sergey@xxxxxxxxxxx>
Date: Thu, 9 Feb 2017 00:55:54 +0300
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20170207211950.GB30641@meddwl.fritz.box>
User-agent: Mutt/1.5.21 (2010-09-15)

Hi Varun,

On Tue, Feb 07, 2017 at 10:19:50PM +0100, Sergei Golubchik wrote:
> 
> I almost replied to your email with "it's impossible, Archive did not
> have HA_RECORD_MUST_BE_CLEAN_ON_WRITE in the table_flags(), no engine
> did. So removal could not have changed anything".
> 
> But then I noticed that the code - by some old typo - had "&&" not "&".
> So it was always doing restore_record() in valgrind-aware binaries.
> For all engines. Apparently, it was never the original intention.
> 
> Note that in release binaries HAVE_valgrind is not defined, and the
> record is not bzero-ed even in 10.1 and earlier versions. This is a bug
> in Archive engine, in all versions, not in 10.2 only. In should not
> store the uninitialized tail of VARCHAR columns.
> 

Note that all this only happens in the "old" data format of the archive
tables. The new data format stores VARCHAR columns as variable-length data
and doesnt have this problem.

As for the old format, we can't change it. 
I guess the solution is that ha_archive::pack_row_v1() should fill the padding 
bytes in record_buffer->buffer (not in 'uchar *record' argument) with zeros.

I've did varchar packing for other storage engines, so I'll provide here some
pointers on how to do it:

* The data in "uchar *record" argument of pack_row_v1() is in the data format
  that we call "table->record[0] format"

* In that data format, VARCHAR column has a fixed space.
It starts with this offset:
    (field->ptr - table->record[0])
and takes field->pack_length_in_rec() bytes.
  
* field->is_real_null(record - table->record[0]) tells whether the field has a
NULL value. for NULLs, all varchar value is garbage

* for non-NULL values:
field_varchar->length_bytes  shows how many bytes store the value length.
Then, this many bytes have the actual value. The rest is garbage.

* You only need to handle VARCHARs, field->real_type() == MYSQL_TYPE_VARCHAR

Hope that helps.

> On Feb 07, Varun Gupta wrote:
> > Hi,
> > 
> > The commit is:
> > 
> > commit c697ddc315429c56dec567b2fe85cfe5c03834ea
> > Author: Sergei Golubchik <serg@xxxxxxxxxxx>
> > Date:   Mon Dec 12 15:47:51 2016 +0100
> > 
> >     cleanup: remove unused handler table flag
> > 
> > diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> > 
> > index 901fd48..9c21cb7 100644
> > --- a/sql/sql_insert.cc
> > +++ b/sql/sql_insert.cc
> > 
> > @@ -955,12 +955,7 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> >              be overwritten by fill_record() anyway (and fill_record() does
> > not
> >              use default values in this case).
> >            */
> > -#ifdef HAVE_valgrind
> > -          if (table->file->ha_table_flags() && HA_RECORD_MUST_BE_CLEAN_ON_WRITE)
> > -            restore_record(table,s->default_values);   // Get empty record
> > -          else
> > -#endif
> > -            table->record[0][0]= share->default_values[0];
> > +          table->record[0][0]= share->default_values[0];
> > 
> > On Sun, Feb 5, 2017 at 8:10 PM, Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
> > 
> > > Hi, Varun!
> > >
> > > On Feb 05, Varun Gupta wrote:
> > > > Hi,
> > > >
> > > > Here is the issue https://jira.mariadb.org/browse/MDEV-11645 on which I
> > > > need suggestions
> > > >
> > > > The observations made are:
> > > >
> > > > - in 10.1, varchar endspace was bzero-ed
> > > > archive depended on it to not pack garbage
> > > >
> > > > - in 10.2, varchar endspace is not bzero-ed
> > > >  which breaks the archive SE.
> > > >
> > > > Was this change intended?
> > > >
> > > > If yes, then we should update ha_archive to fill the garbage parts
> > > > of the buffer with zeroes that will cause it to be (slightly?)
> > > > slower  but there's no other way.
> > >
> > > Could you find what commit caused it? git bisect is quite useful for
> > > that.
> > >

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog

References

Archive Storage Engine
From: Varun Gupta, 2017-02-05
Re: Archive Storage Engine
From: Sergei Golubchik, 2017-02-05
Re: Archive Storage Engine
From: Varun Gupta, 2017-02-07
Re: Archive Storage Engine
From: Sergei Golubchik, 2017-02-07