maria-discuss team mailing list archive
Mailing list archive
Re: Very slow insert with load data infile
On Feb 12, Sergey Vojtovich wrote:
> > This is used when you insert data into the empty table. Or when you add
> > an index to the existing table. Or when you enable indexes. That is:
> > 1. LOAD DATA INFILE 'everything, not in chunks'
> > (a convenient way would be to load from a pipe, and cat all your
> > chunks into it).
> > 2. ALTER TABLE ... ADD KEY
> > 3. ALTER TABLE DISABLE KEYS. insert the data. ALTER TABLE ENABLE KEYS
> > But unfortunately, all this only works for non-unique keys, when MariaDB
> > doesn't need to expect a unique constraint error and won't need to
> > delete rows when creating indexes. For unique indexes values will be
> > inserted into a b-tree one by one.
> Indeed, it goes through key cache. Just curious why is that needed for
> p.2 - it should never have to delete rows. AFAIR it should just drop
> temporary table in case of failure. I guess it was fixed this way in 10.0.
> And p.1 can just truncate data file...
For repair-by-sort, ALTER TABLE needs first insert all rows, and only
then rebuild all indexes.
With UNIQUE indexes a conflict is possible, it will need to abort the
operation. Old assumption was that it's better to detect a conflict as
soon as possible - when rows are being inserted - and abort sooner. As
compared to copying everything, and only then during repair-by-sort
noticing a conflict. The second approach would end up doing much more
work before the operation is aborted.
There is some logic in that reasoning. But still, when UNIQUE index is
added, conflicts are typically rare. And even if we'd need to abort the
operation, doing more work and repair-by-sort is usually faster than
doing less work. Furthermore - what's even more important - most often
ALTER TABLE does not add a unique index, but performs some unrelated
operation on the table. In this case one cannot expect any conflicts at
all in the existing unique indexes.
So, in 10.0 we've changed ALTER TABLE to use repair-by-sort also for