← Back to team overview

maria-discuss team mailing list archive

Re: Very slow insert with load data infile

 

Yep,

myisam_sort_buffer_size and key_buffer_size were set both to 6G and show variables confirmed it.

Moreover, when I created non-unique index both buffers were used (mysql was using 12G of RAM), however, when I created UNINQUE index only key_buffer was used (mysql was using 6G of RAM).

I'll do a bug report.

Le 12/02/2014 13:23, Sergey Vojtovich a écrit :
Hi Pierre,

there are quite a few MySQL bug reports in MySQL bug database that can
affect your use case. Like:
http://bugs.mysql.com/bug.php?id=5731
http://bugs.mysql.com/bug.php?id=29446
http://bugs.mysql.com/bug.php?id=59925
http://bugs.mysql.com/bug.php?id=62570
http://bugs.mysql.com/bug.php?id=62827
http://bugs.mysql.com/bug.php?id=45702

BUG#62827 is probably the most interesting. Even though some of them are fixed,
it is very likely that they're not fixed completely. I tend to remember 32-bit
variables on both key cache and "repair by sort" ways.

Could you check actual value of myisam_sort_buffer_size and key_buffer_size
just to make sure it wasn't cut at startup at least?

I'd suggest to report a bug in jira: https://mariadb.atlassian.net
I believe it is somthing worth checking at least.

Regards,
Sergey

On Mon, Feb 10, 2014 at 01:45:34PM +0100, Pierre wrote:
Hello,

Mariadb is getting slower and slower when I'm inserting a massive
amout of data. I'm trying to insert 166 507 066 rows (12go) using
load data infile '' into an empty table. I splitted my file in 13
parts of the same size because it was too long to insert in one
shot. When I inserted more than 100M rows, it starts to be extremely
slow.

Here are the time for each load data infile :

LOAD DATA INFILE '/tmp/hash/xaa' : 2min 49sec
LOAD DATA INFILE '/tmp/hash/xab' : 2min 49sec
LOAD DATA INFILE '/tmp/hash/xac' : 3min 48sec
LOAD DATA INFILE '/tmp/hash/xad' : 3min 48sec
LOAD DATA INFILE '/tmp/hash/xae' : 3min 49sec
LOAD DATA INFILE '/tmp/hash/xaf' : 5min 59sec
LOAD DATA INFILE '/tmp/hash/xag' : 10min 50sec
LOAD DATA INFILE '/tmp/hash/xah' : 20min 7sec
LOAD DATA INFILE '/tmp/hash/xai' : 47min 7sec
LOAD DATA INFILE '/tmp/hash/xaj' : 4 hours 1 min 9.34 sec ( ouch !)
LOAD DATA INFILE '/tmp/hash/xak' : still running...

Now the task is running but mariadb is only using 2% cpu and read to
the disk at 2500 Kb/sec (which is slow regarding the disk
performance). There is a lot of disk space. If insertion speed
continue to slow down at this rate, it will take weeks to insert the
3 remainings files ! It's only 166M rows so I think it's not normal.
If I disable the "unique" constraint insertion speed is great but I
need to be sure there is no duplicate and I also need an index.

Can someone explain me what is happening internaly and why it is so slow ?
Do you think if I just create an index (not a unique index) it will
be faster ? The problem is I specially delegated this task of unique
checking to mysql.
If no solution, do you know a database or a key/value store with
better for performance for this use case ? (create an index and
remove duplicate ?) I can rearrange my table structure (at a cost of
more disk space usage) to match a key/value structure.

Here is my current table structure :
CREATE TABLE `wallets` (
   `p1` varbinary(20) DEFAULT NULL,
   `p2` varbinary(20) DEFAULT NULL,
   `data` varbinary(32) DEFAULT NULL,
   UNIQUE KEY `p1` (`p1`),
   UNIQUE KEY `p2` (`p2`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=FIXED;


Computer specs: 16go RAM and 4x3.1 ghz, 2x1To (with 90% unused disk space)

The configuration variables (I tweaked specially for this insertion job) :

key_buffer_size = 6G # MyISAM: used with insert to an empty table
bulk_insert_buffer_size = 6G # MyISAM/Aria: insert to a non empty table; default 8M
myisam_sort_buffer_size = 6G # MyISAM: used for sort but only when :
"CREATE INDEX", "ALTER TABLE" or "REPAIR TABLE"
read_buffer_size = 6G # MyISAM/Aria/MERGE: allocated for each table
scan, for order by, nested query caching, bulk insert into
partitions
sort_buffer_size = 4M # ALL: allocated each time a session need to do a sort
myisam_max_sort_file_size = 128G # MyISAM: tmp file max size

I'm using mariadb 5.5.31.

Best regards,

_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


References