
pbxt-discuss team mailing list archive

Re: Problems with PBXT

 

Hi Paul,

On Sun, Apr 24, 2011 at 05:42:55PM +0200, Paul McCullagh wrote:
> Hi Erkan,
> 
> Looks like the sweeper has not completed its work for the recovery  
> phase. It may be hanging for some reason, but I don't know why this is  
> the case.

So maybe it would be a good idea to have a way to monitor the sweeper
and the other background threads as well.
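
Until such monitoring exists, one rough way to watch the sweeper from the outside is to sample the statistics table twice and compare the "Log Bytes to Sweep" row. The sketch below assumes that this counter shrinks as the sweeper makes progress, which the thread does not confirm; the Sample type and the sweep_progress function are hypothetical names, and the second sample in the usage lines is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Sample:
    unix_time: int       # value of the "Current Time" row
    bytes_to_sweep: int  # value of the "Log Bytes to Sweep" row


def sweep_progress(first: Sample, second: Sample) -> Tuple[float, Optional[float]]:
    """Return (bytes swept per second, estimated seconds remaining).

    The estimate is None when no progress was seen between the samples,
    which is what a hanging sweeper would look like under the assumption
    stated above.
    """
    elapsed = second.unix_time - first.unix_time
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    rate = (first.bytes_to_sweep - second.bytes_to_sweep) / elapsed
    eta = second.bytes_to_sweep / rate if rate > 0 else None
    return rate, eta


# First sample: values from the statistics output quoted in this thread.
# Second sample: hypothetical, taken 60 seconds later with no change.
before = Sample(unix_time=1303576276, bytes_to_sweep=6818078139)
after = Sample(unix_time=1303576336, bytes_to_sweep=6818078139)
print(sweep_progress(before, after))
```

A rate of zero over several samples would at least distinguish "hanging" from "slow but working".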

Are there any known (or already fixed) issues with the way PBXT
allocates memory? I regularly see PBXT failing on long-running
transactions such as
INSERT INTO innodb_table SELECT * from pbxt_table.
The CLI reports:
ERROR 1297 (HY000): Got temporary error -1 'Cannot allocate memory' from PBXT

And in the error-log:

110426 00:11:29 [Error] user_1 void* xt_malloc_ns(memory_xt.cc:156) errno (12): Cannot allocate memory
110426 00:11:29 [Error] user_1 void* xt_malloc_ns(memory_xt.cc:156)
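
As a rough sanity check on the memory question, the configured PBXT caches and buffers from the variable listing quoted further down can simply be added up. This is only a sketch: it assumes each setting is allocated once and in full, which the thread does not state, and it ignores XtraDB and per-connection memory. The parse_size helper is a hypothetical name.

```python
import re

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert values such as '3G', '512M' or '32MB' to bytes."""
    match = re.fullmatch(r"(\d+)\s*([KMG]?)B?", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


# Cache and buffer settings copied from the SHOW GLOBAL VARIABLES output.
CONFIGURED = {
    "pbxt_index_cache_size": "3G",
    "pbxt_record_cache_size": "3G",
    "pbxt_log_cache_size": "1G",
    "pbxt_log_buffer_size": "512M",
    "pbxt_transaction_buffer_size": "8M",
}

total = sum(parse_size(value) for value in CONFIGURED.values())
print(f"configured PBXT caches and buffers: {total / 1024 ** 3:.2f} GiB")
```

If that sum is close to the physical memory of the machine, swapping and failed allocations during large transactions would not be surprising.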

Regards
Erkan


> 
> On Apr 23, 2011, at 5:42 PM, erkan yanar wrote:
> 
> >Hi,
> >Given: | 5.2.5-MariaDB-log |
> >      110423 18:16:22 PBXT 1.0.11-7 Pre-GA STATUS OUTPUT
> >
> >I was converting a table from XtraDB to PBXT.
> >Because of a misconfiguration the system started swapping and:
> >110423 05:10:33 [Error] SW-mysql_data void*  
> >xt_malloc_ns(memory_xt.cc:156) errno (12): Cannot allocate memory
> >
> >So I stopped it the kill -9 way.
> >Let's have a look at the error log then:
> >
> >110423  5:11:06 Percona XtraDB (http://www.percona.com) 1.0.15-12.5  
> >started; log sequence number 221799126990
> >110423  5:11:06 [Note] Recovering after a crash using tc.log
> >110423  5:11:06 [Note] Starting crash recovery...
> >110423 05:11:07 [Note] PBXT: Recovering from 1-69, bytes to read:  
> >6844334523
> >110423 05:13:17 [Note] PBXT:  1  2  3  4  5  6  7  8  9 10 11 12 13  
> >14 15 16 17 18 19 20 21 22 23 24 25
> >110423 06:17:46 [Note] PBXT: 26 27 28 29 30 31 32 33 34 35 36 37 38  
> >39 40 41 42 43 44 45 46 47 48 49 50
> >110423 07:38:43 [Note] PBXT: 51 52 53 54 55 56 57 58 59 60 61 62 63  
> >64 65 66 67 68 69 70 71 72 73 74 75
> >110423 08:59:01 [Note] PBXT: 76 77 78 79 80 81 82 83 84 85 86 87 88  
> >89 90 91 92 93 94 95 96 97 98 99 100
> >110423 10:11:00 [Note] PBXT: Recovering complete at 191-11307935,  
> >bytes read: 6844334523
> >110423 10:11:00 [Note] Table pdns.records: free row count (1) has  
> >been set to the number of rows on the list: 1
> >110423 10:11:00 [Note] Crash recovery finished.
> >
> >IMHO quite slow ... then:
> >110423 10:11:02 [Note] Waiting for 'mysql_data' sweeper...
> >FYI: datadir := /data/mysql/mysql_data
> >This is the last entry. Does it mean it is still waiting?
> >In fact I can't access the table; even a simple SELECT (LIMIT 1)  
> >just hangs.
> >
> >So running strace (on the 23rd at 18:23:07) I see:
> >[pid  9633] <... pwrite resumed> )      = 12800
> >[pid  9633] pwrite(7, "\241-\1'www.5eabfd0749382558294a429f"...,  
> >11776, 3689697280) = 11776
> >[pid  9633] pwrite(7, ",-\1#28fca10a02a6064013d946a93c56"..., 11776,  
> >3707686912) = 11776
> >[pid  9633] pwrite(7, "\3103\1#5bee153b03d21ba03338c17bab94"...,  
> >13312, 3721564160 <unfinished ...>
> >[pid  8558] <... nanosleep resumed> NULL) = 0
> >[pid  8558] nanosleep({0, 10000000},  <unfinished ...>
> >[pid  9633] <... pwrite resumed> )      = 13312
> >[pid  9633] pwrite(7, "\2241\1#6453b2a23d7b45ba6199a68ebfa1"...,  
> >12800, 3745730560) = 12800
> >[pid  9633] pwrite(7, "N-\1(blog.6aa66275aedaeb43c6cd62b"..., 11776,  
> >3764375552) = 11776
> >[pid  9633] pwrite(7, "\237,\1#243877286abd33de621d0caf4bf5"...,  
> >11776, 3767013376 <unfinished ...>
> >[pid  9634] <... nanosleep resumed> NULL) = 0
> >[pid  9634] nanosleep({0, 10000000},  <unfinished ...>
> >[pid  9633] <... pwrite resumed> )      = 11776
> >[pid  9633] pwrite(7, "\266-\1(blog.ed01180d84db6d59f9af453"...,  
> >11776, 3777302528 <unfinished ...>
> >[pid  8558] <... nanosleep resumed> NULL) = 0
> >[pid  8558] nanosleep({0, 10000000},  <unfinished ...>
> >[pid 10561] <... nanosleep resumed> NULL) = 0
> >[pid 10561] nanosleep({0, 100000000},  <unfinished ...>
> >[pid  9633] <... pwrite resumed> )      = 11776
> >[pid  9633] pwrite(7, "\2740\1*schock.3ce6deea5e12d77de535d"...,  
> >12800, 3791163392 <unfinished ...>
> >[pid  9634] <... nanosleep resumed> NULL) = 0
> >[pid  9634] nanosleep({0, 10000000}, ^C <unfinished ...>
> >
> >
> >This is the table data. Does this mean the sweeper is still  
> >working? Why can't I access the table?
> >
> >root@localhost [pbxt]> show global variables like '%pbxt%';
> >+------------------------------+-------+
> >| Variable_name                | Value |
> >+------------------------------+-------+
> >| pbxt_auto_increment_mode     | 0     |
> >| pbxt_checkpoint_frequency    | 24M   |
> >| pbxt_data_file_grow_size     | 50M   |
> >| pbxt_data_log_threshold      | 256M  |
> >| pbxt_flush_log_at_trx_commit | 0     |
> >| pbxt_garbage_threshold       | 50    |
> >| pbxt_index_cache_size        | 3G    |
> >| pbxt_log_buffer_size         | 512M  |
> >| pbxt_log_cache_size          | 1G    |
> >| pbxt_log_file_count          | 10    |
> >| pbxt_log_file_threshold      | 32MB  |
> >| pbxt_max_threads             | 2007  |
> >| pbxt_offline_log_function    | 0     |
> >| pbxt_record_cache_size       | 3G    |
> >| pbxt_row_file_grow_size      | 10M   |
> >| pbxt_support_xa              | ON    |
> >| pbxt_sweeper_priority        | 2     |
> >| pbxt_transaction_buffer_size | 8M    |
> >+------------------------------+-------+
> >
> >select * from statistics;
> >
> >+----+-----------------------+-------------+
> >|  1 | Current Time          |  1303576276 |
> >|  2 | Time Since Last Call  | 47879149862 |
> >|  3 | Commit Count          |           0 |
> >|  4 | Rollback Count        |           0 |
> >|  5 | Wait for Xact Count   |           0 |
> >|  6 | Dirty Xact Count      |           4 |
> >|  7 | Read Statements       |           0 |
> >|  8 | Write Statements      |           0 |
> >|  9 | Record Bytes Read     |  3420546564 |
> >| 10 | Record Bytes Written  |   442544326 |
> >| 11 | Record File Flushes   |           9 |
> >| 12 | Record Flush Time     |    30046591 |
> >| 13 | Record Cache Hits     |    28554990 |
> >| 14 | Record Cache Misses   |      234060 |
> >| 15 | Record Cache Frees    |      146298 |
> >| 16 | Record Cache Usage    |  2898603144 |
> >| 17 | Index Bytes Read      |  4162967552 |
> >| 18 | Index Bytes Written   |  2231254528 |
> >| 19 | Index File Flushes    |         159 |
> >| 20 | Index Flush Time      |  4939549599 |
> >| 21 | Index Cache Hits      |   744372857 |
> >| 22 | Index Cache Misses    |     1826952 |
> >| 23 | Index Cache Usage     |  3221225472 |
> >| 24 | Index Log Bytes In    |  3944771819 |
> >| 25 | Index Log Bytes Out   |   681640564 |
> >| 26 | Index Log File Syncs  |         639 |
> >| 27 | Index Log Sync Time   |  2048645036 |
> >| 28 | Xact Log Bytes In     |  2523484015 |
> >| 29 | Xact Log Bytes Out    |   873305600 |
> >| 30 | Xact Log File Syncs   |       10315 |
> >| 31 | Xact Log Sync Time    |  3357323411 |
> >| 32 | Xact Log Cache Hits   |       20337 |
> >| 33 | Xact Log Cache Misses |      209087 |
> >| 34 | Xact Log Cache Usage  |  1073708456 |
> >| 35 | Data Log Bytes In     |           0 |
> >| 36 | Data Log Bytes Out    |           0 |
> >| 37 | Data Log File Syncs   |           0 |
> >| 38 | Data Log Sync Time    |           0 |
> >| 39 | Bytes to Checkpoint   |  6818078139 |
> >| 40 | Log Bytes to Write    |   125051695 |
> >| 41 | Log Bytes to Sweep    |  6818078139 |
> >| 42 | Sweeper Wait on Xact  |           0 |
> >| 43 | Index Scan Count      |           1 |
> >| 44 | Table Scan Count      |           0 |
> >| 45 | Select Row Count      |           1 |
> >| 46 | Insert Row Count      |           0 |
> >| 47 | Update Row Count      |           0 |
> >| 48 | Delete Row Count      |           0 |
> >+----+-----------------------+-------------+
> >
> >So why can't I access the table?
> >What did I do wrong?
> >
> >Regards
> >Erkan
> >
> >-- 
> >above the borders, freedom must surely be cloudless
> >
> >_______________________________________________
> >Mailing list: https://launchpad.net/~pbxt-discuss
> >Post to     : pbxt-discuss@xxxxxxxxxxxxxxxxxxx
> >Unsubscribe : https://launchpad.net/~pbxt-discuss
> >More help   : https://help.launchpad.net/ListHelp
> 
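
To put a number on how slow the recovery in the quoted log was, the start and end timestamps can be combined with the "bytes read" figure. The sketch below uses only values from that log; the function name recovery_rate is hypothetical, and the timestamp format is inferred from the zero-padded log lines.

```python
from datetime import datetime

# Format of the zero-padded timestamps in the quoted error log (inferred).
LOG_TIME_FORMAT = "%y%m%d %H:%M:%S"


def recovery_rate(start: str, end: str, bytes_read: int):
    """Return (elapsed seconds, bytes per second) for a recovery run."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("end must be later than start")
    return elapsed, bytes_read / elapsed


# "Recovering from 1-69" at 05:11:07, "Recovering complete" at 10:11:00.
elapsed, rate = recovery_rate("110423 05:11:07", "110423 10:11:00", 6844334523)
print(f"{elapsed:.0f} s elapsed, {rate / 1024:.0f} KiB/s")
```

That works out to just under five hours for about 6.8 GB of log, or roughly 370 KiB/s.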

-- 
above the borders, freedom must surely be cloudless

