trafodion-firefighters team mailing list archive
Message #00382
jdbc_test failures - Daily build 2015-04-15 08:30:00 UTC of trafodion/core -- Test Failures
I created a bug ID earlier today and was gathering some info. I'm trying to check whether we started noticing these failures after the change in the Cloudera config on 10th April.
https://bugs.launchpad.net/bugs/1444599
In this failure I am also noticing the following:
http://logs.trafodion.org/daily/jdbc_test-cm5.3/89649db/
From console output (the Batch ALL test times out; it usually completes in 2-3 mins):
2015-04-15 12:36:17 Batch : total of 6000 rows inserted
2015-04-15 12:36:22 Batch : Passed
2015-04-15 13:17:41 Build timed out (after 90 minutes). Marking the build as failed.
2015-04-15 13:17:41 Build was aborted
2015-04-15 13:17:41 [PostBuildScript] - Execution post build scripts.
In dtm logs:
2015-04-15 12:41:12,775 ERROR transactional.TransactionManager: doCommitX, received incorrect result size: 0
2015-04-15 12:41:12,790 ERROR transactional.TransactionManager: doCommitX, received incorrect result size: 0
2015-04-15 12:41:12,791 ERROR transactional.TransactionManager: doCommitX, received incorrect result size: 0
2015-04-15 12:41:12,794 ERROR transactional.TransactionManager: doCommitX, received incorrect result size: 0
2015-04-15 12:41:12,801 ERROR transactional.TransactionManager: doCommitX, received incorrect result size: 0
In master logs:
2015-04-15 12:37:23,454 INFO org.apache.hadoop.hbase.master.RegionStates: Onlined be079188e0d87f47c814f81aa320ed43 on slave-cm53.trafodion.org,60020,1429100734898
2015-04-15 12:41:15,009 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [slave-cm53.trafodion.org,60020,1429100734898]
2015-04-15 12:41:15,017 INFO org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Splitting hbase:meta logs for slave-cm53.trafodion.org,60020,1429100734898
2015-04-15 12:41:15,087 INFO org.apache.hadoop.hbase.master.SplitLogManager: dead splitlog workers [slave-cm53.trafodion.org,60020,1429100734898]
2015-04-15 12:41:15,090 INFO org.apache.hadoop.hbase.master.SplitLogManager: started splitting 1 logs in [hdfs://slave-cm53.trafodion.org:8020/hbase/WALs/slave-cm53.trafodion.org,60020,1429100734898-splitting]
2015-04-15 12:41:15,417 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1 tasks={/hbase/splitWAL/WALs%2Fslave-cm53.trafodion.org%2C60020%2C1429100734898-splitting%2Fslave-cm53.trafodion.org%252C60020%252C1429100734898.1429100750787.meta=last_update = -1 last_version = -1 cur_worker_name = null status = in_progress incarnation = 0 resubmits = 0 batch = installed = 1 done = 0 error = 0}
.......
2015-04-15 13:17:55,035 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1 tasks={/hbase/splitWAL/WALs%2Fslave-cm53.trafodion.org%2C60020%2C1429100734898-splitting%2Fslave-cm53.trafodion.org%252C60020%252C1429100734898.1429100750787.meta=last_update = -1 last_version = -1 cur_worker_name = null status = in_progress incarnation = 0 resubmits = 0 batch = installed = 1 done = 0 error = 0}
2015-04-15 13:18:00,036 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1 tasks={/hbase/splitWAL/WALs%2Fslave-cm53.trafodion.org%2C60020%2C1429100734898-splitting%2Fslave-cm53.trafodion.org%252C60020%252C1429100734898.1429100750787.meta=last_update = -1 last_version = -1 cur_worker_name = null status = in_progress incarnation = 0 resubmits = 0 batch = installed = 1 done = 0 error = 0}
2015-04-15 13:18:05,037 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1 tasks={/hbase/splitWAL/WALs%2Fslave-cm53.trafodion.org%2C60020%2C1429100734898-splitting%2Fslave-cm53.trafodion.org%252C60020%252C1429100734898.1429100750787.meta=last_update = -1 last_version = -1 cur_worker_name = null status = in_progress incarnation = 0 resubmits = 0 batch = installed = 1 done = 0 error = 0}
2015-04-15 13:18:11,037 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1 tasks={/hbase/splitWAL/WALs%2Fslave-cm53.trafodion.org%2C60020%2C1429100734898-splitting%2Fslave-cm53.trafodion.org%252C60020%252C1429100734898.1429100750787.meta=last_update = -1 last_version = -1 cur_worker_name = null status = in_progress incarnation = 0 resubmits = 0 batch = installed = 1 done = 0 error = 0}
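A minimal sketch for measuring how long the split task in the master log stayed unassigned, assuming the status lines keep the format shown above. The names `STATUS_RE`, `stuck_split_duration`, and `SAMPLE` are hypothetical, and the sample lines abbreviate the task map with `...`; only the timestamps and counters are taken from the log excerpt.

```python
import re
from datetime import datetime

# Timestamp plus the counters of one SplitLogManager status line.
STATUS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO "
    r"org\.apache\.hadoop\.hbase\.master\.SplitLogManager: "
    r"total tasks = (?P<total>\d+) unassigned = (?P<unassigned>\d+)"
    r".*?done = (?P<done>\d+) error = (?P<error>\d+)"
)


def stuck_split_duration(lines):
    """Seconds between the first and last status line in which every task
    is still unassigned and none is done; None if no such line exists."""
    stamps = []
    for line in lines:
        m = STATUS_RE.match(line)
        if m is None:
            continue
        if m["total"] == m["unassigned"] and int(m["done"]) == 0:
            stamps.append(datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S"))
    if not stamps:
        return None
    return (max(stamps) - min(stamps)).total_seconds()


# First and last status lines from the excerpt, task map abbreviated.
SAMPLE = [
    "2015-04-15 12:41:15,417 INFO org.apache.hadoop.hbase.master.SplitLogManager: "
    "total tasks = 1 unassigned = 1 tasks={... status = in_progress incarnation = 0 "
    "resubmits = 0 batch = installed = 1 done = 0 error = 0}",
    "2015-04-15 13:18:11,037 INFO org.apache.hadoop.hbase.master.SplitLogManager: "
    "total tasks = 1 unassigned = 1 tasks={... status = in_progress incarnation = 0 "
    "resubmits = 0 batch = installed = 1 done = 0 error = 0}",
]

print(stuck_split_duration(SAMPLE))
```

On these two lines the task is unassigned for the whole span between 12:41:15 and 13:18:11, which covers the time at which the build was aborted.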
Also the following in the regionmaster logs - indicating a restart.
2015-04-15 12:38:35,420 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1090ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=1276ms
2015-04-15 12:38:40,408 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1120ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=1216ms
2015-04-15 12:38:42,108 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1199ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=1230ms
2015-04-15 12:38:45,087 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1285ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=1306ms
...
2015-04-15 12:40:46,094 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1006ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=3 time=4681ms
2015-04-15 12:41:03,521 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 27833ms for sessionid 0x14cbd0a059c0000, closing socket connection and attempting reconnect
2015-04-15 12:41:14,898 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh5.3.1--1, built on 01/28/2015 00:41 GMT
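To compare the logged JVM pauses with the ZooKeeper session silence, the lines above can be reduced to a few numbers. This is only a sketch that assumes the message formats shown in the excerpt; `PAUSE_RE`, `ZK_RE`, `summarize`, and `SAMPLE` are hypothetical names.

```python
import re

# Pause length reported by JvmPauseMonitor, in milliseconds.
PAUSE_RE = re.compile(r"JvmPauseMonitor: .*pause of approximately (\d+)ms")
# Silence reported by the ZooKeeper client when its session times out.
ZK_RE = re.compile(
    r"ClientCnxn: Client session timed out, "
    r"have not heard from server in (\d+)ms"
)


def summarize(lines):
    """Collect JVM pause lengths and ZooKeeper silence intervals (ms)."""
    pauses, silences = [], []
    for line in lines:
        m = PAUSE_RE.search(line)
        if m:
            pauses.append(int(m.group(1)))
        m = ZK_RE.search(line)
        if m:
            silences.append(int(m.group(1)))
    return {
        "pause_count": len(pauses),
        "longest_pause_ms": max(pauses, default=0),
        "zk_silence_ms": silences,
    }


# Three lines copied from the excerpt above.
SAMPLE = [
    "2015-04-15 12:38:35,420 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: "
    "Detected pause in JVM or host machine (eg GC): pause of approximately 1090ms",
    "2015-04-15 12:40:46,094 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: "
    "Detected pause in JVM or host machine (eg GC): pause of approximately 1006ms",
    "2015-04-15 12:41:03,521 INFO org.apache.zookeeper.ClientCnxn: Client session "
    "timed out, have not heard from server in 27833ms for sessionid "
    "0x14cbd0a059c0000, closing socket connection and attempting reconnect",
]

print(summarize(SAMPLE))
```

Run against the full log, this puts the individual pauses of roughly one second next to the 27833 ms silence reported at 12:41:03.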
From: Trafodion-firefighters [mailto:trafodion-firefighters-bounces+arvind.narain=hp.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Chen, Alice (Trafodion)
Sent: Wednesday, April 15, 2015 11:26 AM
To: Varnau, Steve (Trafodion); Johnson, Stacey; trafodion-firefighters@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Trafodion-firefighters] Daily build 2015-04-15 08:30:00 UTC of trafodion/core -- Test Failures
The Phoenix part1 and part2 T4 tests on Cloudera have also been timing out more frequently.
https://jenkins02.trafodion.org/job/phoenix_part1_T4-cm5.3/buildTimeTrend
https://jenkins02.trafodion.org/job/phoenix_part2_T4-cm5.3/buildTimeTrend
Cheers,
Alice
From: Trafodion-firefighters [mailto:trafodion-firefighters-bounces+alice.chen=hp.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Varnau, Steve (Trafodion)
Sent: Wednesday, April 15, 2015 11:13 AM
To: Johnson, Stacey; trafodion-firefighters@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Trafodion-firefighters] Daily build 2015-04-15 08:30:00 UTC of trafodion/core -- Test Failures
Looks like the jdbc_test-cm5.3 job has been timing out more frequently since Monday evening, which is also affecting some check/gate jobs.
https://jenkins02.trafodion.org/job/jdbc_test-cm5.3/buildTimeTrend
-Steve
From: Trafodion-firefighters [mailto:trafodion-firefighters-bounces+steve.varnau=hp.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Johnson, Stacey
Sent: Wednesday, April 15, 2015 10:59
To: trafodion-firefighters@xxxxxxxxxxxxxxxxxxx
Subject: [Trafodion-firefighters] Daily build 2015-04-15 08:30:00 UTC of trafodion/core -- Test Failures
Build failed.
- traf-pub-release-ahw2.2 http://logs.trafodion.org/daily/traf-pub-release-ahw2.2/bbeb7f2 : SUCCESS in 42m 23s
- traf-pub-debug-ahw2.2 http://logs.trafodion.org/daily/traf-pub-debug-ahw2.2/f95089a : SUCCESS in 34m 00s
- core-regress-core-cm5.3 http://logs.trafodion.org/daily/core-regress-core-cm5.3/81e8681 : SUCCESS in 2h 43m 40s
- core-regress-core-ahw2.2 http://logs.trafodion.org/daily/core-regress-core-ahw2.2/cb82357 : SUCCESS in 2h 13m 51s
- core-regress-charsets-cm5.3 http://logs.trafodion.org/daily/core-regress-charsets-cm5.3/99c65ee : SUCCESS in 1h 27m 25s
- core-regress-charsets-ahw2.2 http://logs.trafodion.org/daily/core-regress-charsets-ahw2.2/8b75695 : SUCCESS in 1h 43m 07s
- core-regress-qat-cm5.3 http://logs.trafodion.org/daily/core-regress-qat-cm5.3/66d1ced : SUCCESS in 1h 21m 22s
- core-regress-qat-ahw2.2 http://logs.trafodion.org/daily/core-regress-qat-ahw2.2/3246e89 : SUCCESS in 1h 30m 39s
- core-regress-udr-cm5.3 http://logs.trafodion.org/daily/core-regress-udr-cm5.3/a5288d2 : SUCCESS in 1h 14m 09s
- core-regress-udr-ahw2.2 http://logs.trafodion.org/daily/core-regress-udr-ahw2.2/cf87e05 : SUCCESS in 1h 26m 56s
- core-regress-catman1-cm5.3 http://logs.trafodion.org/daily/core-regress-catman1-cm5.3/088cd53 : SUCCESS in 2h 24m 35s
- core-regress-catman1-ahw2.2 http://logs.trafodion.org/daily/core-regress-catman1-ahw2.2/180b5cd : SUCCESS in 2h 35m 40s
- core-regress-compGeneral-cm5.3 http://logs.trafodion.org/daily/core-regress-compGeneral-cm5.3/c7926f1 : FAILURE in 2h 28m 38s
- core-regress-compGeneral-ahw2.2 http://logs.trafodion.org/daily/core-regress-compGeneral-ahw2.2/d52a532 : FAILURE in 2h 07m 51s
- core-regress-executor-cm5.3 http://logs.trafodion.org/daily/core-regress-executor-cm5.3/aa8bc02 : FAILURE in 4h 01m 56s
- core-regress-executor-ahw2.2 http://logs.trafodion.org/daily/core-regress-executor-ahw2.2/9e48377 : SUCCESS in 2h 18m 16s
- core-regress-fullstack2-cm5.3 http://logs.trafodion.org/daily/core-regress-fullstack2-cm5.3/255525d : SUCCESS in 58m 30s
- core-regress-fullstack2-ahw2.2 http://logs.trafodion.org/daily/core-regress-fullstack2-ahw2.2/621e99a : SUCCESS in 1h 06m 04s
- core-regress-hive-cm5.3 http://logs.trafodion.org/daily/core-regress-hive-cm5.3/261f8c7 : FAILURE in 1h 46m 16s
- core-regress-hive-ahw2.2 http://logs.trafodion.org/daily/core-regress-hive-ahw2.2/69fba37 : FAILURE in 2h 01m 26s
- core-regress-seabase-cm5.3 http://logs.trafodion.org/daily/core-regress-seabase-cm5.3/09b2351 : FAILURE in 4h 01m 50s
- core-regress-seabase-ahw2.2 http://logs.trafodion.org/daily/core-regress-seabase-ahw2.2/3d8ddd6 : SUCCESS in 2h 08m 39s
- phoenix_part1_T4-cm5.3 http://logs.trafodion.org/daily/phoenix_part1_T4-cm5.3/be849f1 : SUCCESS in 2h 15m 05s
- phoenix_part2_T4-cm5.3 http://logs.trafodion.org/daily/phoenix_part2_T4-cm5.3/5d58449 : FAILURE in 3h 22m 13s
- phoenix_part1_T4-ahw2.2 http://logs.trafodion.org/daily/phoenix_part1_T4-ahw2.2/9a6bbf3 : SUCCESS in 2h 16m 27s
- phoenix_part2_T4-ahw2.2 http://logs.trafodion.org/daily/phoenix_part2_T4-ahw2.2/a7011cf : SUCCESS in 2h 18m 47s
- phoenix_part1_T2-cm5.3 http://logs.trafodion.org/daily/phoenix_part1_T2-cm5.3/35a7a97 : FAILURE in 49m 00s (non-voting)
- phoenix_part2_T2-cm5.3 http://logs.trafodion.org/daily/phoenix_part2_T2-cm5.3/7261156 : FAILURE in 52m 43s (non-voting)
- phoenix_part1_T2-ahw2.2 http://logs.trafodion.org/daily/phoenix_part1_T2-ahw2.2/9f9c698 : FAILURE in 57m 33s (non-voting)
- phoenix_part2_T2-ahw2.2 http://logs.trafodion.org/daily/phoenix_part2_T2-ahw2.2/2a7da36 : FAILURE in 1h 03m 10s (non-voting)
- pyodbc_test-cm5.3 http://logs.trafodion.org/daily/pyodbc_test-cm5.3/3133265 : SUCCESS in 59m 30s
- pyodbc_test-ahw2.2 http://logs.trafodion.org/daily/pyodbc_test-ahw2.2/80b0036 : SUCCESS in 1h 06m 46s
- jdbc_test-cm5.3 http://logs.trafodion.org/daily/jdbc_test-cm5.3/89649db : FAILURE in 1h 31m 44s
- jdbc_test-ahw2.2 http://logs.trafodion.org/daily/jdbc_test-ahw2.2/822b1ef : SUCCESS in 1h 19m 12s
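Since the question in this thread is whether the failures cluster on the Cloudera jobs, the result list above can be tallied per platform. The following is a sketch under the assumption that every result line follows the `- job url : STATUS in duration` pattern seen above; `RESULT_RE`, `voting_failures_by_platform`, and `SAMPLE` are hypothetical names, and the sample is a five-line subset of the list.

```python
import re
from collections import Counter

# One result line: "- <job> <url> : <STATUS> in <duration>[ (non-voting)]"
RESULT_RE = re.compile(
    r"^- (?P<job>\S+) (?P<url>\S+) : (?P<status>SUCCESS|FAILURE) "
    r"in (?P<duration>.+?)(?P<nonvoting> \(non-voting\))?$"
)


def voting_failures_by_platform(lines):
    """Count voting FAILURE results per platform suffix (e.g. cm5.3)."""
    counts = Counter()
    for line in lines:
        m = RESULT_RE.match(line.strip())
        if m is None:
            continue
        if m["status"] == "FAILURE" and m["nonvoting"] is None:
            # The platform is the part after the last hyphen of the job name.
            counts[m["job"].rsplit("-", 1)[1]] += 1
    return dict(counts)


# Subset of the result list above.
SAMPLE = [
    "- core-regress-executor-cm5.3 http://logs.trafodion.org/daily/core-regress-executor-cm5.3/aa8bc02 : FAILURE in 4h 01m 56s",
    "- core-regress-executor-ahw2.2 http://logs.trafodion.org/daily/core-regress-executor-ahw2.2/9e48377 : SUCCESS in 2h 18m 16s",
    "- phoenix_part1_T2-ahw2.2 http://logs.trafodion.org/daily/phoenix_part1_T2-ahw2.2/9f9c698 : FAILURE in 57m 33s (non-voting)",
    "- jdbc_test-cm5.3 http://logs.trafodion.org/daily/jdbc_test-cm5.3/89649db : FAILURE in 1h 31m 44s",
    "- jdbc_test-ahw2.2 http://logs.trafodion.org/daily/jdbc_test-ahw2.2/822b1ef : SUCCESS in 1h 19m 12s",
]

print(voting_failures_by_platform(SAMPLE))
```

Non-voting jobs are excluded here because they do not affect the overall build status; dropping that condition would count them as well.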
