maas-builds team mailing list archive
Message #00200
Re: Jenkins Still Failing - saucy-adt-maas-daily 51
This and the saucy daily job failed because lenovo-rd230-03 failed to
complete the boot sequence due to a memory/battery error. I disabled
StopOnError for that specific machine so boot can continue. Kicking off
another run of those two jobs to see if it has any effect.
On second thought, should MAAS be dealing with such hardware failures? For
instance, if I install MAAS to manage 100 nodes and one or two of those
nodes have issues during the boot sequence, how can MAAS help the user
realize there's a node that was supposed to come online and never did?
Maybe it's not MAAS's job to do it; maybe there are other tools to deal
with such situations. In any case, should I file a bug for this to be
considered in future versions of MAAS?
On third thought, maybe it's not a job for MAAS at all, since the hardware
failure happens before MAAS even knows the machine exists. We use IPMI to
boot up the nodes and start the enlistment phase, so if they don't show up
in MAAS as declared, then it's the DC person's responsibility to make sure
the hardware is working in the first place.
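One way to picture an external check for this, as a minimal sketch rather
than anything MAAS itself provides: compare an inventory of machines that
should enlist against the JSON printed by `maas-cli maas nodes list` (the
`macaddress_set` / `mac_address` fields visible in the quoted log below).
The function name `missing_nodes`, the inventory layout, the machine names,
and the second MAC address are all hypothetical, made up for illustration.

```python
import json


def missing_nodes(expected_macs, nodes_json):
    """Return the names of expected machines that have not enlisted.

    expected_macs: dict mapping a machine name to a list of its MAC
                   addresses (hypothetical inventory format).
    nodes_json:    JSON text shaped like the `nodes list` output in the
                   quoted log: a list of nodes, each with a
                   "macaddress_set" list of {"mac_address": ...} entries.
    """
    nodes = json.loads(nodes_json)
    # Collect every MAC address known to MAAS, normalised to lower case.
    seen = {
        mac["mac_address"].lower()
        for node in nodes
        for mac in node.get("macaddress_set", [])
    }
    # A machine counts as missing if none of its MACs were seen.
    return sorted(
        name
        for name, macs in expected_macs.items()
        if not any(m.lower() in seen for m in macs)
    )


# Trimmed sample in the shape of the quoted `nodes list` output.
sample_listing = json.dumps(
    [{"macaddress_set": [{"mac_address": "00:e0:81:dd:d4:11"}]}]
)

# Hypothetical inventory: names are invented, and the second MAC is
# taken from the range reserved for documentation.
inventory = {
    "machine-a": ["00:E0:81:DD:D4:11"],
    "machine-b": ["00:00:5e:00:53:01"],
}

print(missing_nodes(inventory, sample_listing))
```

A check like this would have to live outside MAAS and be fed by whoever
maintains the hardware inventory, which is consistent with the point above
that MAAS cannot know about a machine that never finished booting.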
What do you think?
Diogo
On Wed, Jan 29, 2014 at 10:41 AM, Jenkins Notification <
devnull@xxxxxxxxxxxxx> wrote:
> See http://d-jenkins.ubuntu-ci:8080/job/saucy-adt-maas-daily/51/
>
> [...truncated 4222 lines...]
> adt-run: & ubtree0t-maas-package-test:
> [----------------------------------------
> adt-run1: teeing to stdout:
> /tmp/adt-run.3lTU9g/ubtree0t-maas-package-test-testtmp/test_stdout, stderr:
> /tmp/adt-run.3lTU9g/ubtree0t-maas-package-test-testtmp/test_stderr
> networking stop/waiting
> networking start/running
> Ignoring indexes: https://pypi.python.org/simple/
> Downloading/unpacking nose-timer
> Running setup.py egg_info for package nose-timer
>
> Installing collected packages: nose-timer
> Running setup.py install for nose-timer
>
> Successfully installed nose-timer
> Cleaning up...
> maas-integration.TestMAASIntegration.test_create_admin ... ok
> maas-integration.TestMAASIntegration.test_restart_dbus_avahi ... ok
> maas-integration.TestMAASIntegration.test_update_maas_url ... ok
> maas-integration.TestMAASIntegration.test_restart_provisioning_server ...
> ok
> maas-integration.TestMAASIntegration.test_check_initial_services ... ok
> maas-integration.TestMAASIntegration.test_update_pxe_config ... ok
> maas-integration.TestMAASIntegration.test_import_pxe_files ... ok
> maas-integration.TestMAASIntegration.test_update_preseed_arm ... ok
> maas-integration.TestMAASIntegration.test_login_api ... ok
> maas-integration.TestMAASIntegration.test_set_http_proxy ... SKIP: Not
> testing proxy.
> maas-integration.TestMAASIntegration.test_cluster_connected ... ok
> maas-integration.TestMAASIntegration.test_set_up_dhcp_region ... ok
> maas-integration.TestMAASIntegration.test_check_dhcp_service ... ok
> maas-integration.TestMAASIntegration.test_set_up_dhcp_cluster ... SKIP:
> Not testing cluster controller
> maas-integration.TestMAASIntegration.test_update_dns_config ... ok
> maas-integration.TestMAASIntegration.test_boot_nodes_enlist ... ok
> maas-integration.TestMAASIntegration.test_check_nodes_declared ... ERROR
> SKIP: Not testing Cluster controller
>
> ======================================================================
> ERROR: maas-integration.TestMAASIntegration.test_check_nodes_declared
> ----------------------------------------------------------------------
> _StringException: Empty attachments:
> stderr for maas-cli maas ['nodes', 'list']
>
> retcode for maas-cli maas ['nodes', 'list']: {{{0}}}
> stdout for maas-cli maas ['nodes', 'list']: {{{
> [
> {
> "status": 0,
> "macaddress_set": [
> {
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-d68682e4-88e1-11e3-98fc-525400123456/macs/00:e0:81:dd:d4:11/",
> "mac_address": "00:e0:81:dd:d4:11"
> },
> {
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-d68682e4-88e1-11e3-98fc-525400123456/macs/00:e0:81:dd:d4:12/",
> "mac_address": "00:e0:81:dd:d4:12"
> }
> ],
> "hostname": "kbdap.master",
> "power_type": "ipmi",
> "routers": null,
> "netboot": true,
> "cpu_count": 0,
> "storage": 0,
> "system_id": "node-d68682e4-88e1-11e3-98fc-525400123456",
> "architecture": "amd64/generic",
> "memory": 0,
> "owner": null,
> "tag_names": [],
> "ip_addresses": [
> "192.168.21.11"
> ],
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-d68682e4-88e1-11e3-98fc-525400123456/"
> },
> {
> "status": 0,
> "macaddress_set": [
> {
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-d68f9744-88e1-11e3-9ec8-525400123456/macs/00:e0:81:d1:b1:47/",
> "mac_address": "00:e0:81:d1:b1:47"
> },
> {
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-d68f9744-88e1-11e3-9ec8-525400123456/macs/00:e0:81:d1:b1:48/",
> "mac_address": "00:e0:81:d1:b1:48"
> }
> ],
> "hostname": "8bj4d.master",
> "power_type": "ipmi",
> "routers": null,
> "netboot": true,
> "cpu_count": 0,
> "storage": 0,
> "system_id": "node-d68f9744-88e1-11e3-9ec8-525400123456",
> "architecture": "amd64/generic",
> "memory": 0,
> "owner": null,
> "tag_names": [],
> "ip_addresses": [
> "192.168.21.10"
> ],
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-d68f9744-88e1-11e3-9ec8-525400123456/"
> },
> {
> "status": 0,
> "macaddress_set": [
> {
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-ed8eef1c-88e1-11e3-98fc-525400123456/macs/00:e0:81:dd:d1:0b/",
> "mac_address": "00:e0:81:dd:d1:0b"
> },
> {
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-ed8eef1c-88e1-11e3-98fc-525400123456/macs/00:e0:81:dd:d1:0c/",
> "mac_address": "00:e0:81:dd:d1:0c"
> }
> ],
> "hostname": "97wmf.master",
> "power_type": "ipmi",
> "routers": null,
> "netboot": true,
> "cpu_count": 0,
> "storage": 0,
> "system_id": "node-ed8eef1c-88e1-11e3-98fc-525400123456",
> "architecture": "amd64/generic",
> "memory": 0,
> "owner": null,
> "tag_names": [],
> "ip_addresses": [
> "192.168.21.12"
> ],
> "resource_uri":
> "/MAAS/api/1.0/nodes/node-ed8eef1c-88e1-11e3-98fc-525400123456/"
> }
> ]
> }}}
>
> Traceback (most recent call last):
> File
> "/tmp/adt-run.3lTU9g/ubtree0-build/real-tree/debian/tests/utils.py", line
> 69, in wrapper
> result = func(*args, **kwargs)
> File
> "/tmp/adt-run.3lTU9g/ubtree0-build/real-tree/debian/tests/maas-integration.py",
> line 532, in test_check_nodes_declared
> self._wait_nodes(0)
> File
> "/tmp/adt-run.3lTU9g/ubtree0-build/real-tree/debian/tests/maas-integration.py",
> line 524, in _wait_nodes
> sleep(5)
> File
> "/tmp/adt-run.3lTU9g/ubtree0-build/real-tree/debian/tests/utils.py", line
> 63, in _handle_timeout
> raise TimeoutError(error_message)
> TimeoutError: Timer expired
>
>
> maas-integration.TestMAASIntegration.test_update_pxe_config: 0.0004s
> maas-integration.TestMAASIntegration.test_check_dhcp_service: 0.0235s
> maas-integration.TestMAASIntegration.test_update_preseed_arm: 0.0379s
> maas-integration.TestMAASIntegration.test_check_initial_services: 0.0412s
> maas-integration.TestMAASIntegration.test_restart_provisioning_server:
> 0.0691s
> maas-integration.TestMAASIntegration.test_restart_dbus_avahi: 0.1235s
> maas-integration.TestMAASIntegration.test_create_admin: 0.1400s
> maas-integration.TestMAASIntegration.test_cluster_connected: 0.2190s
> maas-integration.TestMAASIntegration.test_update_maas_url: 1.1906s
> maas-integration.TestMAASIntegration.test_update_dns_config: 1.3023s
> maas-integration.TestMAASIntegration.test_boot_nodes_enlist: 3.2524s
> maas-integration.TestMAASIntegration.test_login_api: 4.2223s
> maas-integration.TestMAASIntegration.test_set_up_dhcp_region: 4.4793s
> maas-integration.TestMAASIntegration.test_check_nodes_declared: 420.0331s
> maas-integration.TestMAASIntegration.test_import_pxe_files: 488.0530s
> ----------------------------------------------------------------------
> Ran 17 tests in 923.330s
>
> FAILED (SKIP=3, errors=1)
> adt-run1: testbed executing test finished with exit status 1
> adt-run: & ubtree0t-maas-package-test:
> ----------------------------------------]
> adt-run: & ubtree0t-maas-package-test: - - - - - - - - - - results - - -
> - - - - - - -
> ubtree0t-maas-package-test FAIL non-zero exit status 1
> adt-run1: ** needs_reset, previously=False
> adt-run: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ tests done.
> adt-run1: ** stop
> adt-run1: ** close,
> scratch=tb-scratch~/tmp/adt-run.3lTU9g:-/|/tmp/adt-run.3lTU9g/!
> + RC=4
> + [ 4 -eq 20 ]
> + [ 0 -eq 1 ]
> + [ -x /home/ubuntu/adt-export-result ]
> + RES=PASS
> + [ 4 -gt 0 ]
> + RES=FAIL
> + /home/ubuntu/adt-export-result -D /root/adt-log maas FAIL
> + chown -R ubuntu /root/adt-log /var/tmp/testresults
> + chmod og+r /var/log/syslog
> + ls /var/crash/
> + [ -n ]
> + exit 4
> Connection to localhost closed.
> + RET=4
> + [ 0 -eq 1 ]
> + [ 4 -gt 0 ]
> + log_failure_msg adt-run exited with status 4.
> + log_msg Failure: adt-run exited with status 4.\n
> + date +%F %X
> + printf 2014-01-29 07:40:57: Failure: adt-run exited with status 4.\n
> 2014-01-29 07:40:57: Failure: adt-run exited with status 4.
> + [ 0 -eq 0 ]
> + mkdir -p /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results
> + ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o
> CheckHostIP=no -i /var/tmp/adt/disks/adtkey -p 54323 -tt -o BatchMode=yes
> -l ubuntu localhost sudo chown -R ubuntu /root/adt-log; find /root/adt-log
> -type f -empty | xargs rm 2>/dev/null
> Warning: Permanently added '[localhost]:54323' (ECDSA) to the list of
> known hosts.
> find: `/root/adt-log': Permission denied
> Connection to localhost closed.
> + true
> + scp -r -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o
> CheckHostIP=no -i /var/tmp/adt/disks/adtkey -P 54323 ubuntu@localhost:/root/adt-log/*
> /var/crash/*crash /var/log/syslog /var/tmp/testresults
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results
> + true
> + log_info_msg Test artifacts copied to
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results
> + log_msg Info: Test artifacts copied to
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results\n
> + date +%F %X
> + printf 2014-01-29 07:40:59: Info: Test artifacts copied to
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results\n
> 2014-01-29 07:40:59: Info: Test artifacts copied to
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results
> + [ -f
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results/summary.log
> ]
> + [ -n 2014-01-29_12-21-05 ]
> + ls -tr
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results/*.result
> + tail -1
> + resfile=
> + [ -n ]
> + log_failure_msg Test didn't end normally. Generating error file
> + log_msg Failure: Test didn't end normally. Generating error file\n
> + date +%F %X
> + printf 2014-01-29 07:40:59: Failure: Test didn't end normally.
> Generating error file\n
> 2014-01-29 07:40:59: Failure: Test didn't end normally. Generating error
> file
> + date +%Y%m%d-%H%M%S
> +
> errfile=/home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results/saucy_amd64_maas_20140129-074059.error
> + echo saucy amd64 maas
> + rsync -a
> /home/ubuntu/jenkins-jobs/workspace/saucy-adt-maas-daily/results/saucy_amd64_maas_20140129-074059.error
> /saucy/tmp/
> rsync: mkdir "/saucy/tmp" failed: No such file or directory (2)
> rsync error: error in file IO (code 11) at main.c(674) [Receiver=3.1.0]
> + true
> + [ 0 -eq 0 ]
> + ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o
> CheckHostIP=no -i /var/tmp/adt/disks/adtkey -p 54323 -tt -o BatchMode=yes
> -l ubuntu localhost sudo poweroff
> Warning: Permanently added '[localhost]:54323' (ECDSA) to the list of
> known hosts.
> Connection to localhost closed.
> + exit 4
> + on_exit
> + log_info_msg Cleaning up
> + log_msg Info: Cleaning up\n
> + date +%F %X
> + printf 2014-01-29 07:40:59: Info: Cleaning up\n
> 2014-01-29 07:40:59: Info: Cleaning up
> + [ -f
> /var/tmp/adt/disks/run/saucy-amd64-maas-20140129_072133.9SVDNK.img.pid ]
> + cat
> /var/tmp/adt/disks/run/saucy-amd64-maas-20140129_072133.9SVDNK.img.pid
> + kill -9 2077
> + rm -f
> /var/tmp/adt/disks/run/saucy-amd64-maas-20140129_072133.9SVDNK.img.pid
> + rm -f /var/tmp/adt/disks/run/saucy-amd64-maas-20140129_072133.9SVDNK.img
> + rm -f
> /var/tmp/adt/disks/run/saucy-amd64-maas-20140129_072133.9SVDNK.img.monitor
> /var/tmp/adt/disks/run/saucy-amd64-maas-20140129_072133.9SVDNK.img.serial
> + rm -f /var/lock/adt/ssh.54323.lock
> + rm -f /var/lock/adt/vnc.5911.lock
> + [ -d /tmp/adt-amd64.BykFVM ]
> + rm -Rf /tmp/adt-amd64.BykFVM
> + rm -f /var/tmp/adt/disks/run/saucy-amd64-maas-*.img*
> + find /var/lock/adt -name *.lock -mtime +1
> + exit 4
> Build step 'Execute shell' marked build as failure
> Archiving artifacts
> Email was triggered for: Failure
> Sending email for trigger: Failure
>
>
> --
> Mailing list: https://launchpad.net/~maas-builds
> Post to : maas-builds@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maas-builds
> More help : https://help.launchpad.net/ListHelp
>
>
--
Diogo M. Matsubara