ubuntu-translations-coordinators team mailing list archive

[Bug 1477225] [NEW] ceph-radosgw restart fails

 

You have been subscribed to a public bug:

Upstream Bug: http://tracker.ceph.com/issues/11140

[Impact]

On 14.04 the restart target of the sysvinit script brings the service down
but sometimes fails to bring it back up again. There is a race between stop
and start: in the failure case, the attempt to bring the service up runs
before the service has finished stopping, finds it still running, and so the
start command is never issued.

The proposed fix updates /etc/init.d/radosgw so that the stop target
waits for up to 30 seconds for the service to stop cleanly.
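
As a rough illustration of the "wait for up to 30 seconds" idea, here is a minimal sketch in POSIX sh. It is not the actual patch applied to /etc/init.d/radosgw; the function name wait_for_exit and its arguments are hypothetical, and it assumes the PID of the daemon is already known to the caller.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual Ubuntu patch: block until a process
# has exited, polling once per second, and give up after a timeout.
wait_for_exit() {
    pid="$1"            # PID of the daemon being stopped (assumed known)
    timeout="${2:-30}"  # seconds to wait; 30 matches the bug description
    waited=0
    # kill -0 sends no signal; it only reports whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still running after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0            # process is gone, so a following start is safe
}
```

A stop target built this way would send the termination signal and then call something like wait_for_exit "$pid" 30 before returning, so that the start half of a restart no longer sees the old process as "already running".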

[Test Case]

Bundle:

openstack-services:
  services:
    mysql:
      branch: lp:~openstack-charmers/charms/trusty/percona-cluster/next
      constraints: mem=1G
      options:
        dataset-size: 50%
    ceph:
      branch: lp:~openstack-charmers/charms/trusty/ceph/next
      num_units: 3
      constraints: mem=1G
      options:
        monitor-count: 3
        fsid: 6547bd3e-1397-11e2-82e5-53567c8d32dc
        monitor-secret: AQCXrnZQwI7KGBAAiPofmKEXKxu5bUzoYLVkbQ==
        osd-devices: /dev/vdb
        osd-reformat: "yes"
        ephemeral-unmount: /mnt
    keystone:
      branch: lp:~openstack-charmers/charms/trusty/keystone/next
      constraints: mem=1G
      options:
        admin-password: openstack
        admin-token: ubuntutesting
    ceph-radosgw:
      branch: lp:~openstack-charmers/charms/trusty/ceph-radosgw/next
      options:
        use-embedded-webserver: True
  relations:
    - [ keystone, mysql ]
    - [ ceph-radosgw, keystone ]
    - [ ceph-radosgw, ceph ]
# kilo
trusty-kilo:
  inherits: openstack-services
  series: trusty
  overrides:
    openstack-origin: cloud:trusty-kilo
    source: cloud:trusty-kilo
trusty-icehouse:
  inherits: openstack-services
  series: trusty

$ juju-deployer -c next.yaml trusty-icehouse
$ juju ssh ceph-radosgw/0
$ sudo su -
# service radosgw status
/usr/bin/radosgw is running.
# service radosgw restart
Starting client.radosgw.gateway...
/usr/bin/radosgw already running.
/usr/bin/radosgw is running.
# service radosgw status
/usr/bin/radosgw is not running.
# apt-cache policy radosgw
radosgw:
  Installed: 0.80.10-0ubuntu0.14.04.1
  Candidate: 0.80.10-0ubuntu0.14.04.1
  Version table:
 *** 0.80.10-0ubuntu0.14.04.1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     0.79-0ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
root@juju-lytrusty-machine-4:~#

[Regression Potential]

 * The only change in behaviour that would result from this change is that
   running the stop target in the init script will wait for up to 30s before
   exiting rather than returning immediately. I cannot think of any use cases
   where this would be an issue.

[Original Bug Report]
job handler:
Jul 22 16:03:44 job-handler-1 ERR Failed to execute job: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph@xxxxxxxxxx to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'#012Traceback (most recent call last):#012 File "/opt/canonical/landscape/canonical/landscape/model/activity/jobrunner.py", line 38, in run#012 yield self._run_activity(account_id, activity_id)#012HTTPError: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph@xxxxxxxxxx to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'

Other logs attached.

** Affects: ceph (Ubuntu)
     Importance: High
     Assignee: James Page (james-page)
         Status: Fix Released

** Affects: ceph (Ubuntu Trusty)
     Importance: High
     Assignee: Liam Young (gnuoy)
         Status: Fix Released

** Affects: ceph (Ubuntu Vivid)
     Importance: High
     Assignee: James Page (james-page)
         Status: Fix Released

** Affects: ceph (Ubuntu Wily)
     Importance: High
     Assignee: James Page (james-page)
         Status: Fix Released

** Affects: ubuntu-translations
     Importance: High
         Status: Invalid


** Tags: cloud-install-failure cpec kanban-cross-team verification-done
-- 
ceph-radosgw restart fails
https://bugs.launchpad.net/bugs/1477225
You received this bug notification because you are a member of Ubuntu Translations Coordinators, which is subscribed to Ubuntu Translations.