
cf-charmers team mailing list archive

Re: Working deployment

 

The orchestrator unit (cloudfoundry/0) has to download (pre-cache) all
of the artifacts from AWS before it can deploy them to the other
charms.  It has logic to retry those downloads as much as possible,
but if it ultimately fails, it will abort, because none of the other
charms will work without those artifacts.  Note that these artifacts
can total several GB, hence the 20G root-disk constraint.  They will
also take a long time to download from the S3 bucket.  If any charms
end up in "error" state due to a failed download, you can just run
`juju resolved --retry <unit>` to try the download again.
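For a sense of what that retry logic looks like, here is a rough
sketch; the function and parameter names are made up for illustration
and are not the charm's actual code:

    # Illustrative retry loop for a large artifact download (Python 2,
    # as the charm hooks were at the time); all names are hypothetical.
    import time
    import urllib2

    def fetch_with_retries(url, dest, attempts=5, delay=30):
        for attempt in range(1, attempts + 1):
            try:
                response = urllib2.urlopen(url)
                with open(dest, 'wb') as f:
                    while True:
                        chunk = response.read(1024 * 1024)  # 1MB chunks
                        if not chunk:
                            break
                        f.write(chunk)
                return
            except IOError:
                if attempt == attempts:
                    # Give up; the hook fails, the unit goes to "error",
                    # and `juju resolved --retry` re-runs the download.
                    raise
                time.sleep(delay * attempt)  # back off before retrying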

But it sounds like you got past that step, eventually.  The
OrchestratorRelation is only added once all of the dependent charms
are deployed and make it to the "started" state.  If any of the
sub-charms are in "error," "pending," or any other non-"started"
state, the OrchestratorRelation will not be added to any of the
sub-charms and they will all block on "Incomplete relation:
OrchestratorRelation".

Check `juju status` (or `juju pprint` if you have that plugin
installed; it is recommended) to see whether any units are in "error"
state or stuck in "pending".  Since it looks like you're deploying to
LXC, I think you might be running into this Juju Core bug:
https://bugs.launchpad.net/juju-core/+bug/1354027
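If you want to scan for stuck units quickly, something like this works
against Juju 1.x status output (just an illustrative snippet, not part
of the charm):

    # List every unit that is not in the "started" state.
    import subprocess
    import yaml

    status = yaml.safe_load(
        subprocess.check_output(['juju', 'status', '--format=yaml']))
    for svc in (status.get('services') or {}).values():
        for name, unit in (svc.get('units') or {}).items():
            state = unit.get('agent-state')
            if state != 'started':
                print('%s: %s %s' % (name, state,
                                     unit.get('agent-state-info', '')))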

To be honest, we've only tested this on AWS so far.  AWS avoids the
aforementioned Juju Core bug, and downloading the artifacts is *much*
faster there, since the traffic stays within Amazon's internal
network.  There are two items we were waiting on before testing on
LXC: 1) adding a density option to co-locate most or all of the
charms, and 2) splitting the artifacts to match the BOSH packages more
1-to-1, to reduce redundancy in the pre-fetch cache and cut download
time and disk-space usage.

Another change in the works to improve robustness is switching from
the orchestrator managing the deployment directly in its hooks to
having it handled by a reconciler service, which can be more proactive
about retries and other topology-healing actions.
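The rough shape of that reconciler (purely a sketch of the idea; the
service doesn't exist yet and all names here are hypothetical) is a
loop that compares the desired topology against observed state and
issues corrective actions, retrying on the next pass:

    # Hypothetical reconciler loop: observe, diff, correct, repeat.
    import time

    def reconcile_forever(desired, observe, apply_change, interval=60):
        while True:
            actual = observe()  # e.g., parsed `juju status` output
            for unit, want in desired.items():
                if actual.get(unit) != want:
                    try:
                        apply_change(unit, want)  # e.g., retry a hook
                    except Exception:
                        pass  # defer to the next pass, don't abort
            time.sleep(interval)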

TL;DR: You'd have better luck testing this on AWS for the time being.

On Fri, Aug 15, 2014 at 6:02 AM, Alexander Lomov <lomov.as@xxxxxxxxx> wrote:
> Cory, I've updated the charm to the latest version (revision #126) and tried
> to deploy it.
>
> Here is a script that shows how I cleaned the environment and ran the
> deployment: https://gist.github.com/allomov/bfcad6ceed0740eb41c3 .
>
> Unfortunately, it didn't work. I can't see any haproxy or router processes
> inside the containers. Tailing the logs shows these lines:
>
> 2014-08-15 00:58:47 DEBUG juju-log nats:15: Incomplete relation:
> OrchestratorRelation
> 2014-08-15 00:58:59 DEBUG juju-log ltc:35: Incomplete relation:
> OrchestratorRelation
> 2014-08-15 00:59:02 DEBUG juju-log ltc:35: Incomplete relation:
> OrchestratorRelation
> 2014-08-15 00:59:04 DEBUG juju-log ltc:35: Incomplete relation:
> OrchestratorRelation
>
> By the way, the `config-changed` hook error can be caused by a network error.
>
>
> Thank you,
> Alex L.
>
>
>
>
>
> On 14 August 2014 13:17, Alexander Lomov <lomov.as@xxxxxxxxx> wrote:
>>
>> Hi, Cory. Nice to hear back from you.
>>
>> Just to let you know, I've updated cf.yml every time in my tests.
>>
>> Anyway, I will try it again with the latest version of the charm.
>>
>> Best wishes,
>> Alex L.
>>
>>
>> On 13 August 2014 23:25, Cory Johns <cory.johns@xxxxxxxxxxxxx> wrote:
>>>
>>> I should also note that, because it pre-caches all of the artifacts,
>>> the main cloudfoundry charm unit needs at least around 12GB of disk
>>> space.  The constraint should be added to the README.
>>>
>>> On Wed, Aug 13, 2014 at 4:22 PM, Cory Johns <cory.johns@xxxxxxxxxxxxx>
>>> wrote:
>>> > Fix pushed.
>>> >
>>> > On Wed, Aug 13, 2014 at 3:57 PM, Cory Johns <cory.johns@xxxxxxxxxxxxx>
>>> > wrote:
>>> >> Sorry for the delay in the response; we had some overlap in people
>>> >> being out recently.
>>> >>
>>> >> It looks like that error is caused by not setting the admin_secret
>>> >> value in the config.  It should be more gracefully handled, but if you
>>> >> follow the instructions in the README.md for generating the config.yml
>>> >> file prior to deploying the charm, you can work around it.
>>> >>
>>> >> I will push a fix for handling it more gracefully in a few minutes.
>>> >>
>>> >> On Fri, Aug 8, 2014 at 2:08 AM,  <prismakov@xxxxxxxxx> wrote:
>>> >>> Hi guys,
>>> >>>
>>> >>> Ubuntu 14.04.1 LTS
>>> >>> local juju environment
>>> >>> cloudfoundry charm rev 115
>>> >>>
>>> >>>  cloudfoundry:
>>> >>>    charm: local:trusty/cloudfoundry-0
>>> >>>    exposed: false
>>> >>>    units:
>>> >>>      cloudfoundry/0:
>>> >>>        agent-state: error
>>> >>>        agent-state-info: 'hook failed: "config-changed"'
>>> >>>        agent-version: 1.20.1.1
>>> >>>        machine: "1"
>>> >>>        public-address: 10.0.3.226
>>> >>>
>>> >>>
>>> >>>
>>> >>> 2014-08-08 05:50:10 INFO config-changed Traceback (most recent call last):
>>> >>> 2014-08-08 05:50:10 INFO config-changed   File "/var/lib/juju/agents/unit-cloudfoundry-0/charm/hooks/config-changed", line 145, in <module>
>>> >>> 2014-08-08 05:50:10 INFO config-changed     manage()
>>> >>> 2014-08-08 05:50:10 INFO config-changed   File "/var/lib/juju/agents/unit-cloudfoundry-0/charm/hooks/config-changed", line 139, in manage
>>> >>> 2014-08-08 05:50:10 INFO config-changed     manager.manage()
>>> >>> 2014-08-08 05:50:10 INFO config-changed   File "/var/lib/juju/agents/unit-cloudfoundry-0/charm/hooks/charmhelpers/core/services.py", line 113, in manage
>>> >>> 2014-08-08 05:50:10 INFO config-changed     self.provide_data()
>>> >>> 2014-08-08 05:50:10 INFO config-changed   File "/var/lib/juju/agents/unit-cloudfoundry-0/charm/hooks/charmhelpers/core/services.py", line 119, in provide_data
>>> >>> 2014-08-08 05:50:10 INFO config-changed     data = provider.provide_data()
>>> >>> 2014-08-08 05:50:10 INFO config-changed   File "/var/lib/juju/agents/unit-cloudfoundry-0/charm/hooks/cloudfoundry/contexts.py", line 383, in provide_data
>>> >>> 2014-08-08 05:50:10 INFO config-changed     'domain': self.get_domain(),
>>> >>> 2014-08-08 05:50:10 INFO config-changed   File "/var/lib/juju/agents/unit-cloudfoundry-0/charm/hooks/cloudfoundry/contexts.py", line 351, in get_domain
>>> >>> 2014-08-08 05:50:10 INFO config-changed     env = APIEnvironment(creds['api_address'], creds['api_password'])
>>> >>> 2014-08-08 05:50:10 INFO config-changed KeyError: 'api_address'
>>> >>> 2014-08-08 05:50:11 ERROR juju.worker.uniter uniter.go:486 hook failed: exit status 1
>>> >>>
>>> >>> - Alex P.
>>> >>>
>>> >>>
>>> >>> On Aug 7, 2014, at 20:28, Alexander Lomov <lomov.as@xxxxxxxxx> wrote:
>>> >>>
>>> >>> Hi, Ben.
>>> >>>
>>> >>> I like the way the cloudfoundry charm is deployed now. Working with
>>> >>> bosh-template is a good idea.
>>> >>>
>>> >>> Unfortunately I wasn't able to deploy it yesterday using the standard
>>> >>> instructions from the README file. At first sight the reason is the
>>> >>> router: the router process isn't started. Today I will try with the
>>> >>> updated charm.
>>> >>>
>>> >>> Best wishes,
>>> >>> Alex L.
>>> >>>
>>> >>>
>>> >>>
>>> >>> On 1 August 2014 03:30, Benjamin Saller
>>> >>> <benjamin.saller@xxxxxxxxxxxxx>
>>> >>> wrote:
>>> >>>>
>>> >>>> Barring some minor issues, we have a recent (!73) deployment
>>> >>>> working, in that we've been able to deploy some basic apps.
>>> >>>>
>>> >>>> This is a big milestone for us. Thanks to the team, and thanks to
>>> >>>> Jim and company for helping us resolve a couple of facepalm issues
>>> >>>> stemming from differences between the trusty cloud image and
>>> >>>> stemcells (very minor but hard to identify).
>>> >>>>
>>> >>>> We still have an open issue with the application of cgroups memory
>>> >>>> quotas, which we've patched around for the time being; we will work
>>> >>>> towards a long-lived fix in the near term.
>>> >>>>
>>> >>>> We will also record and publish the costs/difficulty of adding
>>> >>>> support for the 175/176 cf-releases; we don't expect any real issues
>>> >>>> there, as the current codebase is designed around handling this type
>>> >>>> of change.
>>> >>>>
>>> >>>> I'll be in Germany next week, representing the Canonical team's
>>> >>>> work to some of our internal stakeholders, but we expect to be able
>>> >>>> to push things forward faster now that we have a working base.
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Ben
>>> >>>>