
stories for external test helpers (librarian, memcache, etc)

I've hit my tolerance level for stale librarian processes and am
looking to address this in the test environment. I want to make sure I
preserve existing use cases - please tell me if I have missed any.

1) Run an external process for the life of *a* test
* create a working config for it
* start it
* run test
* kill it
* clean up any overhead left in the work environment [that we care about]
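
For concreteness, case 1 in test code would look something like this
(LibrarianFixture and do_test_against are hypothetical names; the
setUp/cleanUp contract is the Fixture one sketched later in this mail):

def test_upload(self):
    librarian = LibrarianFixture()
    librarian.setUp()        # create a working config, start the process
    try:
        do_test_against(librarian)
    finally:
        librarian.cleanUp()  # kill it, clean up the leftovers we care about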

2) For many tests
* create working config
* start it
-run tests-
 * check it's still OK and do per-test isolation
 * run a test
* kill it
* clean up
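
Sketched out (with 'suite' standing in for whatever drives the tests):

librarian = LibrarianFixture()
librarian.setUp()
try:
    for test in suite:
        librarian.reset()    # check it's still OK, isolate from the last test
        test.run()
finally:
    librarian.cleanUp()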

3) For 'make run'
* Use a small commandline tool that
 * creates working config
 * starts it
 * waits for SIGINT
 * stops it
 * cleans up

4) for concurrent testing
* As for many tests, but the creation of a working config needs to be
safe in the presence of concurrent activity.
* The created config needs to be sharable with *other* external
processes (e.g. the buildmaster may want to talk to the librarian)
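
One way to get that safety (a sketch, not a commitment to these names)
is to give every instance a private work area and let the kernel pick
free ports:

import socket
import tempfile

def make_unique_config():
    root = tempfile.mkdtemp(prefix='lp-service-')  # unique per instance
    sock = socket.socket()
    sock.bind(('127.0.0.1', 0))   # port 0: the kernel assigns a free port
    port = sock.getsockname()[1]
    sock.close()
    return {'root': root, 'port': port}

(Ideally the service itself binds port 0 and reports what it got, to
avoid the window where another process could grab the port first.)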

5) For low-overhead iteration
* Find an existing external process
* Must 'know' its config a priori
-run tests-
 * check the process is running, do per-test isolation
 * run a test
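
The 'check the process is running' step can be as small as this,
assuming the pid file layout proposed further down:

import os

def service_is_running(pidfile_path):
    pid = int(open(pidfile_path).read().strip())
    try:
        os.kill(pid, 0)   # signal 0 checks for existence without side effects
    except OSError:
        return False
    return True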

6) Start a particular server in production
 * I think we should probably -not- have this as a use case: server
management, rotation, graceful setup and tear down are much more
complex than in a testing environment. Instead we may need some
supporting logic around this, in the server bring up/tear down code,
but at least for now that should be considered a separate problem.


If the above set is complete, then I am proposing to combine things in
the following way.
Firstly, because it's a good building block, the 'make run' use case.
Note that the current code is duplicative/overlapping with the test
helper code - I'm proposing to consolidate this. This shouldn't look
too different to our runlaunchpad Service today, except that we'll
have more entry points (to do cleanup etc).
 - the small helper will do the following for a given service:
   start up the instance
   optionally print a bash snippet with variables (like ssh-agent
   does), including the helper's pid
     - this is useful for running up isolated copies
 - main process runs

This lets us capture useful things from starting it off without
needing a well-known location a priori.
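
For instance (variable names and values illustrative only), starting
the librarian might emit:

LP_LIBRARIAN_CONTROL_PID 12345
LP_LIBRARIAN_CONFIGFILE /tmp/lp-librarian-Xa3f/librarian.conf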

We can even 'run' postgresql in this fashion, and have it return the
DB name to use.
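
(e.g. an extra LP_POSTGRES_DBNAME variable in the emitted snippet;
name illustrative.)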


Now, the cheap test iteration case can be addressed:
 - devs run eval `bin/run -i test --daemonise`
   - this outputs all the variables for all the servers started.
 - test code looks for *per-service* information about pid files etc.
   e.g. LP_LIBRARIAN_PIDFILE and LP_LIBRARIAN_CONFIGFILE rather than
LP_PERSISTENT_TEST_SERVICES
 - to kill, eval `bin/test-servers --stop`
   (Which will kill the daemonised wrapper, and unset the environment
variables).
 - If LP_PERSISTENT_TEST_SERVICES is set and a service isn't running,
I propose to error, because I think it usefully indicates a bug in
that external process, and this is easier than detecting both 'not
started yet' and 'started but crashed' - especially given the test
runner's tendency to fork sub-runners.
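
In fixture terms that check might look like this (the exception type is
a placeholder, and service_is_running is the sketch from use case 5):

import os

def find_persistent_config(label):
    # Only called when LP_PERSISTENT_TEST_SERVICES is set.
    pidfile = os.environ.get('LP_%s_PIDFILE' % label)
    configfile = os.environ.get('LP_%s_CONFIGFILE' % label)
    if not (pidfile and configfile and service_is_running(pidfile)):
        raise RuntimeError(
            '%s is not running: with LP_PERSISTENT_TEST_SERVICES set '
            'that indicates a bug' % label)
    return configfile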


Concurrent testing then is easy: as long as all the fixtures meet
this contract and the *default* behaviour is to bring up a unique
instance, everything will come up fine.


Note that in this model there is never a need to do more than 'kill
<helper-pid>' to shut something down: the helper process behind that
pid encapsulates all the cleanup logic (kill -9ing, dropdb of
temporary dbs, etc.), and the helper code should be simple and robust.
This will help give us a simple, robust interface.


In the python code, I think something like the following will do:

class ExternalService(Fixture):
    """An external service used by Launchpad.

    :ivar service_config: A config fragment with the variables for this
        service.
    :ivar service_label: The label for the service, e.g. LIBRARIAN.
        Used in generating environment variable names.
    """

    def setUp(self, use_existing=False, unique_instance=True):
        """Set up an external service.

        :param use_existing: If True, look for and use an existing instance.
        :param unique_instance: If False, use the LP_CONFIG service
            definition. Otherwise, create a new service definition for
            this instance, which can be found on self.service_config.
        """

    def reset(self):
        """Ensure the service is running and ready for another client.

        Any state accumulated since setUp is discarded or ignored (which
        is up to the service implementation).
        """

    def cleanUp(self):
        """Shut down the service and remove any state created as part
        of setUp."""
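
To make the two setUp flags concrete (LibrarianService is a
hypothetical subclass):

# Default: a private, throwaway instance - safe for concurrent runs.
service = LibrarianService()
service.setUp()

# Low-overhead iteration: attach to the long-running instance described
# by the LP_CONFIG service definition.
service = LibrarianService()
service.setUp(use_existing=True, unique_instance=False)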


The wrapper helper becomes something like the following (but with
private stdout/stderr to avoid race conditions and console spew).

import os
import signal
import sys

def wrapper_process(service):
    pid = os.fork()
    if pid:
        # Parent: wait for the child to report readiness, tell the caller
        # which pid to kill for shutdown, and get out of the way.
        wait_for_ok(..)
        print "LP_%s_CONTROL_PID %d" % (service.service_label, pid)
        sys.exit(0)
    # Child: make 'kill <pid>' unwind through the finally block instead
    # of terminating the process outright.
    signal.signal(signal.SIGTERM, lambda signum, frame: None)
    service.setUp()
    try:
        print "LP_%s_CONFIGFILE %s" % (service.service_label,
            stash_config(service.service_config))
        signal.pause()
    finally:
        service.cleanUp()

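Stopping a service from the shell is then just 'kill
$LP_LIBRARIAN_CONTROL_PID' (reusing the earlier illustrative name):
the SIGTERM handler makes pause() return, and cleanUp() runs on the
way out.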

Note that reset() with persistent test services is obviously a little
more limited.

What do you think?

(Note that atexit would be equivalent to the above code: it's not at
all useful except when a normal unwind occurs.)

-Rob


