
openstack-qa-team team mailing list archive

Re: Thoughts on input fuzzing tests


On 06/12/2012 04:00 PM, Jay Pipes wrote:
On 06/12/2012 02:22 PM, Daryl Walleck wrote:
Due to the large number of input fuzzing tests that have been
submitted, I've been thinking of ways to reduce the amount of code
needed to achieve this (whether we should do it at all is a totally
different discussion). Rather than have x number of input tests for,
say, create server, wouldn't it be far easier to have a single create
server fuzz test that is data driven and accepts the desired inputs
and the expected exception? So instead of this (pseudo-coded things up
a bit):

https://gist.github.com/2919066

we could get the same effect with much less code by doing this:

https://gist.github.com/2919177

Regardless of implementation, I think the general idea of moving this
type of testing towards data driven functions would really help cut
down on redundant code.
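For illustration only (this is not taken from the linked gists), the data-driven idea above might look something like the following toy Python sketch. The client class, method name, and cases are all hypothetical stand-ins, not real Tempest APIs:

```python
# Hypothetical sketch of a data-driven negative test: one loop over
# (kwargs, expected-exception) rows instead of one test method per bad
# input. FakeServerClient is a toy stand-in for a real API client.

class FakeServerClient:
    """Toy client that rejects obviously bad create-server requests."""

    def create_server(self, name, flavor):
        if not name:
            raise ValueError("name must be non-empty")
        if flavor not in ("m1.tiny", "m1.small"):
            raise ValueError("unknown flavor")
        return {"name": name, "flavor": flavor}


# Each row: (keyword arguments, expected exception class or None).
CASES = [
    ({"name": "", "flavor": "m1.tiny"}, ValueError),
    ({"name": "ok", "flavor": "bogus"}, ValueError),
    ({"name": "ok", "flavor": "m1.tiny"}, None),
]


def run_fuzz_cases(client, cases):
    """Return a list of booleans: did each case behave as expected?"""
    results = []
    for kwargs, expected in cases:
        try:
            client.create_server(**kwargs)
            results.append(expected is None)
        except Exception as exc:
            results.append(expected is not None and isinstance(exc, expected))
    return results
```

Adding a new negative case is then one line of data rather than a new test method.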

Fuzz testing, I believe, is better done using a tool like randgen [1].

The basic strategy is to have a grammar document that describes the API
and then have a fuzz tester hammer the API with random bad and good
data, recording responses.

I've cc'd Patrick Crews, who is an expert on randgen and also works on
the OpenStack CI team, to see if he'd be interested in participating in
putting together a randgen grammar for OpenStack components and working
on some fuzz testing stuff in Tempest...

++ to this. I've been thinking about testing in this area since the SF dev summit : )

As an FYI, the randgen is a tool developed for testing database systems. In order to cover large amounts of ground quickly, the tool utilizes yacc-style grammars to define the 'playground'...the tool then randomly picks and chooses from the possibilities.

As an example, a grammar file with a rule like:
query:
    SELECT * FROM _table WHERE column_name comparison_operator value ;

can produce a lot of queries like:
SELECT * FROM table1 WHERE col_int > 9;
SELECT * FROM table99 WHERE col_char <= 'abbazabba';
SELECT * FROM table20 WHERE col_int_key = 'YHGNZ';

The intent is to produce a 'map' that can generate lots of test points.

It is deterministic (the same seed will always produce the same results) and can be tweaked (one can provide various --seed values to shake things up).
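The grammar-plus-seed idea can be sketched in a few lines of Python. This is a toy expander, not randgen itself (randgen is Perl); the grammar below just mirrors the SQL example above, and the key property shown is that the same seed always reproduces the same output:

```python
import random

# Toy grammar: rule name -> list of alternatives. Placeholders in
# curly braces refer to other rules; everything else is a terminal.
GRAMMAR = {
    "query": ["SELECT * FROM {table} WHERE {column} {op} {value} ;"],
    "table": ["table1", "table20", "table99"],
    "column": ["col_int", "col_char", "col_int_key"],
    "op": ["=", ">", "<="],
    "value": ["9", "'abbazabba'", "'YHGNZ'"],
}


def generate(rule, rng):
    """Pick one alternative for `rule` and expand placeholders recursively."""
    template = rng.choice(GRAMMAR[rule])
    while "{" in template:
        start = template.index("{")
        end = template.index("}", start)
        name = template[start + 1:end]
        template = template[:start] + generate(name, rng) + template[end + 1:]
    return template


def sample_queries(seed, n=3):
    """Deterministic: the same seed always yields the same n queries."""
    rng = random.Random(seed)
    return [generate("query", rng) for _ in range(n)]
```

Varying the seed "shakes things up" exactly as described: each seed is a different, but fully reproducible, walk through the grammar.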

At present, we can easily produce a grammar that would generate various api calls. Execution and validation are another story. To elaborate, we could quickly produce a grammar that could generate text like:
create_instance(authorized_user, good_user_pw)
create_instance(unauthorized_user, good_user_pw)
delete_instance(instance_id)

Executing those against glance/nova/swift could take some additional work...either through hacking on the randgen itself or having some other tool do something with the generated calls. The tool is currently designed to execute against a database...in the randgen itself, it is the Executor modules (lib/GenTest/Executor) that we'd likely need to play with.
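One "other tool" option for the execution gap is a small dispatcher that parses the generated call text and routes it to handler functions. The sketch below is purely hypothetical (the real randgen Executor modules are Perl, and these handler names and credentials are made up), but it shows the shape of the idea:

```python
import re

# Hypothetical handler: pretend only the authorized user with the good
# password gets a 200; everyone else gets a 403. Real handlers would
# call glance/nova/swift clients instead.
def fake_create_instance(user, pw):
    return 200 if (user, pw) == ("authorized_user", "good_user_pw") else 403


HANDLERS = {"create_instance": fake_create_instance}

# Matches generated lines like "create_instance(user, password)".
CALL_RE = re.compile(r"(\w+)\(([^)]*)\)")


def execute_line(line):
    """Parse one generated call line and dispatch it to its handler."""
    m = CALL_RE.match(line.strip())
    if not m:
        raise ValueError("unparseable call: %r" % line)
    name, raw_args = m.groups()
    args = [a.strip() for a in raw_args.split(",")] if raw_args else []
    return HANDLERS[name](*args)
```

The grammar generates the text; a dispatcher like this turns each line into an actual API call and records the response.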

Validation is also a question: One of the tricks used in the database world is to run randomly generated tests against two systems - the database under test and a reference system (like running against both MySQL 5.1 and 5.5 for example...or the same version with different option settings, etc). Validation by hand is too time consuming and expensive, so having general guidelines for machine validation is the way to go.
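The reference-system trick reduces to a very small loop: feed the same generated inputs to both implementations and flag every input where they disagree. The two "systems" below are deliberately trivial toy functions (the bug is planted for illustration), but the comparison logic is the whole idea:

```python
# Differential validation sketch: run each generated input against the
# system under test and a reference implementation; any disagreement is
# a candidate bug, with no hand-written expected values needed.

def reference_parse_int(s):
    """Reference behavior: parse an int, or return None on bad input."""
    try:
        return int(s)
    except ValueError:
        return None


def system_under_test(s):
    """Hypothetical implementation with a planted bug: it maps the
    empty string to 0 instead of rejecting it."""
    if s == "":
        return 0
    try:
        return int(s)
    except ValueError:
        return None


def find_disagreements(inputs):
    """Return every input on which the two systems disagree."""
    return [s for s in inputs if system_under_test(s) != reference_parse_int(s)]
```

With a grammar churning out thousands of inputs, only the disagreements ever need a human's attention.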


The simple fact is that the more "negative" tests we add to Tempest's
test suite, the longer Tempest takes to run, and we are getting
diminishing returns: the benefit of each added negative test shrinks
relative to the additional run time. I think having a separate fuzz
testing suite that uses a grammar-based approach will likely give us
better negative API test coverage without having to write individual
test methods for every conceivable variation of a bad request to an API.

++ to this. In the database world, Microsoft's SQL server team threw in the towel on manual testing as a base - it is too expensive to generate and validate (and maintain) such tests...building a bank of combinations that can be automatically executed and validated is the way to go when we have so much ground to cover.

Looking forward to chatting with anyone who is interested in this. I'll be wrapping up some tasks this week and will be able to dig into this seriously next week. In the meantime, anyone can ping me on IRC (pcrews) or by email if they'd like to discuss / explore this further.

Thanks for bringing this up, Jay!

Cheers,
patrick

Best,
-jay

[1] https://launchpad.net/randgen




