QA Agile

Mirror of: http://blogs.msdn.com/jeffdw/

Our single biggest misstep in our test approach last version was that we failed to sufficiently change our testing methodologies to fit the Agile environment under which we developed our V1 product. As we had done in our previous years here at MS, we placed our emphasis on system-level, UI-based test automation, which had served us well in the past when working under more of a waterfall model.

So, what did our test approach look like heading into V1? Well, let me try to encapsulate a book-sized chunk of info into a succinct, semi-directed list:

  • Receive the initial feature specs
  • Review and provide feedback on the spec
  • Create a comprehensive test plan based on the spec. The focus of the test plan is at the system level, meaning that it is written more or less at the level of end-user actions
  • Determine the supporting test libraries, both UI and non-UI, to be authored
  • Author supporting libraries as you author automation. This is a labor-intensive process, particularly for the UI libraries, as they are not only complex but also fragile, being highly susceptible to product changes
  • Author test automation
  • Execute recurring test runs in a test management environment upon which our automation was entirely dependent
  • Repair automation as product changes occur
  • Expand and further abstract the supporting libraries to enable model-based testing tools to generate broad suites of tests to be used as static regression tests.

We found that, following this traditional, system-level testing approach, our automation was consistently about a sprint behind feature implementation. This gap led to many unintended consequences:

  • For about a sprint, there was little to no consistent QA automation backing up some features, meaning that serious breaks could go uncaught for far too long
  • QA and Dev were at different points in the feature development timeline, severely dampening critical QA/Dev person-to-person conversations. Simply, Devs were focusing on implementing new features while QA was still looking at the 'old stuff' in an effort to get automation in place
  • Changes made to the product meant that QA would, at times, spend time working on supporting libraries that were now out of date
  • QA constantly felt behind, putting a damper on the amount of time that many QA feature owners devoted to critical hands-on testing. Getting the automation in place and keeping it running was soaking up too much time.

As we moved on to V2 and looked back at the lessons learned, it was clear that we needed to make a number of bold steps:

  • QA and Dev owners of the same feature must work in a far more collaborative fashion and develop strong interpersonal relationships and a sense of joint ownership of their feature. This was accomplished by some owners in V1, but we needed it to be more broadly true in the organization.
  • QA cannot afford to focus all, or even the majority, of its testing efforts at a system level as can be done in a waterfall environment. Instead, much of QA's efforts must be at the subsystem level.
  • When a new feature comes online or is significantly altered, QA needs to get some high-pri tests automated immediately before moving on to broader testing.
  • The overhead of test automation must be made far lighter to free up time for additional QA hands-on testing early in a feature's life, and QA management must ensure that such testing is a highly valued priority. To lighten the overhead, we avoid UI-driven automation and instead lean on API testing and DTE testing.
  • To further strengthen Dev/QA synergy, all Dev unit tests and QA tests are run together on a nightly basis (a rough sketch of such a combined run follows this list).
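
To make that last point a bit more concrete, here is a minimal sketch of what a combined nightly run could look like. It assumes a pytest-style layout with hypothetical 'dev/unit_tests' and 'qa/tests' directories; our actual nightly run sits on internal build and test infrastructure, so treat this purely as an illustration of running both disciplines' tests in one pass.

    import sys
    import pytest

    # Hypothetical nightly driver: collect Dev unit tests and QA tests from their
    # respective trees and run them as a single suite with one shared report.
    if __name__ == "__main__":
        exit_code = pytest.main([
            "dev/unit_tests",                  # Dev-owned unit tests (hypothetical path)
            "qa/tests",                        # QA-owned tests, P0s included (hypothetical path)
            "-q",                              # keep nightly logs terse
            "--junitxml=nightly-results.xml",  # one report covering both disciplines
        ])
        sys.exit(exit_code)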

The P0 test:

The initial focus of test automation is what we call the 'P0' test. This is the quick-turnaround automation that ensures that QA and Dev are in synch, gets regression automation in place quickly, and seeks to push QA/Dev to stay in synch through the cycle. Here are the basic requirements for P0 tests:

  • They should be automated and in place within 1 day after a feature or feature change appears in the build. In order to make this happen, P0s must be few in number (this is by design) and Dev must give QA a heads-up on the changes being made before they are checked in.
  • P0 tests must be associated with the dev workitem(s) that they cover. This is recorded in the tests' metadata
  • P0 tests must have been agreed to by both Dev and QA before they are implemented. Both should agree that failure of the test represents a critical failure. If not, the scenario envisioned is likely a P1 or P2 test (I'll cover those in another posting)
  • P0 tests must not duplicate Dev unit tests but, instead, operate at a slightly more abstract level, what we call 'API integration testing'. What does this mean? Well, if you have methods A, B, and C, Dev should have unit tests for each; however, QA P0 tests will cover how A, B, and C interact. B may appear to work correctly and C may appear to work correctly, but if C is dependent upon upstream action from B and that integration point is faulty, QA should catch it (see the sketch just after this list).
  • All P0 tests are run on a nightly basis. Many P0 tests are also labeled as being 'check-in' tests and must be run by Dev before any code check-in takes place.
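
To illustrate the 'API integration' idea (and only that), here is a minimal pytest-style sketch in Python. The three functions stand in for product methods A, B, and C and are invented for this example; they are not our product's APIs.

    # Dev unit tests would cover parse(), normalize(), and render() individually.
    # The QA P0 test below covers the handoffs between them: each step can pass
    # its own unit tests and still break the chain.

    def parse(raw):
        # Stand-in for product method "A": raw string -> list of fields
        return [field.strip() for field in raw.split(",")]

    def normalize(fields):
        # Stand-in for product method "B": depends on the shape A produces
        return [field.lower() for field in fields]

    def render(fields):
        # Stand-in for product method "C": depends on the shape B produces
        return " | ".join(fields)

    def test_p0_parse_normalize_render_pipeline():
        # In our harness, this test's metadata would also record the dev
        # workitem(s) it covers, per the P0 requirements above.
        result = render(normalize(parse("Alpha , BETA,Gamma")))
        assert result == "alpha | beta | gamma"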

This is a 100,000-foot view of something that we've spent a lot of time on and, rereading the post, I can see that it may be confusing in many ways on its own. As I follow up with more info on other testing that we are doing, though, I hope that the place of P0 testing will be put into proper context.

Of course, you're going to need to expand both the breadth and depth of scenarios that you hit in the P0 tests. There are a number of ways that QA can go about this. Here are a couple of the approaches we've taken before:

The classic approach is to write up a detailed test plan, including all of the test cases you plan to automate. Once you have a few tests in place, you continually clone them and alter them somewhat for each new scenario. By the time you're done you have hundreds, if not thousands, of nearly identical tests with minor differences to cover the various scenarios. The drawbacks of this approach are many, and most of them come down to insufficient abstraction in your test automation:

  • If the feature design changes in such a way as to make the tests invalid, you must go in and repair hundreds, if not thousands, of tests. "Clone and alter" is very fragile
  • The logical 'input data' for the test is hard-coded into the test itself. This not only makes the test inflexible, but also makes it extremely difficult to review what is and is not being covered by your tests. Merging your data and your logic is bad engineering, but QA does it all the time (the short sketch after this list shows the pattern).
  • Nobody outside of QA is really able to improve the test suite, since Devs are unlikely to go about cloning and altering tests that a QA owner has written.
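
Here is the pattern in miniature, with a made-up apply_discount API standing in for the product code; imagine hundreds of these copies rather than three.

    def apply_discount(price_cents, customer_type):
        # Stand-in for the product API under test (discount as a whole percent).
        percent_off = {"regular": 0, "member": 10, "vip": 20}[customer_type]
        return price_cents * (100 - percent_off) // 100

    # "Clone and alter": each scenario is a near-identical copy with its input
    # and expected value hard-coded inline. Any design change means editing
    # every copy, and nobody can see the coverage without reading every test.
    def test_discount_regular_customer():
        assert apply_discount(price_cents=10000, customer_type="regular") == 10000

    def test_discount_member_customer():
        assert apply_discount(price_cents=10000, customer_type="member") == 9000

    def test_discount_vip_customer():
        assert apply_discount(price_cents=10000, customer_type="vip") == 8000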

The second approach is to utilize model-based testing (MBT) to maximize abstraction and code reuse as you generate a broad set of product tests. We've tried three different MBT tools over the last few product cycles I've been in. We've had some success, and there's no doubt that the approach forces greater test abstraction; nevertheless, we've moved away from MBT for a few key reasons:
  • Individual tests become almost indecipherable to those who did not work on them
  • In order to avoid 'model explosion' (your model spits out tens of thousands of tests), you have to choke the model to the point that you lose key benefits of MBT
  • Test data is somewhat abstracted from individual tests, but the data sets are usually not easily accessible and often not very granular.
  • Using MBT you REALLY can't clearly explain to Dev what the purpose of each test is.
  • Getting a model going is very time-consuming. The approach is simply not at all agile.

OK, with all of this said (or, perhaps, resaid, as I mentioned several of these points in earlier posts), what approach are we taking to meet our key goals of agility, data separation, code reuse, and transparency to other disciplines? Well, almost all of our P1 tests are data-driven, both in terms of input data and verification.

Some of you may have used data-driven testing through nUnit or VSTT, in which you associate a test with a data source for its input data. That data source may be a table in SQL Server, an Excel spreadsheet, or, perhaps, an .mdb. We're taking this approach a step or two further and, so far, we're seeing better results across QA than we've seen in previous cycles, and much better feedback from Dev, as that discipline now feels it understands what it is that we're testing.
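
For those who haven't used it, the basic shape looks something like the following sketch. It's written in Python with pytest and sqlite3 purely as stand-ins for nUnit/VSTT and the real data sources; the table, column, and function names are all hypothetical.

    import sqlite3
    import pytest

    DB_PATH = "testdata.db"  # hypothetical data source; SQL Server / Excel / .mdb in practice

    def _ensure_sample_rows():
        # Seed a tiny input table so the example is self-contained.
        with sqlite3.connect(DB_PATH) as conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS tokenizer_inputs "
                "(input_text TEXT, expected_tokens INTEGER)"
            )
            if conn.execute("SELECT COUNT(*) FROM tokenizer_inputs").fetchone()[0] == 0:
                conn.executemany(
                    "INSERT INTO tokenizer_inputs VALUES (?, ?)",
                    [("one two three", 3), ("", 0), ("  spaced   out ", 2)],
                )

    def _load_rows():
        # The test is associated with a data source rather than hard-coded data.
        _ensure_sample_rows()
        with sqlite3.connect(DB_PATH) as conn:
            return conn.execute(
                "SELECT input_text, expected_tokens FROM tokenizer_inputs"
            ).fetchall()

    @pytest.mark.parametrize("input_text, expected_tokens", _load_rows())
    def test_tokenizer_handles_each_input(input_text, expected_tokens):
        # One logical test, executed once per row pulled from the data source.
        assert len(input_text.split()) == expected_tokens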

Our QA team has created and continues to expand and maintain a single SQL Server 2005 database dedicated to our team's data-driven testing. Within this database we have created a variety of schemas, one per feature area and one schema for shared data. Tables within the database are of two basic types, input data and verification data.

Input data tables contain all data and associated metadata (I know this sounds cryptic, but in subsequent posts I'll give specific details on some of the tables that we use in our team). Our QA feature owners are expected not only to cover all the scenarios they can with data in their input tables, but also to ensure that Dev and PM have the chance to review those data and are encouraged to add data for the tests as they identify it. This focus on input data, rather than on imperative test logic, means that Dev, PM, and other QA team members who don't own a given feature can contribute to the test coverage of that feature simply by knowing the breadth of input data that the feature should support.

Verification tables all have a foreign key relationship to an input data table. The data contained in a verification table may be pure baselines (a bad idea), or an abstraction such as RegEx, an XML representation of expected object state, or whatever. This means that a single input data table may have multiple associated Verification tables (in other words, there is a 1…n relationship between input tables and verification tables), each of which may verify some of the same data in different ways for different test cases. For most test cases, the input test data used in the test is limited to those data having associated verification.
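
To make the shape concrete, here is a deliberately tiny sketch of one input table plus one associated verification table, again in Python with sqlite3 standing in for SQL Server 2005 and a trivial format_text() standing in for the product; every schema, table, and column name here is illustrative rather than our real design.

    import re
    import sqlite3

    # In-memory stand-in for the team database: one input table plus one
    # verification table keyed back to it (one of the 1..n associated tables).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE formatter_inputs (
            input_id    INTEGER PRIMARY KEY,
            raw_text    TEXT NOT NULL,
            description TEXT                 -- metadata: why this row exists
        );
        CREATE TABLE formatter_regex_verification (
            input_id    INTEGER NOT NULL REFERENCES formatter_inputs(input_id),
            pattern     TEXT NOT NULL        -- expected result, abstracted as a RegEx
        );
        INSERT INTO formatter_inputs VALUES
            (1, 'hello world', 'simple two-word case'),
            (2, 'MiXeD CaSe',  'case-normalization case');
        INSERT INTO formatter_regex_verification VALUES
            (1, '^HELLO WORLD$'),
            (2, '^MIXED CASE$');
    """)

    def format_text(raw):
        # Stand-in for the product code under test.
        return raw.upper()

    def test_formatter_against_verification_rows():
        # Only input rows with associated verification are exercised, matching
        # the convention described above.
        rows = conn.execute("""
            SELECT i.raw_text, v.pattern
            FROM formatter_inputs i
            JOIN formatter_regex_verification v ON v.input_id = i.input_id
        """).fetchall()
        for raw_text, pattern in rows:
            assert re.match(pattern, format_text(raw_text))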
