We demystified the automation puzzle. Relentless validation tamed!

A large global provider of BI solutions has a product suite that runs on five platforms and supports thirteen languages, with each platform configuration requiring multiple machines to deliver the BI solution. The entire multi-platform suite is released on a single CD multiple times a year.

The problem that stumped them was “how to automate the final-install validation of a multi-platform, distributed product”. They had automated the testing of the individual components using SilkTest, but were challenged by “how to unify this and run it from a central console on the various platforms at the same time”.

Considering that each platform combination took about a day, this required approximately two months of final installation-build validation, and by the time they were done with one release, the next release was waiting! This was a relentless exercise that consumed significant QA bandwidth and time, and left the team no room for more interesting or important work.

Senior management wanted single-push-button automation: identify which platform combination to schedule next, allocate a machine automatically from the server farm, install and configure it automatically, fire the appropriate Silk scripts and monitor progress, thereby significantly reducing time and cost by lowering the QA bandwidth involved. After deep analysis, the in-house QA team decided this was a fairly complex automation puzzle that required a specialist! This is when we were brought in.

After an intense deep-dive lasting about four weeks, we came up with a custom master-slave test infrastructure architecture that allowed a central console to schedule jobs onto the slaves, using a custom-developed control and monitoring protocol. The solution was built with Java Swing, Perl, Expect and adapters to handle the Silk scripts. Some parts of the solution ran on Windows and some on UNIX. This custom infrastructure allowed for scheduling parallel test runs, automatically allocating machines from a server farm, installing the appropriate components on the appropriate machines, configuring them and, finally, monitoring the progress of validation through a web console.
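
To make the shape of this architecture concrete, here is a minimal, purely illustrative sketch in Java of the scheduling idea: a central console queues platform jobs and hands each one to a free machine from the farm, which installs, configures and fires the Silk scripts. The names (MasterConsole, SlaveAgent, TestJob) and the logic are our own inventions for illustration, not the actual classes or protocol of the delivered solution.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch only: names and behaviour are hypothetical,
// not taken from the actual implementation.
public class MasterConsole {

    // A platform/component combination to be validated on a slave machine.
    record TestJob(String platform, String component, String silkScript) {}

    // A machine from the server farm that can accept one job at a time.
    record SlaveAgent(String hostname) {
        void installAndConfigure(TestJob job) {
            // In the real system this went over a custom control and
            // monitoring protocol; here we just print what would happen.
            System.out.printf("[%s] installing %s for %s%n",
                    hostname, job.component(), job.platform());
        }
        void runSilkScript(TestJob job) {
            System.out.printf("[%s] firing Silk script %s%n",
                    hostname, job.silkScript());
        }
    }

    private final Queue<TestJob> pending = new ArrayDeque<>();
    private final Queue<SlaveAgent> freeMachines = new ArrayDeque<>();

    void submit(TestJob job)        { pending.add(job); }
    void register(SlaveAgent agent) { freeMachines.add(agent); }

    // Core scheduling loop: allocate a free machine, install, configure, run.
    void schedule() {
        while (!pending.isEmpty() && !freeMachines.isEmpty()) {
            TestJob job = pending.poll();
            SlaveAgent agent = freeMachines.poll();
            agent.installAndConfigure(job);
            agent.runSilkScript(job);
            // Progress would be reported back to the web console here.
        }
    }

    public static void main(String[] args) {
        MasterConsole console = new MasterConsole();
        console.register(new SlaveAgent("farm-node-01"));
        console.submit(new TestJob("Solaris", "report-server", "install_check.t"));
        console.schedule();
    }
}
```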

This test infrastructure significantly reduced the multi-platform configuration validation effort, from eight weeks to three weeks. We enjoyed this work simply because it was indeed boutique work, fraught with quite a few challenges. We believe this was possible because we analyzed the problem wearing a development hat rather than a functional test automation hat.

Re-architecting test assets increases test coverage by 250%

This company is an innovative online banking solution provider with three major products catering to over 100 of the world's top financial institutions (FIs), including the top five. They have a very successful, rapidly growing product line, with major releases approximately every year incorporating new features to cater to the various needs of the marketplace. As the code base evolved, the test assets were also modified to reflect the changed product. The challenge was that most of the test cases were passing and the rate of uncovering new defects was low.

The product had grown huge, and the company decided to re-architect it to enable rapid feature addition with low risk. That is when the company also decided to take a fresh look at its test assets and re-architect them to increase test coverage, improve defect-finding ability and ensure that the test assets were future-proof. It had about 8,000 test cases at the time.

We were chartered to analyze the existing test cases for completeness and modifiability, re-architect them after filling the gaps, and ensure that future test cases would be easily pluggable. Applying STEM, we performed a thorough assessment of the existing test assets and discovered holes in them. Using the STEM Test Case Architecture (STEM-TCA), we re-engineered the test cases by first grouping them by feature, then by test level, then segregating them into various test types, and finally separating positive and negative test cases. While fitting the existing test cases into STEM-TCA, we uncovered quite a few holes; STAG filled these by designing an additional 5,000 test cases. Not only did STEM-TCA increase test coverage by uncovering the missing test cases, it also provided sharper visibility of quality because the test cases were well organized by specific defect types. This improved test coverage by about 250%, and the technical management staff were confident about the adequacy of the test assets and convinced of their future upgradeability and maintainability.
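
As a rough illustration of the kind of organization STEM-TCA imposes, the sketch below (in Java, our choice) models each test case as carrying its feature, level, type and positive/negative flag, and groups a suite along that hierarchy so that gaps become visible. The enum values and the FundsTransfer example are invented for illustration and are not the actual STEM-TCA taxonomy.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: a minimal model of the STEM-TCA style of organisation
// (feature -> test level -> test type -> positive/negative).
public class StemTcaSketch {

    enum Level    { FEATURE, INTEGRATION, SYSTEM }          // invented levels
    enum TestType { FUNCTIONAL, BOUNDARY, ERROR_HANDLING }  // invented types

    record TestCase(String id, String feature, Level level,
                    TestType type, boolean positive) {}

    public static void main(String[] args) {
        List<TestCase> suite = List.of(
            new TestCase("TC-001", "FundsTransfer", Level.SYSTEM,
                         TestType.FUNCTIONAL, true),
            new TestCase("TC-002", "FundsTransfer", Level.SYSTEM,
                         TestType.ERROR_HANDLING, false));

        // Grouping by feature and level makes holes visible, e.g. a feature
        // with no negative error-handling cases at the system level.
        Map<String, Map<Level, List<TestCase>>> byFeatureAndLevel =
            suite.stream().collect(Collectors.groupingBy(TestCase::feature,
                                   Collectors.groupingBy(TestCase::level)));
        System.out.println(byFeatureAndLevel);
    }
}
```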

10x reduction in post-release defects – Finest experience of applying STEM

This is about one of the finest experiences that STAG had, with a global chip major, where an early implementation of STEM yielded significant results. Our engagement was to set up an effective validation practice for the porting-level API of their video decoders. The customer had a great technical team involved in both development and QA. The challenge they faced with their complex product, which involved hardware, software and later system integration on multiple real-time OSes across various platforms, was a high rate of defect escapes, i.e. post-release field defects.

We spent about a month understanding their domain and the associated technologies. After this, a detailed analysis yielded interesting data: test cases were primarily conformance-oriented, test coverage was suspect, escaped defects seemed to propagate from the early stages and, finally, the validation process was loose.

Having understood the types of defects being found, including the post-release defects, we identified the probable defect types and the combinatorial aspects that needed to be considered to form a test case. We then staged the validation into three major levels: the first at the API level, the next at the system level, and the last a customer-centric level that involved using reference applications.

Applying the STEM approach to test design, we developed the test cases, yielding about 6,000 test cases at level one and about 800 at the subsequent levels. Whereas the earlier ratio of positive to negative test cases was skewed towards the positive side, after our re-design the ratio shifted to 60:40 at the lower level and about 85:15 at the higher levels. Moreover, the number of test cases increased by a factor of ten (1000%), casting a larger and deeper net to catch many more serious defects. Over the next nine months, the rate and number of defects detected increased dramatically, and post-release issues fell by a jaw-dropping 10x.

Once we had solved the test effectiveness problem and increased the yield of defects, the focus shifted to streamlining the process by setting up proper gating in the test process, creating a centralized web-based test repository and, finally, setting up a strong defect analysis system based on the Orthogonal Defect Classification (ODC) method. This enabled a strong feedback system, shifting defect finding to earlier stages of the SDLC and thereby lowering cycle time. Complementing this, we set up a custom tooling framework for automating this non-UI software, resulting in a significant cycle-time reduction: an entire cycle of tests on a platform took less than 15 hours!
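
For readers unfamiliar with ODC, the sketch below shows, in Java, the kind of record such a defect-analysis system might keep and the simple trend queries it enables. The attribute set here is a simplified subset inspired by ODC and the example defects are invented; none of it reflects the customer's actual schema or data.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch of an ODC-style defect record used for trend analysis.
public class OdcAnalysisSketch {

    // A simplified, hypothetical subset of ODC attributes.
    record Defect(String id, String activity, String trigger,
                  String defectType, String phaseFound) {}

    public static void main(String[] args) {
        List<Defect> defects = List.of(
            new Defect("D-101", "API test", "Boundary value",
                       "Assignment", "System test"),
            new Defect("D-102", "System test", "Stress/Workload",
                       "Timing", "Post-release"));

        // Counting defects per trigger is the kind of analysis that feeds
        // the feedback loop shifting detection earlier in the SDLC.
        Map<String, Long> byTrigger = defects.stream()
            .collect(Collectors.groupingBy(Defect::trigger,
                                           Collectors.counting()));
        System.out.println(byTrigger);
    }
}
```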

This has been one of our finest experiences with STEM and a clear win for its implementation. It was only possible because of the customer's very mature engineering management, who were focused on systemic improvement and had set themselves systematic improvement goals.

It is worth noting that our test team did NOT have significant depth of experience in this particular product domain. Applying STEM at a personal level, the team was able to work out what was necessary and sufficient for effective validation and complemented the customer's strong technical team with mature defect-oriented thinking. This was an early case study that helped us establish that a STEM-based approach provides the right thinking skills for defect finding, rather than having to resort to a domain-centric approach.

“The quality race” – STEM wins

A large German major with a mature QA practice is seeking new ways to improve its test practice. It all starts with a talk on “The Science & Engineering of Effective Testing” delivered to the company's senior management and test practitioners. We are amazed at the interest in the topic (75+ people attend the talk) and the enthusiastic response – we are deeply humbled.

A few weeks later, the management decides to experiment with a STEM-based approach to testing. They identify about twenty-five people (a small subset of their QA team) to be trained in the new way of testing. We are delighted and conduct a 5-day workshop with an intense application orientation to enable them to understand STEM. The company then decides to conduct a bold experiment: a pilot to evaluate the STEM-powered approach to testing vis-à-vis their own way of testing. They identify a product that has been in use for a few years by consumers across the world. They set up two identical five-member teams with a similar mix of experience levels, each given one month to evaluate the new release of this product. The two teams are kept apart to ensure a controlled experiment, and the countdown starts. We wait with bated breath…

The month is slow for us, but it flies for the two teams. An enormous amount of data has been generated, and the management analyses it thoroughly to spot the winner. A month later we are called in by senior management. We are sweating: have we won? A few minutes later, it is clear that STEM is the winner. The STEM-powered team has designed 3x the test cases of the non-STEM team and uncovered 2x the defects! The icing on the cake is that a couple of the defects uncovered by the STEM-powered team are “residual defects”, i.e. they have been latent in the product for over a year (remember the Minesweeper game on Windows?), and one of them corrupts the entire data in the database. The discussion then steers to effort/time analysis: does applying STEM require more effort or time? The evidence is conclusive that the difference is not significant, implying STEM has enabled the team to think better, not work longer.

What enabled the STEM-powered team to win the “race of quality”? The answers come from the team itself, and we are delighted, as we have believed in these ideas and have seen results whenever we implement them. The top three reasons: (1) the notion of Potential Defect Types (PDT) is powerful, as it forces the team to hypothesize what can go wrong and enables them to set up a purposeful quality goal; (2) PDT forces a thorough understanding of customer expectations and the intended behavior of the product; (3) PDT ensures that test design creates adequate test cases, thus eliminating defect escapes and paving the way for robust software.

The STAG team is delighted as the customer acknowledges the effectiveness of the STEM-based approach. The team is convinced that the STEM-powered approach is a winner and is raring to run the marathon, with the customer cheering them on!

A heartfelt “Thank you” to the STEM-powered team and the innovation-centered senior management of the company.

Large scale migration of automation suite – “Scaling the peak”

The customer in focus provides data integration software and services that empower organizations to access, integrate and trust all their information assets, giving them a competitive advantage in today’s global information economy. As an independent data integration leader, the customer has a proven track record of helping the world’s leading companies leverage their information assets to grow revenues, improve profitability and increase customer loyalty.

The customer had a large base of automated test scripts (1,300) in Rational Visual Test (VT) and a single Visual Test license on a dedicated machine – a risky affair, considering that Visual Test is no longer supported and relies on an older version of Windows. They decided to de-risk this, support the newer versions of their products and remove the limitations of the existing test script suite. Some of those limitations were: (1) manual intervention was needed to run the scripts as a suite against any newly released build; (2) there was no longer tool support for Visual Test; (3) support for other flavors of the Windows operating system and for internationalization (i18n) was limited; (4) the framework was hard to extend for new functional changes; and (5) it was difficult to build competency on Visual Test. Management decided to migrate these VT scripts to the Borland SilkTest suite.

Now came the challenges we had to solve. The test suite was large, with only the scripts available and no documentation on the test cases. The customer was keen that the performance of the new SilkTest suite be considerably better. Technically, the suite was intensely data-driven, with a huge data set driving it. The automation run was l-o-n-g, ranging from 24 to 36 hours, necessitating a robust recovery mechanism to ensure uninterrupted runs with minimal baby-sitting. Since this was a new investment, management wanted the automation framework to be flexible enough that new test cases could be added to the suite quickly. Finally, the suite had to cater to the product's different language packs.

Phew – it was a real challenge “scaling the peak”, and we had our share of bad weather, storms and landslides, but heck, we made it! As always, the journey was arduous, but the rush of adrenaline after reaching the peak was great. We must confess that the journey would not have been possible without the wholehearted support and cooperation of the customer – thank you.

Now for the details of the climb! We had to analyze the large VT suite to understand its structure, flow, data inter-relationships and finer nuances. Remember that we had only the scripts and the lone VT machine to learn from! The key learning points and action items were as follows (the first three are illustrated in the sketch after the list):

1. Build an effective data-driven mechanism, providing the flexibility to add and maintain test data in external SQL tables.

2. Implement a robust recovery mechanism to enable the scripts to run uninterrupted for long durations, upwards of 24 hours.

3. Support internationalization, enabling testing of the English and Japanese language packs via external language property files.

4. Enable new test cases to be added to the framework, to support new features and application changes with 50% less effort.
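
The sketch below, promised above, illustrates in Java (our choice for readability; the actual suite was written in SilkTest's 4Test language) how the first three action items fit together: test data pulled from an external SQL table, UI labels resolved through an external language property file, and a per-case recovery wrapper so that one failure cannot abort a long unattended run. The JDBC URL, table and column names, and the property-file naming are all hypothetical.

```java
import java.io.FileInputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

// Illustrative sketch only: assumes a JDBC driver on the classpath and a
// pre-populated test_data table; none of the names come from the real suite.
public class DataDrivenRunnerSketch {

    public static void main(String[] args) throws Exception {
        // (3) i18n: UI labels come from an external language property file,
        // so the same script can drive the English or the Japanese pack.
        Properties labels = new Properties();
        labels.load(new FileInputStream("labels_" + args[0] + ".properties"));

        // (1) Data-driven: test data lives in an external SQL table rather
        // than being hard-coded into the scripts.
        try (Connection con = DriverManager.getConnection("jdbc:h2:mem:testdata");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT test_id, input_value, expected FROM test_data")) {

            while (rs.next()) {
                String testId = rs.getString("test_id");
                try {
                    runOneCase(labels, rs.getString("input_value"),
                               rs.getString("expected"));
                } catch (RuntimeException e) {
                    // (2) Recovery: log the failure and move on, so a single
                    // broken case cannot abort a 24-hour unattended run.
                    System.err.println(testId + " failed: " + e.getMessage());
                }
            }
        }
    }

    private static void runOneCase(Properties labels, String input,
                                   String expected) {
        // A real case would drive the application UI through the tool's API.
        System.out.printf("Pressing '%s' with input %s, expecting %s%n",
                labels.getProperty("submit.button", "Submit"), input, expected);
    }
}
```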

We approached the problem of scaling the peak by first going through a rigorous process of technical problem analysis, devising a library-based, data-driven framework, and then putting together a factory-driven approach to rapidly code the scripts. Once we had architected the custom framework, we laid down “good principles of development”, such as avoiding global variables, avoiding hard-coded information, setting the level and depth of documentation, following language coding conventions and, finally, defining object referencing and de-referencing strategies. The development process was iterative, with multiple milestones and clearly defined acceptance criteria.

A skeletal team of architects and specialists got cracking on the problem, making the first move to build a flexible and robust automation framework. They also commenced development of the common library components that would later be used by the larger extended team. Once the custom framework architecture was in place, coding standards were enforced and the framework and library components were coded. At this point a larger team was assembled, with each member assigned a set of scripts to convert from VT to SilkTest. The new SilkTest scripts were coded individually on developer machines, code-reviewed, then integrated on a test machine and tested by running them against the target application. We did encounter a stormy stretch of the climb, with myriad integration issues popping up; each was solved, and we continued to make good progress.

D-day came and we were delighted to hoist the flag on the peak! We had covered good ground, generating approximately 50,000 LOC, with about a fifth of that constituting framework-level component code. The cool air at the peak was refreshing and sweet! We had reduced the cycle time by approximately 80%, i.e. from FIVE days to just ONE day, were able to do long runs of 24 hours without issues, switched language packs with ease and were able to add a new set of 120 test cases quickly. In the subsequent two months of intense usage only about five issues were reported, and these were fixed.

This was a unique project in which we had to migrate automation code from one commercial tool to another under constraints of documentation and machine availability. It was indeed a pleasure to work with a demanding customer who worked closely with us to help us understand the product and the VT automation, and who also made available a dedicated integration machine alongside the lone VT server machine.

We have always enjoyed challenges, and we thank the customer for giving us the opportunity and reposing trust in us. This is the fun part of being a test boutique!

Can we guarantee the cleanliness of software?

A couple of weeks ago, I conducted a thought-leadership workshop on an absolutely new topic, “Hypothesis-based testing”. It was held in Bangalore at The Chancery Hotel and was well received. Subsequently, I gave a 90-minute lecture at Ford Chennai to a QA audience, and the feedback I received was: “All the participants thoroughly enjoyed this thought provoking talk to engineer clean software. It was more value adding and was different from the other normal monthly talks”.

So what is “Hypothesis-based testing” and how is it connected to STEM™ 2.0? Let us start from the fundamentals: “Can we guarantee that the released software is clean?” Most often the answer is an emphatic NO! The reason is that we do not wish to believe the final software will have no defects. Hey, wait a minute… is this what a guarantee means? Not really. As consumers, we expect the products we buy to be guaranteed. That is, we expect that they have been well validated and will give us a great experience when we use them. In the extreme case when a product does fail, we expect a fast turnaround via a responsive support system, resulting in a quick fix or replacement. Our customers expect the same from us as producers of software. Hence a guarantee means that we are extremely clear about the probability of failure and therefore about the risk. Once we are sure of this, we should be able to guarantee the cleanliness of the software.

What does it take to do this? If we can guarantee the process of evaluating the software, then we can guarantee the cleanliness of the delivered software. Now, what does that take? How can one guarantee the process of evaluation? Most often, we like to believe that the experience of the test staff is what makes this possible. But that is simply not enough. If we can scientifically analyze the process of evaluation, rationally justify the various activities we performed and their outcomes, and still find no anomalies, then we can be inclined to think that the process of evaluation is guaranteed to produce clean software. Hence a guarantee can be achieved if we can explain the means of achieving the outcome. For example, if we can scientifically explain our test strategy and scientifically argue for the completeness of our test cases, we can indeed guarantee cleanliness. The trouble is that the industry does not seem to have a rational, scientific method of thinking about this.

STEM™ 2.0 is a method that allows for scientific thinking and disciplined implementation to ensure that the delivered software can indeed be guaranteed to be clean. The core of STEM™ is establishing a clear goal and then performing activities that will actually get us to that goal. In STEM™, the goal translates to “uncover these potential defects”. The potential defects are hypothesized, and all later activities are about proving whether the hypothesized defects do or do not exist.

Hypothesis-based testing is therefore an approach in which the key tenet is to hypothesize that certain potential defects may exist and then prove or disprove their existence. Hypothesis-based testing is powered by STEM™ 2.0; I hope you see the connection now!
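
As a toy illustration of that tenet, the sketch below (in Java; everything here, including the defect hypotheses and the stand-in parser, is invented by us and is not part of STEM™) expresses each potential defect as a hypothesis paired with a check that tries to prove it; a check that finds nothing disproves the hypothesis.

```java
import java.util.List;
import java.util.function.Supplier;

// Illustrative only: a toy model of hypothesis-based testing.
public class HypothesisSketch {

    // A hypothesis names a potential defect and a check that tries to prove
    // it exists; if the check finds nothing, the hypothesis is disproved.
    record DefectHypothesis(String potentialDefect, Supplier<Boolean> defectFound) {}

    public static void main(String[] args) {
        List<DefectHypothesis> hypotheses = List.of(
            new DefectHypothesis("Empty input yields no result instead of an error",
                    () -> parse("") == null),
            new DefectHypothesis("Leading zeros are mis-parsed",
                    () -> parse("007") != 7));

        for (DefectHypothesis h : hypotheses) {
            System.out.println(h.potentialDefect() + " -> "
                    + (h.defectFound().get() ? "DEFECT CONFIRMED" : "disproved"));
        }
    }

    // Stand-in for the unit under test.
    private static Integer parse(String s) {
        try { return Integer.valueOf(s); }
        catch (NumberFormatException e) { return null; }
    }
}
```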

Think about it: when you go to the doctor with an ailment, he or she hypothesizes potential problems based on your symptoms, performs diagnostic tests to confirm the hypothesis and then prescribes a treatment regimen. When you go on holiday, you hypothesize your needs and pack accordingly. Whenever you are goal-focused, you hypothesize the future situation(s) and then come up with a list of activities. Hypothesis-based testing is the application of this thinking to uncovering defects in software. Hypothesis-based testing is not a common term; it was coined at STAG! And it is powered by STEM™ 2.0.