Software Testing: 2009

Saturday, December 5, 2009

Why should we test our software?

This article is not only for testers but also for managers interested in what software testing could bring them and perhaps does not bring now. I therefore start with definitions of two terms that I use often here:

Quality Assurance - planned and systematic processes designed to ensure the suitability of the product for its intended purpose.

Testing - the process of collecting and sorting information obtained through the examination of the product; it is a part of the more general quality assurance.

If we feel that testing and quality assurance are not given enough attention, and that this side of software development is constantly underestimated, perhaps we have not been able to convince others of the importance of testing and of managing it expertly.

So what are the benefits of software testing and quality assurance? Which points should we mention when presenting them?

The quality assurance process affects three areas that matter to management:
- Marketing
- Risk management
- Reducing costs

Marketing

It is a simple logical conclusion that a good product delivered to a customer improves a company's reputation, while highly unreliable software destroys it. It is harder to predict, or even to recognize, how strong this influence is. A poor product damages reputation in waves: some are immediate and fade quickly, others hurt the business for several years.

The first wave is the response of the direct customer, who decides about further cooperation based on how satisfied he is.

The second wave is the reaction of end customers. If they receive the product badly, they can cause a number of difficulties. Their constant complaints and bug reports can make the project unexpectedly expensive, so that an initially profitable project becomes unprofitable. Furthermore, their negative reception of the product may change how management views the supplier, or may reach the general public through newspapers and television news.

The third wave is the reaction of current employees and of other people in the software development field. The reputation of a software company spreads through constant staff turnover and persists for many years. Competent employees leave firms that cannot deliver software of the required quality, because they do not want to be associated with the next failure or because nothing there motivates them to improve. Where there are no objective quality criteria, there is always a lack of effort to improve. For the same reasons, good potential employees avoid the firm: they have heard negative things about it from former or current employees.

The fourth wave, which reaches a software company with the longest delay, is the reluctance of other customers and other software firms to cooperate with it. A lack of effort to ensure quality, and poor processes, cannot be concealed from employees. If these employees are convinced that they themselves would never give their firm a software development contract, they will be just as unwilling to do so years later, when they make such decisions as managers and directors. The same goes for anyone who has learned about the poorly designed processes from colleagues and friends.

Through these four waves, each project influences the company's business, positively or negatively, according to its quality and impact.

Risk management

Testing is a tool for obtaining objective information about the state of the software under development. This information is the most important input for risk management in software development. Testing runs through the entire development process. From the very beginning, it checks that development complies with the customer's needs and that outputs are clear and logical. It prevents misunderstandings and wasted resources by reporting shortcomings and errors in time. Quality assurance, moreover, defines procedures and monitors development with respect to simplicity, clarity, accuracy, precision, speed and other quality standards. As a result, quality problems become significantly less likely during and after development, and their impact is limited when they do occur.

Testing significantly reduces the risk of harmful bugs. A single harmful bug in a financial, medical or other critical sector can cause a loss that exceeds the entire development budget of the software.

Without testing, there are only two unpleasant ways to reduce the risk associated with software development:
- Find a subcontractor that takes on part of the responsibility
- If possible, insure against certain types of problems

Reducing costs

Although sacrificing quality to reduce costs is one of the most common management mistakes, it is precisely effective quality assurance that brings the greatest savings. From the economic point of view, testing should continue as long as the estimated average cost of finding and correcting a bug in the next test cycle is less than the average cost of a bug discovered by the customer multiplied by the probability that the customer discovers it. Put simply, testing should continue as long as it is financially more profitable than not testing. Quality assurance is an investment, so it is useful to monitor its return.
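
The stopping rule can be sketched in a few lines of Python. This is only an illustration: the function name, the parameter names and all the numbers are assumptions made up for the example, not figures from a real project.

```python
def should_continue_testing(cost_find_and_fix_in_test: float,
                            cost_when_found_by_customer: float,
                            probability_customer_finds_it: float) -> bool:
    """Return True while another test cycle is expected to pay off."""
    # Expected cost of leaving the bug for the customer to find.
    expected_customer_cost = (cost_when_found_by_customer
                              * probability_customer_finds_it)
    return cost_find_and_fix_in_test < expected_customer_cost


# Hypothetical numbers, chosen only for illustration.
# Early cycle: bugs are cheap to find (200 < 5000 * 0.30).
early_cycle = should_continue_testing(200.0, 5000.0, 0.30)
# Late cycle: each remaining bug is expensive to find (2000 > 5000 * 0.30).
late_cycle = should_continue_testing(2000.0, 5000.0, 0.30)

print("continue in early cycle:", early_cycle)
print("continue in late cycle:", late_cycle)
```

In practice all three inputs are estimates, which is why the return on testing has to be monitored over time rather than calculated once.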

The problem is that it is very difficult to run quality management with maximum efficiency and minimum cost. Such a task requires the experience, intuition and excellent knowledge of a professional. Therefore, if you assure quality, you need to do it well.

How to start?

Setting up an effective testing process is not a one-time task. It is created by constant tuning based on reasonably chosen metrics, which warn you when something is wrong and let you monitor whether efficiency is deteriorating or improving.

A good basis for any quality assurance department is to understand the key role of quality management in software development, to set standards and a quality control process, and to have excellent professionals with direct experience in testing.

Monday, October 5, 2009

Problems with code coverage

One important testing technique is to monitor how well the application is covered by tests. We can track how many use cases or customer requirements we have covered, but the most accurate measure, at least in terms of functionality, is code coverage. Code coverage records which code was run during the tests and which has not yet been tested.

The simplest, but inadequate and misleading, metric is statement coverage (also called command coverage).

It only checks whether each statement was executed during testing. To cover a condition, it is enough to execute it with any one of its outcomes.

Formally: The test set T satisfies the statement coverage criterion for a given code K if for every statement p belonging to K there is a test t in T such that p is executed during the execution of t.

It is not a problem for a tester or programmer to achieve very high (even 100%) statement coverage of a very buggy program without a single failed test and without a single bug being discovered.
Although this metric is misleading, it is unfortunately popular because of its simplicity.
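
A minimal sketch of how this can happen, written in Python. The function member_price and its discount rule are hypothetical and exist only for this illustration.

```python
def member_price(price_cents: int, is_member: bool) -> int:
    """Hypothetical example: members get 10 % off, others pay full price."""
    discount_percent = None
    if is_member:
        discount_percent = 10
    # Bug: discount_percent is still None when is_member is False.
    return price_cents * (100 - discount_percent) // 100


# This single passing test executes every statement of member_price,
# so statement coverage reports 100 %.
assert member_price(2000, True) == 1800

# The outcome of the condition that was never tested still crashes.
try:
    member_price(2000, False)
    crashed = False
except TypeError:
    crashed = True

print("non-member call crashed:", crashed)
```

One test is enough for full statement coverage here, yet every non-member call fails. A test set that also exercised the false outcome of the condition would have found the bug.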

A more advanced metric is decision coverage (also called branch coverage).

Formally: Consider a graph in which the statements are nodes and the transitions between them are edges. Consecutive statements are connected by an edge, and a condition is a node from which two edges lead, one for the true value and one for the false value. The test set T satisfies the decision coverage criterion for the code K if for each edge h of this graph there is a test t in T such that the run passes through h during the execution of t.

Simply put, in decision coverage each condition is a node with two edges leading from it, and the tests must pass through both.

For example:

We have code with three conditions, one of which is nested.

The decision coverage graph then looks like this:

Blue circles are conditions, green ones are statements.

Even here we can achieve full coverage without discovering many of the errors.

The next stage is condition coverage.

Formally: A set of tests T satisfies the condition coverage criterion for the code K if it satisfies the decision coverage criterion and, for each part p of each composite condition, there are tests t and u in T such that p is evaluated as true during the execution of t and as false during the execution of u.

If we dropped the requirement that condition coverage must also satisfy the decision coverage criterion, code could be covered by conditions without being covered by decisions. This happens, for example, when the condition as a whole evaluates to the same value for all possible inputs.

The improvement over decision coverage is that condition coverage looks at the outcome of each part of a composite condition, not only at whether the condition as a whole is true or false.

For example:
Take the composite condition if (a>0 || b>0).
These two tests are enough to reach decision coverage: {a = 5; b = -2}, {a = 0; b = 0}
They do not reach condition coverage, because b > 0 is false in both tests.
A different pair of tests is needed to reach condition coverage: {a = 5; b = 2}, {a = 0; b = 0}
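
The two test sets can be checked mechanically. The following Python sketch uses hypothetical helper names and, like the example, treats both parts of the condition as always evaluated; in a language with short-circuit evaluation, b > 0 would not even be evaluated when a > 0 is true.

```python
def observed_outcomes(tests):
    """Collect the values seen for the whole decision and for each part."""
    decision = {(a > 0 or b > 0) for a, b in tests}
    part_a = {a > 0 for a, b in tests}
    part_b = {b > 0 for a, b in tests}
    return decision, part_a, part_b


def has_decision_coverage(tests):
    """The decision as a whole was seen both true and false."""
    decision, _, _ = observed_outcomes(tests)
    return decision == {True, False}


def has_condition_coverage(tests):
    """Decision coverage plus each part seen both true and false."""
    return all(seen == {True, False} for seen in observed_outcomes(tests))


first_set = [(5, -2), (0, 0)]   # tests written as (a, b) pairs
second_set = [(5, 2), (0, 0)]

print("first set: ", has_decision_coverage(first_set),
      has_condition_coverage(first_set))
print("second set:", has_decision_coverage(second_set),
      has_condition_coverage(second_set))
```

The first set satisfies only decision coverage, while the second satisfies both criteria.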

With this criterion it is harder to find a bug that would not be revealed by full condition coverage, although certain types of bugs will remain undetected even in fully covered code.

The highest level of coverage is path coverage. It checks whether each of the possible paths through each function has been run, and therefore means very detailed testing.

Formally: A set of tests T satisfies the path coverage criterion for the code K if it satisfies the condition coverage criterion and, for each path C that links the input and output nodes in the graph of the code and contains at most n cycles, there is a test t in T such that the run follows path C during the execution of t.

Although not all bugs can be detected from the code alone, path coverage provides assurance that all possible runs have been tested.
The problem is that this coverage is impractical: the number of paths, and with it the number of tests, grows exponentially with the number of conditions.
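
To get a feeling for this growth, here is a small Python sketch. It assumes the simplest possible case, a chain of independent if statements one after another with no loops, where every combination of outcomes is a separate path. The function name is hypothetical.

```python
from itertools import product


def enumerate_paths(number_of_conditions: int):
    """List every path through a chain of independent if statements.

    Each path is a tuple holding the outcome of every condition.
    """
    return list(product((True, False), repeat=number_of_conditions))


for n in (1, 2, 3, 10):
    print(n, "conditions ->", len(enumerate_paths(n)), "paths")
```

Decision coverage of the same chain needs only two tests (all conditions true, all conditions false), while path coverage needs one test per path, so the number of tests doubles with every added condition.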

Try these exercises:

Question 1:
How many tests (runs of the code) do we need for the method code_coverage to be covered by statements; decisions; conditions; paths?

Question 2:
How many tests (runs of the code) do we need for the method code_coverage2 to be covered by statements; decisions; conditions; paths?

Monday, September 7, 2009

Murphy's laws. Why are they true?

A program is good when it is bug free - which is impossible.

There are people who do not believe in the existence of bug-free software, and people who think that testing ensures perfection. (There are other groups of people, of which the worst are those who do not care about quality. But let's leave that aside.)
Bug-free programs exist, but only trivial ones. For example, a program whose only job is to write "Hello World!" on the screen and end.
The more complicated a program is, the harder it is to check that it is free of errors.
Even a simple program of a few lines may have so many different paths through its code that a person could not test all of them even in ten years.
Therefore, with any non-trivial program, and especially with complex applications and systems, one must reckon with the fact that no matter how much we test, some bugs will always be left.

Undetectable errors are infinite in variety, in contrast to detectable errors, which by definition are limited.

This law is merely a consequence of Dijkstra's axiom that testing can prove the presence of errors but cannot prove their absence.

Every non-trivial program contains at least one bug.
Every non-trivial program can be simplified by at least one line of code.
The conclusion of the last two laws: Every non-trivial program can be simplified to one line of code, and it will contain a bug.

There is almost nothing to add. Perhaps only that it is a self-fulfilling prophecy. The more changes a person makes to the code, the greater the probability that he introduces a bug into it.

A working program is one that has only unobserved bugs.

This law should, in fact, read: software that no one complains about has only undiscovered bugs, and that is because nobody uses it.
That is how it is in reality. The aim of testing is to achieve a certain (preferably high) degree of quality, not to guarantee that the customer will never report an error.

It is enough when the customer feels good about the product because he is not bothered by bugs frequently, and when he does find one, it is nothing serious and it is quickly fixed.

The number of bugs always exceeds the number of lines found in a program.

This is a generalized observation. The law is often true because so many changes are made to a program during the development lifecycle that, although the final program has x lines, the programmers have written many times more.

The chances of a program doing what it's supposed to do is inversely proportional to the number of lines of code used to write it.

This is easy to understand. The more complicated a program is, the harder it is to understand it, keep it in mind, and avoid introducing a bug.

No matter how many resources you have, it is never enough.

A non-trivial program can never be tested completely, and we cannot determine how many bugs remain to be discovered. It is therefore possible to test indefinitely.
What matters is achieving the greatest effect with the available resources. This is the true art of competent managers. If discovering and repairing a bug costs more than letting an (angry) customer discover it, then it is time to stop testing.

A patch is a piece of software which replaces old bugs with new bugs.

Although a patch is a relatively small piece of software, it needs to be tested. Relying on everything running as intended without anyone actually testing it is a sure recipe for disaster.

Bugs will appear in one part of a working program when another 'unrelated' part is modified.

There are two reasons why a change in only one part of a program may suddenly cause bugs where there were none before.
The first reason is that the parts are connected with each other in some way, but nobody realized it.
For example, one of them calls a function that needlessly calls another function, and that other function was removed by the change.
The second reason is that, although the two parts are not related, they use the same resources. The change may thus expose a bug that has been in the software for some time. This happens with poorly handled shared memory or with a pointer that points to random data.
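
A small Python sketch of two seemingly unrelated parts that share a resource. All names and data are hypothetical. Part A silently depends on the order of a shared list, and a harmless-looking change in part B exposes that weakness.

```python
# Shared state used by two otherwise unrelated parts of a program.
recent_orders = [30, 10, 20]   # order amounts, newest last


def latest_order():
    """Part A: assumes the newest order is always the last element."""
    return recent_orders[-1]


def sorted_report_original():
    """Part B, original version: sorts a copy of the shared list."""
    return sorted(recent_orders)


def sorted_report_changed():
    """Part B after an 'optimisation': sorts the shared list in place."""
    recent_orders.sort()
    return recent_orders


before_report = latest_order()
sorted_report_original()
after_original_report = latest_order()   # unchanged, the copy was sorted
sorted_report_changed()
after_changed_report = latest_order()    # now the largest, not the newest

print(before_report, after_original_report, after_changed_report)
```

Nobody touched latest_order, yet it starts returning wrong results as soon as the report is changed. A regression test of part A after the change to part B would reveal the problem.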

The subtlest bugs cause the greatest damage and problems.

Small and inconspicuous errors have either little or no effect (for example, a logo is shifted from its specified position) or very serious consequences (in one case in a hundred an amount is rounded incorrectly, which may result in a loss several times greater than the price of the software). If an error is obvious, the user notices it immediately, and even if its consequences are serious, it is possible to remedy the situation and fix the bug right away. But if an error escapes attention, its effects accumulate until the avalanche breaks loose and causes very serious problems.
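
One well-known way such a rounding error can arise is binary floating point arithmetic. The Python sketch below is only an illustration and is not taken from any particular project. The value 2.675 cannot be stored exactly as a float, so the built-in round gives a result that surprises most people, while the decimal module rounds the exact decimal value.

```python
from decimal import Decimal, ROUND_HALF_UP

# The float closest to 2.675 is slightly smaller than 2.675,
# so rounding to two places goes down instead of up.
rounded_float = round(2.675, 2)

# Decimal keeps the value exactly as written and rounds half up.
rounded_decimal = Decimal("2.675").quantize(Decimal("0.01"),
                                            rounding=ROUND_HALF_UP)

print("float:  ", rounded_float)
print("decimal:", rounded_decimal)
```

Most amounts round the same way with both approaches, which is exactly why a bug of this kind can stay unnoticed for a long time.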

Software bugs are impossible to detect by anybody except the end user.

Although testers reveal a large number of different bugs, some bugs are difficult to detect for anyone other than the end user. Even if the tester tries to look at the software from a user's perspective, he is not an average user and lacks knowledge of user behaviour.
The only way to learn how users react to software is to watch them work with it.

Any problem, no matter how complex, can be found by simple inspection.
Corollary: A nagging intruder with unsought advice will spot it immediately.

If the tester has exhausted all his ideas and discovered all the bugs he could, he needs to change his approach, or perhaps step back and look from a greater distance, to find further bugs.

Walking on water and developing software to specification are easy as long as both are frozen.

Any change in the specification brings risks into development:
• Not everybody who needs to know about the change will be told about it
• The change is unclear and poorly understood
• You must drop what was already tested and write a new part, which is potentially full of errors
• The amount a developer must keep in mind increases, and it becomes easier to make a mistake
• ...

Law of Anti-security
The best way past a pesky security feature is a 13-year-old.

In an effort to secure software against unwelcome intruders, people sometimes focus on the most common and best-known ways of penetrating a system and build sophisticated defences against them. In doing so they may overlook an indirect but simple way in. One could say that this law describes what happens when someone cannot see the forest for the trees.

Tuesday, August 11, 2009

Basics of Testing

Because there is very little literature about testing in the Czech language, I have created a training manual, mainly for junior testers. I posted one part of it, about the basics of testing, on the Czech version of this page, and I wanted to give you an opportunity to read it too. Unfortunately, I did not have time to translate dozens of pages into English.
Nevertheless, I decided to post it here in Czech. Maybe some of you will find your names there. There are even citations in the original English.
:)

Basics of Testing

Wednesday, August 5, 2009

Reflection on the testing mission

A manager should ...
... have personality of a leader
... clearly explain to other members of the team what is project mission and its objectives
... be able to inspire people for the task
... support the adoption of project objectives by employees
... promote cooperation and activity
... motivate to improve

I have encountered this kind of approach in lectures at management school, in wise books such as Measurements in Quality Management Systems by Jaroslav Nenadál, and, unfortunately, somewhat less frequently in practice.

An atmosphere of cooperation, enthusiasm and responsibility for achieving objectives is certainly a plus in many areas, but in software testing these are almost magical ingredients.

Defining goals and striving to achieve them is obviously the beginning of the journey from an average tester, who only follows test scenarios and instructions, to a real expert.

By defining test missions and inspiring testers to achieve them, a test manager or project manager creates an environment that helps testers improve, and many of them will start to grow into outstanding professionals.

But it is a journey that does not end after a few hours or days. It is a journey you can continue for as long as you have the motivation.
As long as you want to improve, there will always be room for improvement.

The mission of testing determines the current aim of testing and defines the criteria for assessing whether and how well the mission has been fulfilled. A tester's work can be managed better if he is told what goals the testing should achieve and what its intention is.

The objectives of the test mission may be various: estimate the error rate of individual application modules, prepare reproducible tests, detect as many errors as possible, or just quickly detect errors which prevent the deployment of applications.

Testing frequently has multiple goals, and so a mission has several objectives.

The purpose of defining a mission is to understand the intention of testing and to be able to assess the results.

Have we fulfilled our mission? How? Why did we fail? Is there room for improvement? What have we learned? Is it possible to achieve better results with another approach?

The definition of specific objectives of the testing iteration is a powerful tool. It enables efficient management of resources and teaches testers how to do their job better.

The most useful goals are measurable.

If two groups test the same product, how do we decide which group is better?

If we do not know how good we are and how well we meet our objectives, then we see no room for improvement and have no motivation. The firm will then have, instead of professionals and effective testing, only clever monkeys and tests that are expensive but not very good.