A widely cited study for the National Institute of Standards & Technology (NIST) reports that inadequate testing methods and tools cost the U.S. economy between $22.2 and $59.5 billion annually, with roughly half of these costs borne by software developers in the form of extra testing and half by software users in the form of failure avoidance and mitigation efforts. The same study notes that between 25 and 90 percent of software development budgets are often spent on testing. This posting, the first in a two-part series, highlights results of an analysis that documents problems that commonly occur during testing. Specifically, this series of posts identifies and describes 77 testing problems organized into 14 categories; lists the potential symptoms by which each can be recognized, along with its potential negative consequences and potential causes; and makes recommendations for preventing these problems or mitigating their effects.
Common Problems with Testing
Despite the huge investment in testing mentioned above, recent data from Capers Jones shows that the different types of testing are relatively ineffective. In particular, testing typically identifies only one-fourth to one-half of defects, while other verification methods, such as inspections, are typically more effective. Inadequate testing is one of the main reasons why software is typically delivered with approximately 2 to 7 defects per thousand lines of code (KLOC). While this may seem like a negligible amount, the result is that major software-reliant systems are being delivered and placed into operation with hundreds or even thousands of residual defects. If software vulnerabilities (such as the CWE/SANS Top 25 Most Dangerous Software Errors) are counted as security defects, the rates are even more troubling.
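To make the step from a per-KLOC rate to hundreds or thousands of defects concrete, the arithmetic can be sketched as follows. The 2 and 7 defects-per-KLOC bounds are the figures quoted above; the system sizes are hypothetical and chosen only for illustration.

```python
# Illustrative arithmetic only: the system sizes below are hypothetical.
LOW_RATE = 2   # defects per KLOC (lower bound quoted above)
HIGH_RATE = 7  # defects per KLOC (upper bound quoted above)


def residual_defect_range(kloc):
    """Return the (low, high) residual-defect estimate for a system of `kloc` thousand lines."""
    return (kloc * LOW_RATE, kloc * HIGH_RATE)


for size_kloc in (100, 500, 1000):  # hypothetical system sizes
    low, high = residual_defect_range(size_kloc)
    print(f"{size_kloc} KLOC: {low} to {high} residual defects")
```

Under these assumptions, even the smallest system in the loop would ship with a few hundred residual defects, and the largest with several thousand.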
Clearly, there are major problems with the efficiency and effectiveness of testing as it is currently performed in practice. In the course of three decades of developing systems and software--as well as my involvement in numerous independent technical assessments of development projects--I have identified and analyzed testing-related problems that other engineers, managers, and I have observed to commonly occur during testing. I also solicited feedback from various LinkedIn groups (such as Bug Free: Discussions in Software Testing, Software Testing and Quality Assurance) and the International Council on Systems Engineering (INCOSE). As of March 2013, I have received and incorporated feedback from 29 reviewers in 10 countries. While the resulting framework of problems can apply to both software and systems testing, it emphasizes software because that is where most of the testing problems occur.
The large number of testing problems necessitated that they be categorized. At the top level, these problems were organized into two groups:
general testing problems that are not specific to any type of testing, but apply to all different types of testing
test type-specific problems that apply to a single type of testing
The remainder of this post will focus on general testing problems, which can be divided into eight categories:
Test planning and scheduling problems often occur when there is no separate test plan, but rather highly incomplete and superficial summaries in other planning documents. Test plans are often ignored once they are written, and test case descriptions are often mistaken for overall test plans. The schedule of testing is often inadequate for the amount of testing that should be performed, especially when testing is primarily manual. Significant testing is often postponed until too late in the development process, especially on projects using traditional sequential development cycles.
Stakeholder involvement and commitment problems include having the wrong testing mindset (that the purpose of testing is to show that the software works instead of finding defects), having unrealistic testing expectations (that testing will find all of the significant defects), and having stakeholders who are inadequately committed to and supportive of the testing effort.
Management-related testing problems involve the impact of inadequate management. For example, management can fail to supply adequate test resources or place inappropriate external pressures on testing. There may be inadequate test-related risk management or test metrics. Testing lessons learned are far too often ignored, so the same problems are repeated project after project.
Test organizational and professionalism problems include a lack of independence, unclear testing responsibilities, and inadequate testing expertise.
Test process problems often occur when testing and engineering processes are poorly integrated. Organizations sometimes take a "one-size-fits-all" approach to testing, regardless of the specific needs of the project. Testing may not be adequately prioritized, so that functional testing, black-box system testing, or white-box unit and integration testing may be overemphasized. Testing of components, subsystems, or the system may begin before they are sufficiently mature for testing. Other problems include inadequate test evaluations and inadequate test maintenance.
Test tools and environments problems include an over-reliance on manual testing or COTS testing tools. Often, there is an insufficient number of test environments. Some of the test environments may also be of poor quality (excessive defects) or have insufficient fidelity to the actual system being tested. Moreover, the system and software under test may behave differently during testing than during operation. Other common problems are that tests were not delivered or that the test software, test data, and test environments were not under sufficient configuration control.
Test communication problems primarily involve inadequate test documentation. These types of problems often occur when test documents are not maintained or inadequate communication concerning testing is taking place.
Requirements-related testing problems are related to the requirements that should be driving testing. Often, the requirements are ambiguous, missing, incomplete, incorrect, or unstable. Lower-level requirements may be improperly derived from their higher-level sources. Likewise, verification methods may be unspecified and the tracing between requirements and tests may be lacking.
Exacerbating these problems is the fact that too often research focuses on the defects identified through testing, but does not address problems that exist in an organization's planned testing process or its implementation of that process. Not surprisingly, a 2010 survey found that 58 percent of respondents blamed the testing process and infrastructure for their most recently identified major software defects.
Addressing General Testing Problems
There are clearly many problems with the way software and software-reliant systems are tested, as discussed above. Moreover, these general testing problems are not getting significantly better, despite greater attention to test-driven development and continuous integration in the commercial software industry, which only address a few of the identified testing problems. While few projects will experience all of these problems, many projects exhibit several of them. Similarly, while these testing problems do not guarantee the software will contain an excessive number of residual defects, these problems definitely pose serious risks that must be managed.
For each testing problem described above, therefore, I have documented several types of information useful for understanding the problem and implementing a solution. As an example of the results of this analysis, the testing problem "Wrong Testing Mindset" has been documented with the following information:
Description: Some of the testers and other testing stakeholders have the wrong testing mindset.
Potential symptoms: Some testers and other testing stakeholders assume the system/software works. Testers assume or are told that their job is to verify or "prove" that the system/software works. Testing is being used to demonstrate that the system/software works properly rather than to determine where and how it fails. Only normal ("sunny day" or "happy path") behavior is being tested. There is little or no testing of exceptional or fault/failure tolerant ("rainy day") behavior. There is no testing of input data (e.g., range testing of the handling of invalid input values). Test input includes only middle-of-the-road values rather than boundary values and corner cases.
Potential consequences: A high probability exists that the delivered system or software will contain significant residual defects related to abnormal behavior (e.g., exceptional use case paths), and these defects will unnecessarily reduce its reliability and robustness (e.g., error, fault, and failure tolerance). Customer representatives, managers, and developers will obtain a false sense of security that the system functions properly.
Potential causes: Testers were taught or explicitly told that their job is to verify or "prove" the system/software works. Developers typically conduct their own unit-level (i.e., lowest level) testing. With small, cross-functional (e.g., agile) teams, it is becoming more common for developers to also conduct integration and subsystem testing. This scenario presents a "conflict of interest" for developers who are asked to build software that works and then attempt to show their software does not work. This problem is especially prevalent with small, cross-functional development organizations/teams that "cannot afford" independent, specially trained testers. Insufficient schedule was allocated for testing, so there is only enough time to test normal behavior (e.g., use case paths). The organizational culture is "success oriented," so looking "too hard" for problems is implicitly discouraged. Management gave the testers the strong impression that they do not want to hear any bad news (i.e., that there are any significant defects being found in the system).
Recommendations: Explicitly state in the project test plan that the primary goal of testing is to find defects by causing the system to fail (i.e., to break the system) rather than to demonstrate that there are no defects (i.e., to show that it works). Provide test training that emphasizes the proper testing mindset. In addition to test cases that verify all nominal behavior, emphasize looking for defects where they are most likely to hide (e.g., boundary values and corner cases).
Related Problems: Inappropriate external pressures (e.g., managers and other stakeholders pressure testers to cut corners, not officially report defects, lower defect severity, or not find defects) and inadequate communication concerning testing (i.e., inadequate verbal and written communication occurs between testers and stakeholders including managers, requirements engineers, developers, and acquirers).
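The recommendation to look beyond nominal behavior can be illustrated with a minimal sketch. The function `parse_percentage` and its 0-to-100 valid range are hypothetical, invented only to give the tests something to exercise; the point is the mix of sunny-day, boundary-value, and rainy-day cases.

```python
def parse_percentage(text):
    """Hypothetical function under test: parse a whole-number percentage from 0 to 100."""
    value = int(text.strip())  # int() raises ValueError for non-numeric input
    if not 0 <= value <= 100:
        raise ValueError(f"percentage out of range: {value}")
    return value


def rejects(text):
    """Return True if parse_percentage raises ValueError for the given input."""
    try:
        parse_percentage(text)
    except ValueError:
        return True
    return False


# Sunny-day case: a middle-of-the-road value.
assert parse_percentage("50") == 50

# Boundary values: the edges of the valid range and their nearest invalid neighbors.
assert parse_percentage("0") == 0
assert parse_percentage("100") == 100
assert rejects("-1")
assert rejects("101")

# Rainy-day cases: malformed or empty input must be rejected rather than accepted.
assert rejects("")
assert rejects("abc")
assert rejects("12.5")
```

A test suite containing only the first assertion would pass and show that the function "works," which is exactly the false sense of security described above.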
Benefits of Using the Catalog of Common Testing Problems
This analysis of commonly occurring testing problems--and recommended solutions--can be used as training material for learning how to avoid, identify, understand, and mitigate testing problems. Like anti-patterns, these problem categories can be used to improve communication between testers and testing stakeholders. This list can also be used to categorize problem types for metrics collection. Finally, it can be used as a checklist when:
producing test plans and related documentation
evaluating contractor proposals
evaluating test plans and related documentation (quality control)
evaluating as-performed test process (quality assurance)
identifying test-related risks and their mitigation approaches
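As one possible way to use the catalog for metrics collection, the sketch below tallies observed problems by category. The problem and category names come from this post; the `ProblemEntry` type, the `tally_by_category` helper, and the observations are hypothetical and are not part of the catalog itself.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemEntry:
    """One catalog entry; only the fields needed for tallying are modeled here."""
    name: str
    category: str


# A small excerpt of the catalog, using problems and categories named in this post.
CATALOG = [
    ProblemEntry("Wrong testing mindset", "Stakeholder involvement and commitment"),
    ProblemEntry("Inappropriate external pressures", "Management-related"),
    ProblemEntry("Inadequate communication concerning testing", "Test communication"),
]


def tally_by_category(observed_names, catalog=CATALOG):
    """Count observed problems per category; names not in the catalog raise KeyError."""
    category_of = {entry.name: entry.category for entry in catalog}
    return Counter(category_of[name] for name in observed_names)


# Hypothetical observations gathered while reviewing two projects' test plans.
observed = [
    "Wrong testing mindset",
    "Inappropriate external pressures",
    "Wrong testing mindset",
]
for category, count in tally_by_category(observed).most_common():
    print(f"{category}: {count}")
```

Raising an error for unknown names is a deliberate choice in this sketch: it forces reviewers to record problems using the shared vocabulary of the catalog.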
The next post in this series will explore test type-specific problems, as well as future work including using an industry survey to determine the problems that are most common and which problems cause the most trouble.
By David Svoboda Senior Member of the Technical Staff CERT Division
Whether Java is more secure than C is a simple question to ask, but a hard question to answer well. When we began writing the SEI CERT Oracle Coding Standard for Java, we thought that Java would require fewer secure coding rules than the SEI CERT C Coding Standard because Java was designed with security in mind. We naively assumed that a more secure language would need fewer rules than a less secure one. However, Java has 168 coding rules compared to just 116 for C. Why? Was our (admittedly simplistic) assumption completely spurious? Or, are there problems with our C or Java rules? Or, are Java programs, on average, just as susceptible to vulnerabilities as C programs? In this post, I attempt to analyze our CERT rules for both C and Java to determine if they indeed refute the conventional wisdom that Java is more secure than C.