
The Top 10 Blog Posts of 2015: Technical Debt, DevOps, Graph Analytics, Secure Coding, and Testing

Douglas C. Schmidt

In 2015, the SEI Blog launched a redesigned platform that makes our content easier to browse and navigate. The blog's audience also continued to grow, with an ever-increasing number of visitors learning about our research in technical debt, shift-left testing, graph analytics, DevOps, secure coding, and malware analysis. From January 1 through December 15, 2015, the SEI Blog logged 159,604 visits and sessions (we switched analytics platforms mid-year), a 26 percent increase in traffic over the previous year. This blog post highlights the top 10 posts published in 2015. As with our mid-year review, we include links to related resources that readers might find of interest, and we present the posts in descending order, counting down from the 10th most popular post of 2015 to number one.

10. Ten Recommended Practices for Achieving Agile at Scale
9. Open Systems Architectures: When and Where to Be Closed
8. A Taxonomy of Testing
7. Is Java More Secure Than C?
6. Managing Software Complexity in Models
5. The Pharos Framework: Binary Static Analysis of Object Oriented Code
4. Developing a Software Library for Graph Analytics
3. Four Types of Shift-Left Testing
2. DevOps Technologies: Fabric or Ansible
1. A Field Study of Technical Debt

10. Ten Recommended Practices for Achieving Agile at Scale

Software and acquisition professionals often have questions about recommended practices related to modern software development methods, techniques, and tools, such as how to apply agile methods in government acquisition frameworks, how to systematically verify and validate safety-critical systems, and how to manage operational risk. In the DoD, these techniques are just a few of the options available to face the myriad challenges in producing large, secure software-reliant systems on schedule and within budget.

In an effort to offer our assessment of recommended techniques in these areas, the SEI built upon an existing collaborative online environment known as SPRUCE (Systems and Software Producibility Collaboration Environment), hosted on the Cyber Security & Information Systems Information Analysis Center (CSIAC) website. From June 2013 to June 2014, the SEI assembled guidance on a variety of topics based on relevance, the maturity of the practices described, and timeliness. Those recommended practices are in the process of being published on the SEI Blog. The post, 10 Recommended Practices for Achieving Agile at Scale, is the 10th most popular post of 2015.

Here is an excerpt:

Make team coordination top priority.

Scrum is the most common Agile project-management method used today and primarily involves team-management practices. In its simplest instantiation, a Scrum development environment consists of a single Scrum team with the skills, authority, and knowledge required to specify requirements and architect, design, code, and test the system. As systems grow in size and complexity, the single-team model may no longer meet development demands.

If a project has already decided to use a Scrum-like project-management technique, the Scrum approach can be extended to managing multiple teams with a "Scrum of Scrums," a special coordination team whose role is to (1) define what information will flow between and among development teams (addressing inter-team dependencies and communication) and (2) identify, analyze, and resolve coordination issues and risks that have potentially broader consequences (e.g., for the project as a whole). A Scrum of Scrums typically consists of members from each team chosen to address end-to-end functionality or cross-cutting concerns such as user interface design, architecture, integration testing, and deployment.

Creating a special team responsible for inter-team coordination helps ensure that the right information, including measurements, issues, and risks, is communicated between and among teams. Care must be taken, however, to keep the Scrum of Scrums team itself from growing so large that it becomes unwieldy. This scaling can be accomplished by organizing teams--and the Scrum of Scrums team itself--along feature and service affinities. We further discuss this approach to organizing teams in our feature-based development and system decomposition practice. Such orchestration is essential to managing larger teams to success, including Agile teams.

To read the complete post, please click here.


9. Open Systems Architectures: When and Where to Be Closed

Due to advances in hardware and software technologies, Department of Defense (DoD) systems today are highly capable and complex. However, they also face increasing scale, computation, and security challenges. Compounding these challenges, DoD systems were historically designed using stove-piped architectures that lock the Government into a small number of system integrators, each devising proprietary point solutions that are expensive to develop and sustain over the lifecycle. Although these stove-piped solutions have been problematic (and unsustainable) for years, the budget cuts occurring under sequestration are motivating the DoD to reinvigorate its focus on identifying alternative means to drive down costs, create more affordable acquisition choices, and improve acquisition program performance. A promising approach to meeting these goals is Open Systems Architecture (OSA).

This blog post, by Don Firesmith, expands on earlier coverage of how acquisition professionals and system integrators can apply OSA practices to effectively decompose large monolithic business and technical architectures into manageable and modular solutions that can integrate innovation more rapidly and lower total ownership costs.

Here is an excerpt:

Several steps required to produce an OSA are described below.

  • Step 1: Decide how to modularize the system. The main OSA modularization approach typically decomposes the system into separately procurable architectural components to support competition, minimize vendor lock-in, and enable technology refresh. This approach is used to identify the key architectural components of the system that should be open, while the other parts potentially may be closed or partially open. The interfaces of these key components are then defined as the key interfaces that should conform to open interface standards.
  • Step 2: Determine the associated open interface standards to use for key interfaces. When should an interface be identified as a key interface and, therefore, when should that interface be open? The answer is really a matter of cost-benefit analysis. While there are clearly benefits to being open, there are also associated costs and challenges. Mandating the use of an open interface standard may necessitate the replacement of an existing closed (e.g., proprietary) interface, training the developers to use the new interface standard, and lowered developer productivity until the new interface standard is mastered.
  • Step 3: Verify conformance to the open interface standards. As mentioned earlier, an open interface must be verified (via testing or static analysis) to conform, both syntactically and semantically, to the associated open interface standard. It is also important to document and verify any implementation-defined portions of the standard.
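The syntactic side of Step 3's conformance checking can be sketched in a few lines of Python. The interface below (function names and arities) is entirely hypothetical, invented for illustration; real OSA conformance verification would run against the actual open interface standard and would also have to cover semantics via testing:

```python
import inspect

# Hypothetical open interface: names and arities a conforming
# implementation must expose (illustrative only).
REQUIRED_INTERFACE = {
    "open_channel": 2,   # (address, port)
    "send": 1,           # (payload,)
    "close_channel": 1,  # (handle,)
}

def check_syntactic_conformance(impl):
    """Report interface functions that are missing or have the wrong arity."""
    problems = []
    for name, arity in REQUIRED_INTERFACE.items():
        fn = getattr(impl, name, None)
        if fn is None:
            problems.append(f"{name}: missing")
            continue
        params = inspect.signature(fn).parameters
        if len(params) != arity:
            problems.append(f"{name}: expected {arity} arg(s), found {len(params)}")
    return problems

# A toy implementation with one conformance defect (extra parameter on send).
class Impl:
    @staticmethod
    def open_channel(address, port): ...
    @staticmethod
    def send(payload, timeout): ...
    @staticmethod
    def close_channel(handle): ...

print(check_syntactic_conformance(Impl))  # flags send's extra parameter
```

Semantic conformance (does `send` actually behave as the standard requires?) cannot be checked this way; as the excerpt notes, that requires testing or static analysis against the standard's specified behavior.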

To read the complete post, please click here.

Readers interested in learning more about the SEI's work in open systems architecture can view the following resources:

  • Blog: To read the SEI Blog post A Discussion on Open Systems Architecture, by Carol Sledge and Douglas C. Schmidt, please click here.
  • Session notes: To see presentation materials from a SATURN 2015 discussion Open Systems Architectures: Progress and Challenges, please click here.

8. A Taxonomy of Testing

While evaluating the test programs of numerous defense contractors, SEI researchers have often observed that they are quite incomplete. For example, they typically fail to address all the relevant types of testing that should be used to (1) uncover defects, (2) provide evidence concerning the quality and maturity of the system or software under test, and (3) demonstrate the readiness of the system or software for acceptance and placement into operation. Instead, many test programs only address a relatively small subset of the total number of potentially relevant types of testing, such as unit testing, integration testing, system testing, and acceptance testing. In some cases, the missing testing types are actually performed (to some extent) but not addressed in test-related planning documents, such as test strategies, system and software test plans (STPs), and the testing sections of systems engineering management plans (SEMPs) and software development plans (SDPs). In many cases, however, they are neither mentioned nor performed. SEI researcher Donald Firesmith blogs regularly about software and systems testing, and his posts are always among the most popular on the site. In this blog post, the first in a series on the many types of testing, Firesmith examines the negative consequences of not addressing all relevant testing types and introduces a taxonomy of testing types to help testing stakeholders understand--rather than overlook--them.

Here is an excerpt:

This work began when it became clear just how many more types of testing existed than were commonly addressed in contractor test planning. While exploring ways to address this incomplete planning, I decided to make an initial list of testing types based on personal experience and a brief online search, including the examination of glossaries of various testing organizations, such as the International Software Testing Qualifications Board (ISTQB). I quickly realized that the software and system engineering communities were using roughly 200 test types during the development and operation of software-reliant systems, far too many to merely create an alphabetized list. Such a long list would be so large and complex that it would be overwhelming and thus of little use to testing's stakeholders. Instead, we needed a taxonomy of testing types to provide structure and to divide and conquer this complexity.

I initially created 12 general types of testing, which eventually grew to 16. One of my colleagues, however, pointed out that a better way would be to first organize them by the way they primarily answer one of the following standard sets of questions, commonly known as the 5Ws (who, what, when, where, and why), and 2Hs (how and how well). This in turn led to the development of the following questions, which formed the foundation of my taxonomy for organizing testing types:

  • What are we testing?
  • When are we testing?
  • Why are we testing?
  • Who is performing the testing?
  • Where is the testing taking place?
  • How are we testing?
  • How well are the objects-under-test functioning?
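One way to picture the resulting taxonomy is as a mapping from each of the seven questions to the testing types that answer it. The sketch below is illustrative only; the types filed under each question are plausible examples, not Firesmith's actual taxonomy:

```python
# Illustrative sketch of a question-driven testing taxonomy.
# The test types listed under each question are examples only,
# not the actual taxonomy from the post.
TEST_TAXONOMY = {
    "What are we testing?": ["unit testing", "integration testing", "system testing"],
    "When are we testing?": ["development testing", "acceptance testing", "operational testing"],
    "Why are we testing?": ["defect detection", "quality evidence", "readiness demonstration"],
    "Who is performing the testing?": ["developer testing", "independent testing"],
    "Where is the testing taking place?": ["lab testing", "field testing"],
    "How are we testing?": ["manual testing", "automated testing"],
    "How well are the objects-under-test functioning?": ["performance testing", "reliability testing"],
}

def lookup(test_type):
    """Return the question(s) a given testing type primarily answers."""
    return [q for q, types in TEST_TAXONOMY.items() if test_type in types]

print(lookup("integration testing"))
```

Even this toy structure shows the divide-and-conquer benefit the excerpt describes: roughly 200 flat entries become seven short, navigable lists.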

To read the complete post, please click here.


7. Is Java More Secure Than C?

Whether Java is more secure than C is a simple question to ask, but a hard question to answer well. When researchers on the CERT Secure Coding Team began writing the SEI CERT Oracle Coding Standard for Java, they thought that Java would require fewer secure coding rules than the SEI CERT C Coding Standard because Java was designed with security in mind. The researchers naively assumed that a more secure language would need fewer rules than a less secure one. However, Java has 168 coding rules compared to just 116 for C. Why? Was the (admittedly simplistic) assumption completely spurious? Or, are there problems with our C or Java rules? Or, are Java programs, on average, just as susceptible to vulnerabilities as C programs? In the blog post Is Java More Secure Than C?, David Svoboda analyzes CERT rules for both C and Java to determine if they indeed refute the conventional wisdom that Java is more secure than C.

Here is an excerpt:

Memory Corruption. Memory corruption comprises the biggest category of high-severity rules in C. Java has no analogous rules because its type system prevents memory corruption, which includes vulnerabilities such as buffer overflows, format-string vulnerabilities, and use-after-free errors. Exploiting memory corruption in C programs has become harder, however, due to the advent of memory-protection technologies, such as address space layout randomization (ASLR) and data execution prevention (DEP). ASLR randomizes the layout of the program and its associated data in memory. On a typical 32-bit Linux system, ASLR reduces the success rate of a code-execution attack by a factor of 65,536 (or 2^16), according to Shacham and colleagues (Shacham, 2004). ASLR can be defeated by learning the memory layout of a long-running program by means of some lesser exploit that reveals memory layout. ASLR also requires support from the program, all associated libraries, and the operating platform.

DEP partitions memory into writable memory (containing data) and executable memory (containing code) and forbids executing memory that is also writable. Consequently, DEP thwarts simple exploit techniques but can still be defeated by more advanced techniques, such as return-oriented programming (Shacham, 2007). Both ASLR and DEP are supported by major desktop and mobile operating systems. They may not be available, however, on embedded platforms, where C programs are also common. Because these technologies are neither perfect nor universally available, we continue to promote adherence to the CERT C rules associated with memory corruption.
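The ASLR factor quoted above can be checked with back-of-the-envelope arithmetic; the 16-bit entropy figure comes from the Shacham paper cited in the excerpt:

```python
# On a typical 32-bit Linux system, ASLR randomizes only 16 bits of a
# library's base address (Shacham et al., 2004), which is where the
# "factor of 65,536" comes from.
entropy_bits = 16
reduction_factor = 2 ** entropy_bits
print(reduction_factor)  # 65536

# Against a service that respawns after each failed guess, an attacker
# needs on the order of 2**16 attempts to defeat 32-bit ASLR alone,
# which is why the paper concluded it offers limited protection there.
expected_attempts = reduction_factor
print(expected_attempts)  # 65536
```

On 64-bit platforms the entropy is substantially larger, which is part of why the excerpt treats ASLR as useful but not sufficient on its own.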

To read the complete post, please click here.

Readers interested in finding out more about the CERT secure coding research can visit the following resources:

  • Website: For more information about the work of the CERT Secure Coding Team, please visit the CERT Secure Coding website.
  • Paper: To read the paper On the Effectiveness of Address-Space Randomization by Hovav Shacham, Matthew Page, Ben Pfaff, Eu-Jin Goh, Nagendra Modadugu, and Dan Boneh, please click here.
  • Paper: To read the paper The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86) by Hovav Shacham, please click here.
  • Standard: To view the C11 standard, ISO/IEC 9899:2011, Programming Languages--C, 3rd ed., please click here.

6. Managing Software Complexity in Models

For decades, safety-critical systems have become more software intensive in every domain--avionics, aerospace, automotive, and medicine. Software acquisition is now one of the biggest production costs for safety-critical systems. These systems comprise many software and hardware components, executed on different processing units and interconnected using various buses and protocols. For instance, cars are now equipped with more than 70 electronic control units (ECUs) interconnected with different buses and require about 100 million source lines of code (SLOC) to provide driver assistance, entertainment systems, and all necessary safety features. In the blog post Managing Software Complexity in Models, SEI researcher Julien Delange, a frequent blogger about the SEI's Architecture Analysis & Design Language (AADL), discusses the impact of complexity in software models and presents a tool that produces complexity metrics from software models.

Here is an excerpt:

This increasing use of software brings new challenges for safety-critical systems. In particular, having so many interconnected components makes system design, implementation, testing, and validation more difficult. Because the functions of safety-critical systems operate at different criticality levels (such as the Design Assurance Levels of DO-178C), system designers must ensure that low-criticality functions do not affect those at a higher criticality (e.g., so that the entertainment system of your car does not impact the cruise control or brakes). Despite rigorous development methods, safety-critical systems still experience safety and security issues: in July 2015, almost 2 million cars were recalled because of software issues (1.4 million for Chrysler and more than 400,000 for Ford).

Software complexity has a huge impact on software acquisition costs, for both development and maintenance activities. Some studies suggest that software complexity can increase maintenance cost by 25 percent, which is significant since maintenance operations represent about 70 percent of the total acquisition costs. Thus, on a $4.5 million project, managing software complexity could help save at least $750,000 on maintenance activities, or about 16.6 percent of the total lifecycle costs (which is a fairly low estimate, especially for safety-critical systems, where maintenance requires more testing and other verification methods than in any other domain). For this reason, being able to measure and keep software complexity under control is of paramount importance.
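The savings estimate above can be reproduced with back-of-the-envelope arithmetic using the 70 percent and 25 percent figures from the studies the excerpt cites:

```python
# Back-of-the-envelope check of the savings estimate in the excerpt.
total_acquisition = 4_500_000   # example project cost from the post
maintenance_share = 0.70        # maintenance ~70 percent of acquisition cost
complexity_penalty = 0.25       # complexity can add ~25 percent to maintenance

maintenance_cost = total_acquisition * maintenance_share
potential_savings = maintenance_cost * complexity_penalty
print(round(maintenance_cost))   # 3150000
print(round(potential_savings))  # 787500, i.e., "at least $750,000"
print(round(750_000 / total_acquisition * 100, 1))  # 16.7 (the post rounds down to 16.6)
```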

To read the complete post, please click here.


5. The Pharos Framework: Binary Static Analysis of Object Oriented Code

Object-oriented programs present considerable challenges to reverse engineers. For example, C++ classes are high-level structures that lead to complex arrangements of assembly instructions when compiled. These complexities are exacerbated for malware analysts because malware rarely has source code available; thus, analysts must grapple with sophisticated data structures exclusively at the machine code level. As more and more object-oriented malware is written in C++, analysts are increasingly faced with the challenges of reverse engineering C++ data structures. In August, CERT researcher Jeff Gennari published The Pharos Framework: Binary Static Analysis of Object Oriented Code, the first post in a series that discusses tools developed by the Software Engineering Institute's CERT Division to support reverse engineering and malware analysis tasks on object-oriented C++ programs.

Here is an excerpt:

ObjDigger uses definition-use analysis to identify object pointers, known as this pointers. The analysis process works as follows:

    1. First ObjDigger uses ROSE to gather a list of functions in the executable file.
    2. ObjDigger analyzes each function to determine if it is a class method based on whether it follows the __thiscall calling convention. In __thiscall functions the this pointer for an object is passed in as an argument in the ECX register.
    3. ObjDigger detects the this pointer passed in to the function by identifying reads of the ECX register without initialization.
    4. Once the set of __thiscall functions is identified, further analysis of this pointer usage in the body of each function is performed to identify possible class members and methods.
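The ECX heuristic in steps 2 and 3 can be illustrated with a toy version in Python. The instruction representation below is invented purely for illustration; it is far simpler than what ObjDigger actually analyzes via ROSE and Pharos:

```python
# Toy model: each instruction is (mnemonic, dest_operand, src_operand).
# A function that reads ECX before any instruction writes it is treated
# as __thiscall, since ECX carries the `this` pointer on entry.
def reads_ecx_uninitialized(instructions):
    for mnemonic, dest, src in instructions:
        if src == "ecx":   # ECX read before being written: incoming this pointer
            return True
        if dest == "ecx":  # ECX written first: not an incoming this pointer
            return False
    return False

# A plausible __thiscall method body: saves `this` from ECX, then uses it.
method = [
    ("mov", "esi", "ecx"),      # this pointer saved from ECX
    ("mov", "eax", "[esi+4]"),  # read a member through this
    ("ret", None, None),
]
# A free function that writes ECX before ever reading it.
free_fn = [
    ("mov", "ecx", "0"),
    ("ret", None, None),
]
print(reads_ecx_uninitialized(method))   # True  -> candidate __thiscall method
print(reads_ecx_uninitialized(free_fn))  # False -> not a class method
```

The real analysis is, of course, a full definition-use analysis over ROSE's instruction semantics rather than a linear scan, but the underlying idea is the same: an uninitialized read of ECX signals the `__thiscall` convention.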

The Pharos binary analysis infrastructure provides information on which program instructions influence (i.e., read and write) computations on subsequent instructions. This abstract interpretation of instructions makes it possible to track values through an assembly listing. Reasoning about abstract values as they are accessed through a program enables identification of object-oriented constructs. For example, a call to a virtual function requires two pointer dereferences:

    • one to access the virtual function table
    • one to access the appropriate virtual function

In Code Listing 4, dereferencing the virtual function table pointer and fetching the correct virtual function correspond to the pointer accesses at addresses 0x0040103A and 0x0040103F, respectively. For each indirect call found in the binary (i.e., a call on a register or memory address), ObjDigger searches for two previous dereferences connected through common pointers (i.e., a pointer that refers to another pointer that refers to a known class virtual function). That is, if the call instruction is preceded by two pointer dereferences and these pointers trace back to a known class structure with a virtual function table that contains a valid class method, then this arrangement is labeled as a virtual function call and bound to the class structure. The target of the call is determined by examining the virtual function table for the known class structure.

ObjDigger uses similar data-flow analysis to identify class members, methods, and class relationships. A more thorough, if slightly dated, discussion of ObjDigger's data structure recovery algorithms is available in our paper Recovering C++ Objects From Binaries Using Inter-Procedural Data-Flow Analysis, published at the ACM SIGPLAN Program Protection and Reverse Engineering Workshop in 2014.

To read the complete post, please click here.

Readers interested in finding out more about CERT's research in malware analysis can view the following resources:

  • Download: CERT recently released ObjDigger publicly, and those who are interested in evaluating ObjDigger can download it from the Pharos Static Analysis Tools site.
  • Repository: We have also created a GitHub repository for Pharos, and plan to release selected components of our framework for inclusion back into the ROSE infrastructure.

4. Developing a Software Library for Graph Analytics

Graph algorithms are in wide use in Department of Defense (DoD) software applications, including intelligence analysis, autonomous systems, cyber intelligence and security, and logistics optimizations. In late 2013, several luminaries from the graph analytics community released a position paper calling for an open effort, now referred to as GraphBLAS, to define a standard for graph algorithms in terms of linear algebraic operations. BLAS stands for Basic Linear Algebra Subprograms and is a common library specification used in scientific computation. The authors of the position paper propose extending the National Institute of Standards and Technology's Sparse Basic Linear Algebra Subprograms (spBLAS) library to perform graph computations. The position paper served as the latest catalyst for the ongoing research by the SEI's Emerging Technology Center in the field of graph algorithms and heterogeneous high-performance computing (HHPC). Our fourth most popular post of 2015 was the second in a series highlighting the work of ETC researchers to create a software library of graph algorithms for heterogeneous architectures that will be released via open source.

Here is an excerpt:

In computer science, the term embarrassingly parallel problem describes a situation where the same operation or set of operations can be executed on different data simultaneously, thereby allowing the distribution of data across many computing elements without the need for communication (and/or synchronization) between the elements. Such problems are relatively easy to implement on high-performance systems and can achieve excellent computing performance. High-performance computing (HPC) is now central to the federal government and many industry projects, as evidenced by the shift from single-core and multi-core (homogeneous) central processing units (CPUs) to many-core and heterogeneous systems, including graphics processing units (GPUs) that are adept at solving embarrassingly parallel problems.

Unfortunately, many important problems are not embarrassingly parallel, including graph algorithms. Fundamentally, graphs are data structures with neighboring nodes connected by edges. The computation to be performed on graphs often involves finding and ranking important nodes or edges, finding anomalous connection patterns, identifying tightly knit communities of nodes, etc. The irregular structure of graphs makes the communication-to-computation ratio high for these algorithms--the opposite of the ratio found in embarrassingly parallel problems--making it extremely hard to develop implementations that achieve good performance on HPC systems.

We are targeting GPUs for our research not only because of their prevalence in current HPC installations (e.g., for simulating three-dimensional physics), but also because of their potential for providing an energy-efficient approach to the computations. We are investigating different approaches, including the linear algebra approach offered by the GraphBLAS effort, to enable the efficient use of GPUs and pave the way for easier development of high-performance graph algorithms.
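The linear-algebra formulation that GraphBLAS advocates can be sketched in pure Python: each level of a breadth-first search is, in effect, one Boolean matrix-vector product of the adjacency matrix with the current frontier, masked by the unvisited nodes. This is a minimal illustration of the idea, computed over an adjacency list rather than a real sparse matrix, and is not the ETC library's API:

```python
# Adjacency list for a small directed graph: node -> list of neighbors.
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}

def bfs_levels(graph, source):
    """BFS where each level plays the role of one Boolean matrix-vector
    product: next_frontier = (adjacency * frontier) masked by unvisited."""
    frontier = {source}
    visited = {source}
    levels = {source: 0}
    level = 0
    while frontier:
        level += 1
        # The "matrix-vector product": union of the frontier's neighbor
        # sets, with the visited mask applied.
        next_frontier = {v for u in frontier for v in graph[u]} - visited
        for v in next_frontier:
            levels[v] = level
        visited |= next_frontier
        frontier = next_frontier
    return levels

print(bfs_levels(graph, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```

Casting graph traversal this way is what lets a GraphBLAS-style library reuse decades of sparse linear algebra work, including GPU-optimized kernels, instead of hand-tuning each irregular graph algorithm.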

To read the complete post, please click here.


3. Four Types of Shift-Left Testing

One of the most important and widely discussed trends within the software testing community is shift-left testing, which simply means beginning testing as early as practical in the lifecycle. What is less widely known, both inside and outside the testing community, is that testers can employ four fundamentally different approaches to shift testing to the left. Unfortunately, different people commonly use the generic term shift left to mean different approaches, which can lead to serious misunderstandings. In the blog post Four Types of Shift-Left Testing, Donald Firesmith explains the importance of shift-left testing and defines each of the four approaches, using variants of the classic V model to illustrate them.

Here is an excerpt:

For decades, it has been well known that defects are more difficult and expensive to fix the later they are found in the lifecycle. This phenomenon is one reason that treating testing as a sequential phase at the end of waterfall development has long been viewed as a major pitfall of system and software testing. Examples of the harm caused by postponing testing include

  • Testers may be less involved in initial planning, often resulting in insufficient resources being allocated to testing.
  • Many requirements, architecture, and design defects are not uncovered and fixed until after significant effort has been wasted on their implementation.
  • Debugging (including identifying, localizing, fixing, and regression testing defects) becomes harder as more software is produced and integrated.
  • Encapsulation makes it harder to perform whitebox testing and to achieve high levels of code coverage during testing.
  • There is less time to fix defects found by testing, thereby increasing the likelihood that they will be postponed until later increments or versions of the system, which creates a "bow wave" of technical debt that can sink projects if it grows too large.

These negative consequences of late testing increase development and maintenance costs, lead to missed deadlines and schedule delays, decrease quality due to residual defects, and generally lower project morale and job satisfaction.

To read the complete post, please click here.


2. DevOps Technologies: Fabric or Ansible

In late 2014, the SEI Blog launched a new series of blog posts that offered technical guidelines and practical advice for DevOps. In early 2015, as part of that series, Tim Palko published the blog post DevOps Technologies: Fabric or Ansible, which examined the infrastructure-as-code solutions Fabric and Ansible for configuring a remote server.

Here is an excerpt:

The workflow of deploying code is almost as old as code itself. There are many use cases associated with the deployment process, including evaluating resource requirements, designing a production system, provisioning and configuring production servers, and pushing code, to name a few. In this blog post, I focus on a use case for configuring a remote server with the packages and software necessary to execute your code.

This use case is supported by many different and competing technologies, such as Chef, Puppet, Fabric, Ansible, Salt, and Foreman--just a few of the technologies you are likely to have encountered on the path to automation in DevOps. All these technologies have free offerings, leave you with scripts to commit to your repository, and get the job done. This post explores Fabric and Ansible in more depth. To learn more about other infrastructure-as-code solutions, check out Joe Yankel's blog post on Docker or my post on Vagrant.

One difference between Fabric and Ansible is that while Fabric will get you results in minutes, Ansible requires a bit more effort to understand. Ansible is generally much more powerful since it provides much deeper and more complex semantics for modeling multi-tier infrastructure, such as those with arrays of web and database hosts. From an operator's perspective, Fabric has a more literal and basic API and uses Python for authoring, while Ansible consumes YAML and provides a richness in its behavior (which I discuss later in this post). We'll walk through examples of both in this posting.

Both Fabric and Ansible employ secure shell (SSH) to do their job in most cases. While Fabric leverages execution of simple command-line statements to target machines over SSH, Ansible pushes modules to remote machines and then executes these modules remotely, similar to Chef. Both tools wrap these commands with semantics for basic tasks such as copying files, restarting servers, and installing packages. The biggest difference between them is in the features and complexity that is presented to the operator.
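Fabric's model, executing simple command-line statements on a target machine over SSH, can be mimicked in a few lines of plain Python. This is an illustrative stand-in, not Fabric's actual API; it falls back to running locally so the sketch stays self-contained:

```python
import subprocess

def run(command, host=None):
    """Run a shell command, Fabric-style. With a host, wrap it in ssh;
    without one, execute locally (used here so the sketch is runnable)."""
    argv = ["ssh", host, command] if host else ["sh", "-c", command]
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout.strip()

# A deploy "task" in the spirit of a fabfile: each step is one shell
# command; `echo` keeps this sketch side-effect-free.
def install_packages(host=None):
    return run("echo apt-get install -y nginx", host)

print(run("echo hello"))  # prints: hello
```

Ansible, by contrast, would express the same step declaratively in a YAML playbook and push a module to the remote host to execute it, which is exactly the difference in operator experience described above.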

To read the complete post, please click here.

Readers interested in finding out more about the SEI's research in DevOps can view the following resources:

  • DevOps blog posts: To view other DevOps blog posts, please click here.
  • Webinar: To view the SEI Webinar DevOps Panel Discussion, please click here.
  • Webinar: To view the SEI Webinar Culture Shock: Unlocking DevOps with Collaboration and Communication, please click here.

1. A Field Study of Technical Debt

Neil Ernst's post, A Field Study of Technical Debt, published in July, was the most popular post on the SEI Blog in 2015; it brought in nearly 20,000 visits in the month it was published and generated a spirited discussion of technical debt issues on Hacker News. In the post, Ernst highlights the findings of a recent field study conducted to assess the state of the practice and current thinking regarding technical debt and to guide the development of a technical debt timeline.

Here is an excerpt:

One question our work in this field has raised is whether there are practices that move this metaphor beyond a mere communication mechanism. The metaphor is attractive to practitioners because it communicates the idea that if quality problems are not addressed, things may get worse. Is there more to it than that?

Existing studies of technical debt have largely focused on code metrics and small surveys of developers. Practitioners currently broadly define technical debt as a "shortcut for expediency" and more specifically, bad code or inadequate refactoring. The initial definition, from Ward Cunningham, referred to the debt incurred because "first-time code" would ship with a limited understanding of the true nature of the problem. But is there more to technical debt than bad code?

To read the complete post, please click here. To read the discussion thread generated by this post on Hacker News, please click here.

Readers interested in finding out more about the SEI's research in technical debt can view the following resources:

  • Survey materials: You can access survey materials, including the questions, here. We are interested in continuing the research, so if you would like to collaborate on a similar survey, please get in touch.
  • Technical debt research: This survey is part of a wider SEI effort on technical debt, including an ongoing research effort.
  • Technical debt blog posts: For other posts in our technical debt series, please click here.

Wrapping Up 2015 and Looking Ahead

This has been a great year for the SEI Blog. We are looking forward to weekly posts highlighting the work of SEI and CERT researchers in 2016 and beyond. Some highlights to look forward to include:

  • Jay McAllister of the SEI's Emerging Technology Center will be highlighting the latest work in cyber intelligence.
  • Julia Allen and Nader Mehravari of the SEI's CERT Division will be publishing a post highlighting their work on structuring the chief information security officer organization.
  • CERT researcher Sam Weber will be publishing a post highlighting his research on application programming interfaces and security.

As always, we welcome your ideas for future posts and your feedback on those already published. Please leave feedback in the comments section below.

Additional Resources

Download the latest publications from SEI researchers at our digital library.

