Managing Model Complexity
Over the years, software architects and developers have designed many methods and metrics to evaluate software complexity and its impact on quality attributes such as maintainability and performance. Studies and practical experience have shown that highly complex systems are harder to understand, maintain, and upgrade. Managing software complexity is therefore useful, especially for software that must be maintained for many years.
To generate complexity metrics, tools extract applicable data--such as source lines of code, cohesion, coupling, and more--from binary or source code, analyze the software, and report its complexity and quality. Several tools support these techniques and help stakeholders manage the evolution of a system, improve quality, prevent loss of cohesion, and perform other tasks. To date, such approaches have been used successfully in many projects, but as system development moves toward model-based engineering, these methods, metrics, and tools might not be sufficient to manage model complexity. This blog post details the state of the art for reporting model complexity and introduces research underway at the SEI in this area.
Complexity and Model-Based Engineering
In the domain of embedded systems, projects are increasingly adopting model-based engineering tools, such as SCADE or Simulink, to specify and capture the functional architecture. Thanks to code generators, the bulk of these systems is no longer implemented manually but is generated automatically from models. Because the code is derived from these abstract representations, code metrics no longer match engineering effort: a simple change to a single model component can modify hundreds of lines of code, whereas modifying hundreds of model components might have little impact on the generated code. In other words, changes in the model are not always proportional to changes in the source code. For these reasons, code-analysis techniques (at the binary- or source-code level) are not sufficient on their own, and new methods must be developed to evaluate the quality and complexity of the models themselves.
These issues have been studied for several years, and interest in the topic continues to grow. Some work has focused on mapping existing source-code metrics (for example, Halstead metrics or cyclomatic complexity) to models (as in the research by Jeevan Prabhu), whereas other work has proposed new metrics (such as structure and data complexity or component instability, as defined by Marta Olszewska). Regardless of the selected technique, the goal is to assess the impact of a change and the overall quality of the model by analyzing various aspects, such as the number of blocks, number of connections, nesting level, and definition of data types. Tools also report metrics from models: for example, sldiagnostics and its front end report metrics for Simulink models.
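To make the structural aspects listed above concrete, here is a minimal sketch in Python that counts blocks, connections, and nesting level on a toy model. The `Block` class, the block names, and the connection list are hypothetical and exist only for illustration; they are not the data model of Simulink, SCADE, or any tool mentioned in this post.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Block:
    """A hypothetical model block; a subsystem is a block with children."""
    name: str
    children: list[Block] = field(default_factory=list)


def count_blocks(block: Block) -> int:
    """Count the block itself plus every block nested inside it."""
    return 1 + sum(count_blocks(child) for child in block.children)


def nesting_depth(block: Block) -> int:
    """Return the deepest nesting level; a block without children has depth 1."""
    return 1 + max((nesting_depth(child) for child in block.children), default=0)


# Illustrative model: a controller that contains one nested subsystem.
model = Block("controller", [
    Block("door_sensor"),
    Block("lock_logic", [Block("debounce"), Block("state_machine")]),
    Block("actuator"),
])

# Connections as (source, destination) pairs of block names (assumed layout).
connections = [
    ("door_sensor", "lock_logic"),
    ("debounce", "state_machine"),
    ("lock_logic", "actuator"),
]

metrics = {
    "blocks": count_blocks(model),
    "connections": len(connections),
    "nesting_depth": nesting_depth(model),
}
print(metrics)
```

Counts like these are only raw inputs. How they are weighted or combined into a complexity score depends on the metric definition being applied.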
Reducing Complexity of Models
As model-based systems evolve, they are modified, updated, and integrated with more components. Moreover, as more functions are implemented in software, models become more complex (with many interconnected components that have potentially conflicting requirements), which makes their verification, analysis, and maintenance harder. For these reasons, detecting system complexity as early as possible can help developers manage it and keep it below a critical threshold. The existing tools mentioned previously help designers by producing a single value that reflects the quality and complexity of a system, which makes them useful for managing system evolution. However, these tools do not explain how to reduce complexity and improve system quality. Yet that is precisely the goal of collecting these metrics: ultimately, system stakeholders want to keep the quality of system artifacts under control, fix potential defects, and reduce sources of complexity.
Among the contributors to complexity, the correct use of data types is particularly important. For example, to specify a command to an actuator, an enumerated type with restricted values is more accurate than a generic type (such as an integer). A generic type enlarges the system state space, which can degrade system quality and makes analysis, testing, and certification more difficult than with a restricted type. Nevertheless, many models rely heavily on generic types, such as Boolean or integer, which are not appropriate when the set of data values is limited, as with a system state or the value of a command. Modeling guidelines recommend using enumerated types as much as possible, but engineers often do not do so, and the resulting models lack data abstraction and accumulate complexity (such as an increasing number of interfaces or states).
For example, consider a system with a component representing a door sensor sending the actual status (open or closed). Developers could take different implementation strategies:
- Using Boolean types: The block will send the status using two Boolean variables--one to indicate that the door is open, the other to indicate that the door is closed.
- Using enumerated types: The block will send the status using a single variable that indicates the status (open or closed).
Using the first method, both variables can be true (or both false), meaning that the door can be reported as open and closed at the same time, or as neither. Using the second method reduces the block complexity: it reduces the number of variables by 50 percent, and it ensures consistency because the sensor can report only one possible status. Using appropriate data abstraction provides many other benefits (such as strong type checking) and will definitely help engineers reduce complexity and avoid errors. Applying such abstraction to real systems with hundreds of variables might not reduce system complexity by half, but it will have a significant impact.
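The difference between the two strategies can be sketched in a few lines of Python. This is an illustration only: the names `DoorStatus`, `boolean_states`, and `is_consistent` are hypothetical and do not come from any modeling tool. The sketch enumerates every value the interface can carry under each strategy and counts how many of those values are meaningless.

```python
from enum import Enum
from itertools import product


class DoorStatus(Enum):
    """Enumerated strategy: a single variable with exactly two legal values."""
    OPEN = "open"
    CLOSED = "closed"


def boolean_states():
    """Boolean strategy: every combination of the pair (is_open, is_closed)."""
    return list(product([False, True], repeat=2))


def is_consistent(is_open: bool, is_closed: bool) -> bool:
    """A Boolean reading makes sense only if exactly one flag is set."""
    return is_open != is_closed


bool_states = boolean_states()
invalid_bool_states = [s for s in bool_states if not is_consistent(*s)]
enum_states = list(DoorStatus)

# Representable states, inconsistent Boolean states, enumerated states.
print(len(bool_states), len(invalid_bool_states), len(enum_states))
```

With two Boolean variables, half of the representable states are inconsistent and every consumer of the signal has to handle them. The enumerated type cannot represent them at all.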
The SEI is dedicated to helping organizations manage software complexity more effectively, especially for systems that must be maintained and upgraded over years, such as those in the avionics, aerospace or automotive domains. So far in this post, I have detailed the state of the art for reporting model complexity. I will conclude by introducing research now underway at the SEI to address the issue of managing model complexity.
As more developers of embedded systems adopt model-based methods, avoiding complexity as early as possible helps ensure that it does not propagate through the development process. I am collaborating with a group of SEI researchers who are working to identify the root causes of complexity in models and to propose design alternatives that reduce complexity and improve system quality. The project will propose an approach to qualify and quantify complexity in models, ideally by applying existing metrics to models, and will also propose ways to redesign the system and adopt modeling patterns that avoid complexity.
Our work will focus on applying existing metrics (such as cyclomatic complexity) to models and on finding new ones that detect emerging complexity. For example, one idea is to focus on data abstraction (e.g., using enumerated types rather than generic ones, as explained previously). Tools will then reuse these metrics to help system designers identify implementation alternatives that avoid this complexity.
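As a purely hypothetical illustration of what a data-abstraction metric could look like, the Python sketch below computes the share of signals in a model interface that are declared with a generic type. The metric, the function name `generic_type_ratio`, the `GENERIC_TYPES` set, and the signal names are all assumptions made for this example; they are not the metrics being developed in the research described here.

```python
# Assumed set of type names treated as "generic" for this illustration.
GENERIC_TYPES = {"boolean", "integer"}

# Hypothetical interface: signal name -> declared type name.
signals = {
    "door_open": "boolean",
    "door_closed": "boolean",
    "lock_command": "integer",
    "vehicle_mode": "VehicleMode",  # an enumerated type
}


def generic_type_ratio(signal_types: dict[str, str]) -> float:
    """Return the fraction of signals declared with a generic type."""
    if not signal_types:
        return 0.0  # an empty interface has no generic signals
    generic = sum(1 for t in signal_types.values() if t in GENERIC_TYPES)
    return generic / len(signal_types)


ratio = generic_type_ratio(signals)
print(f"{ratio:.2f}")
```

A high ratio would not prove that a model is too complex, but it could point a designer toward interfaces where an enumerated type might replace several generic signals.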
Through earlier detection of emerging model complexity, our research aims to ensure that complexity-related problems, such as rework costs and defects found late in development, do not propagate through the development process. Over the long term, this should reduce re-engineering effort and the costs associated with maintenance and testing activities.
We welcome your feedback on our research in the comments section below.
To read the paper, Simulink-Specific Design Quality Metrics by Marta Olszewska, please visit
To read the paper, Complexity Analysis of Simulink Models to improve the Quality of Outsourcing in an Automotive Company by J. Prabhu, please visit