OSA: 4 Best Practices for Open Software Ecosystems
Many systems and platforms, from unmanned aerial vehicles to minivans and smartphones, are realizing the promise of Open Systems Architecture (OSA). Core tenets of OSA are the broad availability of standards and designs, the sharing of information between developers, and, in some cases, downloadable tool kits. In return for openness, a broader community of potential developers and applications emerges, which in turn increases adoption and use. There is a trade-off, however. Openness is a two-way street: the same mechanisms that make OSA work also create opportunities for cyber intrusion and attack and allow less-than-ideal code to enter the system. This blog post briefly examines the potential of OSA, good and bad, and reviews four best practices for open software ecosystems.
Open Systems vs. Open Source
Open systems embrace a wide set of developers and designs, given a defined set of readily available architectures with standards. "Open" refers to the standards incorporated in the architecture, not to the size of the community: the developers might be a restricted, qualified set of known players or a very large community. Examples of open systems include the Portable Document Format (and the International Standards Committee responsible for PDF), the Android environment, and the Army Common Operating Environment.
Open source refers to software that is grown and developed by a diverse community, using a shared, universally exposed design, and governed by an open license.
Open system architectures can exist independent of open source code. Open systems can include software developed exclusively by a set of developers, but using an open set of standards.
1. Know Thyself, Know Thine Ecosystem
Open systems typically have an integrator, who takes the products of the developer community and fuses them into the productive unit for use by customers or users. This integration often includes checks for standards compliance. The ecosystem is the synergy that emerges from the OSA effort. In some cases, as in smartphones, users employ the environment created by the integrator to choose which developer products they wish to bring into their personal devices. Whether the integrator or the user assembles the final system, its security at run time depends on the solidity of the process and engineering at architecture and/or development time.
Note that exploits often occur due to unforeseen errors or poor implementation. Poor implementation can also create a performance drag in open systems, where code that has been absorbed but is rarely or no longer used can impede the performance of the integrated products. It is possible for software to be written entirely to system standards but still be poorly implemented. While a diverse community benefits the system with a breadth of developer expertise, a wider community also increases the chances of unintentionally including less-than-optimal code.
Knowledge of open systems and their ecosystems is very important. The integrator must be fully aware of the standards and their implications for security. Once acquired, this knowledge must be embodied in process and practice. There is a concern that too much process will cool the hot community ecosystem, but this concern must be tempered by the cooling effect a security breach would have.
For example, each interface offers an opportunity for exchange, and the exact nature of the exchange can determine whether the interface is so open that malicious instruction can be fed into overly broad data field definitions. In open systems, these interfaces are readily available standards and provide the same opportunities to adversaries as they provide to the intended community.
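To make the point about overly broad field definitions concrete, here is a minimal sketch of an interface check that accepts only narrowly defined fields. The `FieldSpec` format, the `TELEMETRY_SPEC` message, and the field names are hypothetical illustrations, not part of any particular open system standard.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Narrow definition of one interface field (hypothetical format)."""
    name: str
    pattern: str      # regular expression the whole value must match
    max_length: int   # hard upper bound on the value's length


# Hypothetical message definition for a telemetry interface.
TELEMETRY_SPEC = [
    FieldSpec("sensor_id", r"[A-Z]{2}[0-9]{4}", 6),
    FieldSpec("reading", r"-?[0-9]{1,4}(\.[0-9]{1,2})?", 8),
]


def validate_message(message, spec):
    """Return a list of problems; an empty list means the message conforms."""
    problems = []
    allowed = {f_spec.name for f_spec in spec}
    # Reject any field the interface definition does not name.
    for key in message:
        if key not in allowed:
            problems.append(f"unexpected field: {key}")
    # Check each defined field for presence, length, and exact shape.
    for f_spec in spec:
        value = message.get(f_spec.name)
        if not isinstance(value, str):
            problems.append(f"missing or non-text field: {f_spec.name}")
        elif len(value) > f_spec.max_length:
            problems.append(f"field too long: {f_spec.name}")
        elif re.fullmatch(f_spec.pattern, value) is None:
            problems.append(f"malformed field: {f_spec.name}")
    return problems
```

A free-text `sensor_id` would accept almost anything an adversary cares to send; the tight pattern and length bound leave far less room for injected instructions.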
An open system ideally defines not just the shell and format of the interface, but the intended behavior expected from the interface, i.e., not just the shell of the egg, but the yolk as well. How does the integrator know the implications of each standard interface? Before fusing developer products at run time or development time, is there a systematic way to determine if poor common coding practices exist? What are the implications of each standard?
One way to understand the implications of standards in a security and performance setting is to use well-understood, widely adopted standards from a broad base. The standards bodies are an ecosystem component, even if the integrator is not represented on the controlling body. The broader the adoption of a standard and the longer the standard has been in its community, the more its performance and reliability will be understood, documented, and possibly repaired. A standard with a small, non-diverse base of adopters is more likely to have undiscovered issues in the standard and its implementation.
Another way to gain knowledge of both the standards and developer implementations is to have a solid testing process, including layers of test assets. Software that is shared in the system should be tested and have a tracked history of changes (versions) and issues. The more widely it is shared, the more it should be tested. It behooves the integrator to provide a sandbox for developers, so they can cycle quickly through testing and use the sandbox to examine the inclusion of off-the-shelf components. Additionally, integrators should develop interface stubs and testing tools and environments for developers to use on their own. Both the sandbox and the toolset help surface issues early, rather than at later stages of testing. As the system emerges, additional testing can quantify performance and risk. A last level of testing at scale, in system integration testing, should be used to shake out developer products before inclusion in the releases of the system.
Pooling solutions from the ecosystem into libraries (i.e., adding solutions from the community, including information about risk, to the resources available to developers) allows both code reuse and improvements in implementation, thus increasing and distributing knowledge of components. Libraries should include documentation of code elements or solution strategies used in the registered code. The registered code should undergo the compliance process, as indicated above, before inclusion. Every inclusion or change to the library code must be tracked to source and nature, so that vulnerabilities and limitations can be identified, and possibly removed. If a vulnerability or a design that limits performance makes it into the library, the risk is that other members of the ecosystem will download and include the problematic element.
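As a rough sketch of tracking library code to source and nature, the registry below records who contributed each version, what changed, and who downloaded it, and it refuses code that has not passed compliance. The `ComponentLibrary` class and all of its method and field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryEntry:
    name: str
    version: str
    source: str        # who contributed this version
    change_note: str   # nature of the change
    consumers: set = field(default_factory=set)  # who downloaded it


class ComponentLibrary:
    """Hypothetical shared library registry kept by the integrator."""

    def __init__(self):
        self._entries = {}  # (name, version) -> LibraryEntry

    def register(self, name, version, source, change_note, passed_compliance):
        # Registered code must pass the compliance process before inclusion.
        if not passed_compliance:
            raise ValueError(f"{name} {version} has not passed compliance checks")
        self._entries[(name, version)] = LibraryEntry(name, version, source, change_note)

    def record_download(self, name, version, consumer):
        self._entries[(name, version)].consumers.add(consumer)

    def affected_by(self, name, version):
        """For a flawed entry, return its source and everyone who pulled it."""
        entry = self._entries[(name, version)]
        return entry.source, sorted(entry.consumers)
```

When a flaw is later found in a registered version, `affected_by` answers the two questions the integrator needs first: where the code came from and which members of the ecosystem already include it.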
2. Helping Developers Improve their Software Engineering Processes
Like the integrator's process, the developer's process can prevent the opening of vectors to exploit. It is therefore in the best interest of the integrator to help the developer community create, test, and understand its implementations. Integrators can help developers in two ways:
- provide developers with examples and guidance on best practice and process, including guidance on workflow
- provide known developers with access to knowledge bases and manuals.
Such assistance to developers allows the integrator to offload testing to the broader community earlier in the development cycle, when the cost of a fix is lowest. One way to provide assistance is to supply a consistent set of tools that allow developers to test versions of their software for standards and interface compliance, common exploits, unconscious inclusions, common coding missteps, and other potential attack vectors. Stubs for interfaces allow developers to ensure that their modules mate completely with the interfaces, follow standards, and leave no wiggle room for cyber assault. When tips and coding examples are shared, the developer community is enhanced and steered toward a known minimum set of knowledge.
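An interface stub of the kind described above might look like the following sketch. It stands in for a hypothetical storage interface and fails loudly when a module breaks the call order or passes out-of-spec arguments; the class name, methods, and limits are assumptions made for illustration.

```python
class StorageInterfaceStub:
    """Stand-in for a hypothetical integrator storage interface.

    It stores nothing; it only checks that a module calls the interface
    in the expected order and with in-spec arguments.
    """

    MAX_KEY_LENGTH = 32  # hypothetical limit from the interface standard

    def __init__(self):
        self.is_open = False
        self.calls = []  # record of calls, useful in developer tests

    def open(self):
        self.calls.append("open")
        self.is_open = True

    def put(self, key, value):
        self.calls.append("put")
        if not self.is_open:
            raise RuntimeError("put() called before open()")
        if not isinstance(key, str) or not 0 < len(key) <= self.MAX_KEY_LENGTH:
            raise ValueError(f"key must be text of 1 to {self.MAX_KEY_LENGTH} characters")
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes")

    def close(self):
        self.calls.append("close")
        self.is_open = False
```

A developer can run a module against the stub long before the real interface is available, so mismatches show up on the developer's desk rather than in integration testing.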
If the developers use commercial-off-the-shelf (COTS) products, integrators should be familiar with that community and recommend which packages the developer should avoid or use. Remember that off-the-shelf components carry both the benefits and flaws in their construction, and they are readily available for attackers looking to find exploits.
Once applications or components are brought in from the developer community, there should be a standard battery of tests to confirm their compliance and safety. During development, these tests should be automatic: developer identification, a standards compliance checker, a security scanner (viruses, exploits), and a best practices checker. This round of checks is just a first layer of testing that should quickly and verbosely inform developers whether their submission meets the minimums for inclusion. These same checks should have already been performed by the developers using the tools provided by the integrator. Ideally, there is a subsequent stage in which additional checks are performed on a test bed. Modularity enables these checks.
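The first layer of automatic checks could be wired together roughly as follows. This is a sketch under stated assumptions: the submission is a plain dictionary, and the developer registry, required interfaces, banned calls, and scanner findings are all hypothetical placeholders for real tools.

```python
from typing import NamedTuple

KNOWN_DEVELOPERS = {"dev-a", "dev-b"}        # hypothetical developer registry
REQUIRED_INTERFACES = {"init", "shutdown"}   # hypothetical standard
BANNED_CALLS = {"eval", "exec"}              # hypothetical best-practice rule


class CheckResult(NamedTuple):
    name: str
    passed: bool
    detail: str


def check_identity(sub):
    ok = sub["developer"] in KNOWN_DEVELOPERS
    detail = "known developer" if ok else f"unknown developer {sub['developer']!r}"
    return CheckResult("developer identification", ok, detail)


def check_standards(sub):
    missing = sorted(REQUIRED_INTERFACES - set(sub["interfaces"]))
    detail = "missing interfaces: " + ", ".join(missing) if missing else "all required interfaces present"
    return CheckResult("standards compliance", not missing, detail)


def check_security(sub):
    # Stand-in for a real scanner: findings are assumed to arrive precomputed.
    findings = sub["scanner_findings"]
    detail = "; ".join(findings) if findings else "no findings"
    return CheckResult("security scan", not findings, detail)


def check_best_practices(sub):
    used = sorted(BANNED_CALLS & set(sub["calls"]))
    detail = "banned calls: " + ", ".join(used) if used else "no banned calls"
    return CheckResult("best practices", not used, detail)


CHECKS = [check_identity, check_standards, check_security, check_best_practices]


def run_admission_checks(sub):
    """Run every check without stopping early, so all failures are reported at once."""
    results = [check(sub) for check in CHECKS]
    return all(r.passed for r in results), results
```

Running all checks instead of stopping at the first failure is a deliberate choice here: it gives the developer the verbose, one-pass feedback the text calls for.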
In the long term, a plan must be in place for sustaining open system standards and processes. This plan may require a reserve of funds or longer-term budgeting. Standards in open systems often become obsolete and must therefore be upgraded or replaced. As standards age and are replaced, their communities shrink, and the standards drift into static documents, which ironically can result in vendor lock-in, thereby defeating a key benefit of open systems!
Processes are often included in the sustainment plan to account for cyber issues, but these processes often miss other standards changes. Procurers should define in contracts the funding and responsibilities for changes, to avoid the "we aren't paid to fix this" complaint. To stay open and abreast of recent developments, acquirers and integrators of OSA must account for standards updates and implementation changes.
3. Trust No Box: Defining Adoption
Trust is a key element of using modules or applications developed by diverse parties in an open system. Nothing that runs on an open system can be trusted, whether it runs in a virtual machine or is connected to the system from outside. All access and use must therefore be controlled. Adoption of elements from an open community must follow a defined process and take place in a defined container. The more modular a system is, the more open it may be. The more open the system, the more trust of the community becomes an issue. The parameters of use and access should be defined so that virtual entities and human entities can know permissions for behavior and access for each service, application, module, user, and element of the system. Developer modules should be cordoned off in a virtual box, allowed only permitted interfaces, and permitted access to processing and network resources as defined.
In many open systems, cordoning of modules is done using a virtual machine structure; virtual machines are used this way in DevOps and in cloudlets, for example. The module (service or application) runs only in a well-defined space, with well-known processing access limits, memory location and access limits, and controlled access to internal or external communications via defined and controlled protocols and ports. The module should access only what the integrator allows it to access. Resources should be accessed only via the interfaces defined in the space and limited via resources defined for the space. Instances are cordoned to just the box. Only the integrator-defined space should be allowed to instantiate, create, or destroy boxes.
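The cordoning rules above amount to a default-deny policy per box. The sketch below shows one way to express that, assuming a hypothetical `BoxPolicy` record maintained by the integrator; the field names and the shape of an access request are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxPolicy:
    """Limits the integrator defines for one module's box (hypothetical)."""
    module: str
    allowed_interfaces: frozenset
    allowed_ports: frozenset
    memory_limit_mb: int


@dataclass(frozen=True)
class AccessRequest:
    module: str
    interface: str
    port: int
    memory_mb: int


def is_permitted(request, policies):
    """Default-deny: allow only what the module's policy explicitly permits."""
    policy = policies.get(request.module)
    if policy is None:
        return False  # no box was defined for this module
    return (
        request.interface in policy.allowed_interfaces
        and request.port in policy.allowed_ports
        and request.memory_mb <= policy.memory_limit_mb
    )
```

Because the policy objects are frozen and only the integrator's code builds the `policies` mapping, a module cannot widen its own box.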
Another key capability is to provide a monitoring package that observes the box at run time and alerts on and logs any activities that are not permitted. Ideally, if a module misbehaves, the overall system should either apply further restrictions on the module and raise an alert, or kill all processes in the module. Some open systems leave failure modes to user discretion; some just silently end the module and notify. A developer with a history of misbehaving modules should be disqualified for a period rather than permanently, because some misbehavior is due to poor testing or a misunderstood off-the-shelf component. A hint of malicious intent, however, leads to ostracism. Modules cannot be allowed to become platforms for launching or managing assaults.
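One possible escalation scheme for such a monitor is sketched below: log every activity that is not permitted, restrict the module on early violations, and kill it once a threshold is reached. The `BoxMonitor` class, the action names, and the threshold are assumptions for illustration; a real system would choose its own failure modes.

```python
from collections import Counter


class BoxMonitor:
    """Hypothetical run-time monitor for module boxes."""

    def __init__(self, kill_threshold=3):
        self.kill_threshold = kill_threshold
        self.violations = Counter()  # module -> count of violations so far
        self.log = []                # (module, activity, action) records

    def observe(self, module, activity, permitted):
        """Return the action taken: 'allow', 'restrict', or 'kill'."""
        if activity in permitted:
            return "allow"
        # Every activity that is not permitted is counted and logged.
        self.violations[module] += 1
        if self.violations[module] >= self.kill_threshold:
            action = "kill"
        else:
            action = "restrict"
        self.log.append((module, activity, action))
        return action
```

The per-module violation count doubles as the history the integrator needs when deciding whether a developer's modules should be suspended for a period.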
Logging is not optional, and log mining should be a routine part of the improvement process for any integrator and any good developer. Likewise, instances of virtual machines and users of virtual machines must authenticate before using system resources. Rogue virtual machines, or boxes, must be controlled or killed. Processes can become zombies, consuming performance even if no ill intent is present, and a process log will show such consumers. In any event, the boxes must fall within the rule set for the system, and all activity must be recorded. Logs can be used to improve systems, improve performance, and determine the source in the ecosystem of weak or suspect elements.
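A simple form of log mining is sketched below. It assumes a hypothetical log format in which each record is a dictionary with `box`, `developer`, and `event` keys, plus `cpu_percent` and `requests_served` on usage records; the threshold for flagging a zombie is likewise an assumption.

```python
from collections import defaultdict


def mine_logs(records, idle_cpu_threshold=5.0):
    """Summarize violations per developer and flag likely zombie boxes.

    A box is flagged as a zombie when a usage record shows it consuming
    more CPU than the threshold while serving no requests.
    """
    violations = defaultdict(int)
    zombies = set()
    for record in records:
        if record["event"] == "violation":
            violations[record["developer"]] += 1
        elif record["event"] == "usage":
            idle = record.get("requests_served", 0) == 0
            busy = record.get("cpu_percent", 0.0) > idle_cpu_threshold
            if idle and busy:
                zombies.add(record["box"])
    return dict(violations), sorted(zombies)
```

The two outputs map onto the two uses of logs named above: the zombie list points at performance drains, and the per-developer counts point at the source of weak or suspect elements.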
Encrypting storage inside boxes and on common resources is another measure to control security. Likewise, encrypting communications between boxes may in some instances improve security at the cost of overhead. Performance must be balanced against the security needs of the overall system.
4. Let Them Know About the Sharks and Alligators
Boards and blogs moderated by the integrator of an open system, linked to blogs and discussions for the protocols and standards used, can let developers know what to avoid as they develop, and allow integrators to learn of issues and exploits. While reluctance to release vulnerabilities is understandable, trusted developers should be aware of issues that may affect the security of their own builds for the open system as integrated. A healthy community might also involve in-person forums and discussions on security and cyber topics. Use these forums to spot adversaries and to short-circuit social exploits.
Open systems have tradeoffs in the cyber realm, due to their very nature. A large ecosystem allows not only the potential for misbehavior, but also opportunities for more diverse inclusion and additional concepts. Measures can be taken to mitigate issues, while capitalizing on the benefits. Know your community, know the group's limitations, and benefit from the ecosystem's strengths.
Additional resources:
- Presentation: Open Systems - What's Old is New Again
- SEI special report: Development of an Intellectual Property Strategy: Research Notes to Support Department of Defense Programs
- SEI technical report: A Decision Framework for Selecting Licensing Rights for Noncommercial Computer Software in the DoD Environment
- DevOps blog post: Container Security in DevOps
- Department of Homeland Security report: Recommended Practice: Improving Industrial Control Systems Cybersecurity with Defense-In-Depth Strategies
- Department of Defense description and tutorial for Program Managers on Open Systems
- Air Force Magazine article (November 2015): Open Missions Systems for the F-22 and F-35