Software Engineering Institute | Carnegie Mellon University

SEI Insights

SEI Blog

The Latest Research in Software Engineering and Cybersecurity

Part one of this series of blog posts on the collection and analysis of malware and storage of malware-related data in enterprise systems reviewed practices for collecting malware, storing it, and storing data about it. This second post in the series discusses practices for preparing malware data for analysis and discuss issues related to messaging between big data framework components.

Citing the need to provide a technical advantage to the warfighter, the Department of Defense (DoD) has recently made the adoption of cloud computing technologies a priority. Infrastructure as code (IaC), the process and technology of managing and provisioning computers and networks (physical and/or virtual) through scripts, is a key enabler for efficient migration of legacy systems to the cloud. This blog post details research aimed at developing technology to help software sustainment organizations automatically recover the deployment baseline and create the needed IaC artifacts with minimal manual intervention and no specialized knowledge about the design of the deployed system. This project will develop technology to automatically recover a deployment model from a running system and create IaC artifacts from that model.

The growth of big data has affected many fields, including malware analysis. Increased computational power and storage capacities have made it possible for big-data processing systems to handle the increased volume of data being collected. In addition to collecting the malware, new ways of analyzing and visualizing malware have been developed. In this blog post--the first in a series on using a big-data framework for malware collection and analysis--I will review various options and tradeoffs for dealing with malware collection and storage at scale.

As part of an ongoing effort to keep you informed about our latest work, this blog post summarizes some recently published SEI reports, podcasts, and presentations highlighting our work in virtual integration, blockchain programming, Agile DevOps, software innovations, cybersecurity engineering and software assurance, threat modeling, and blacklist ecosystem analysis. These publications highlight the latest work of SEI technologists in these areas. This post includes a listing of each publication, author(s), and links where they can be accessed on the SEI website.

Well-known asymmetries pit cyber criminals with access to cheap, easy-to-use tools against government and industry organizations that must spend more and more to keep information and assets safe. To help reverse this imbalance, the SEI is conducting a study sponsored by the U.S. Office of the Director of National Intelligence to understand cyber intelligence best practices, common challenges, and future technologies that we will publish at the conclusion of the project. Through interviews with U.S.-based organizations from a variety of sectors, we are identifying tools, practices, and resources that help those organizations make informed decisions that protect their information and assets. This blog post describes preliminary findings from the interviews we have conducted so far. Our final report, which will include an anonymized look at the cyber intelligence practices of all the organizations we interviewed, will be released after the conclusion of the study in 2019.

For many DoD missions, our ability to collect information has outpaced our ability to analyze that information. Graph algorithms and large-scale machine learning algorithms are a key to analyzing the information agencies collect. They are also an increasingly important component of intelligence analysis, autonomous systems, cyber intelligence and security, logistics optimization, and more. In this blog post, we describe research to develop automated code generation for future-compatible graph libraries: building blocks of high-performance code that can be automatically generated for any future platform.

The size of aerospace software, as measured in source lines of code (SLOC), has grown rapidly. Airbus and Boeing data show that SLOC have doubled every four years. The current generation of aircraft software exceeds 25 million SLOC (MSLOC). These systems must satisfy safety-critical, embedded, real-time, and security requirements. Consequently, they cost significantly more than general-purpose systems. Their design is more complex, due to quality attribute requirements, high connectivity among subsystems, and sensor dependencies--each of which affects all system development phases but especially design, integration, and verification and validation.

Numerous tools exists to help detect flaws in code. Some of these are called flaw-finding static analysis (FFSA) tools because they identify flaws by analyzing code without running it. Typical output of an FFSA tool includes a list of alerts for specific lines of code with suspected flaws. This blog post presents our initial work on applying static analysis test suites in a novel way by automatically generating a large amount of labeled data for a wide variety of code flaws to jump-start static analysis alert classifiers (SAACs). SAACs are designed to automatically estimate the likelihood that any given alert indicates a genuine flaw.