Automating Container Minimization for the Edge
Created November 2024 • Updated March 2025
In the field, warfighters depend on secure, up-to-date software to support their strategic missions. Software packages are typically delivered in large, resource-demanding containers, but devices in the field often have limited resources available. To ease software deployment for Department of Defense (DoD) organizations in edge environments, the SEI has developed an algorithm that slims down containers so that they require less storage and bandwidth. This allows DoD organizations to field more capability per unit of size, weight, and power (SWaP) and to deploy it faster, while also reducing software vulnerabilities.
The Challenge of Deploying and Using Demanding Containers in a Limited Environment
The DoD wants to use containers to support its vision of a cloud-to-edge continuum in which capabilities packaged as containers are pushed from the cloud to edge devices to support localized data processing. The tactical edge environment presents many challenges, including
- limits in storage space and computing power
- denied, degraded, intermittent, and limited-bandwidth (DDIL) networks
- high likelihood of bad actors trying to tamper with devices
Because container images must be self-contained, with all application dependencies included, their use can clash with the constraints of edge environments. Images are often significantly larger than they need to be, with much of their size wasted on unused or duplicated files. Such containers require more transfer bandwidth and take a greater toll on device storage and the edge network. When containers demand more size, weight, and power than edge conditions can provide, new capabilities cannot be deployed. In addition, the larger a container is, the more vulnerabilities it tends to contain and the larger the attack surface available for adversaries to exploit.
To address these challenges, the SEI created an automated Container Minimization Tool (CMT) to minimize the storage size of a set of container images. This technology reduces storage waste without negatively impacting functionality and advances the state of the art in deduplication across container images.
A Greedy Algorithm That Prunes and Deduplicates
There are two main sources of storage waste in container images: unused files (such as development files) and duplicated files (identical files that are stored in different layers). The CMT automates the process of pruning and deduplicating. Pruning removes unnecessary or unused files, and deduplicating combines files shared by multiple images into a common container layer. The applications in the minimized images should run exactly as they did before pruning and deduplication.
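The two steps can be illustrated with a minimal Python sketch. It is not the CMT's implementation: images are modeled as plain dictionaries of path to file contents, the `used` sets stand in for whatever analysis identifies unused files, and all function names (`prune`, `build_layers`, `rebuild`, `stored_bytes`) are hypothetical. Files count as identical when both path and SHA-256 digest match, and each group of identical files is placed in one layer shared by exactly the images that contain it.

```python
import hashlib
from collections import defaultdict


def prune(files, used_paths):
    """Keep only the files whose paths appear in used_paths."""
    return {path: data for path, data in files.items() if path in used_paths}


def build_layers(images):
    """Group identical files by the exact set of images that contain them.

    images:  dict mapping image name -> {path: bytes}
    returns: dict mapping frozenset of image names -> {path: bytes}
    """
    owners = defaultdict(set)   # (path, digest) -> names of images holding it
    contents = {}               # (path, digest) -> file contents
    for name, files in images.items():
        for path, data in files.items():
            key = (path, hashlib.sha256(data).hexdigest())
            owners[key].add(name)
            contents[key] = data

    layers = defaultdict(dict)
    for key, names in owners.items():
        path = key[0]
        # Each file is stored once, in the layer shared by all of its owners.
        layers[frozenset(names)][path] = contents[key]
    return dict(layers)


def rebuild(image_name, layers):
    """Reassemble one image from every layer that lists it as an owner."""
    files = {}
    for names, layer_files in layers.items():
        if image_name in names:
            files.update(layer_files)
    return files


def stored_bytes(file_maps):
    """Total bytes stored across a collection of {path: bytes} maps."""
    return sum(len(data) for files in file_maps for data in files.values())


# Hypothetical example data: three images sharing one library.
images = {
    "A": {"/lib/base.so": b"x" * 1000, "/app/a": b"service a",
          "/usr/include/dev.h": b"unused header"},
    "B": {"/lib/base.so": b"x" * 1000, "/app/b": b"service b"},
    "C": {"/lib/base.so": b"x" * 1000, "/app/c": b"service c"},
}
# Assumed to come from some external usage analysis.
used = {
    "A": {"/lib/base.so", "/app/a"},
    "B": {"/lib/base.so", "/app/b"},
    "C": {"/lib/base.so", "/app/c"},
}

pruned = {name: prune(files, used[name]) for name, files in images.items()}
layers = build_layers(pruned)
print(stored_bytes(pruned.values()), "bytes before deduplication")
print(stored_bytes(layers.values()), "bytes after deduplication")
```

Rebuilding each image from its layers returns exactly the pruned file set, which mirrors the requirement that applications behave the same after minimization. Grouping by exact owner set can produce many small layers, which is why a practical tool also has to weigh the number of layers.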
For the deduplication process, the CMT breaks up a set of container images into their individual files, reorganizes the layers, and reproduces a set of images and layers. Ultimately, this process reduces the storage and network costs of transferring these container images from the cloud to the edge. The algorithm considers operational costs (the cost of too many layers), storage costs (the cost of duplicate files in layers), and network costs (the cost of too few layers).
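The tradeoff among the three costs can be made concrete with a small Python sketch. This is one possible reading of the cost terms, not the CMT's actual objective function: the function name `layering_cost`, the unit weights, and the exact definitions (layer references for operational cost, total bytes stored for storage cost, bytes pulled per image for network cost) are all assumptions made for illustration.

```python
def layering_cost(layers, image_layers, w_op=1.0, w_storage=1.0, w_network=1.0):
    """Score one candidate layering under an assumed three-term cost model.

    layers:       dict mapping layer id -> {path: size in bytes}
    image_layers: dict mapping image name -> list of layer ids it is built from
    """
    layer_size = {lid: sum(files.values()) for lid, files in layers.items()}

    # Operational: every layer an image references must be tracked and managed.
    operational = sum(len(ids) for ids in image_layers.values())
    # Storage: bytes held in the registry; duplicated files are counted twice.
    storage = sum(layer_size.values())
    # Network: bytes a device pulls per image, including files in coarse
    # layers that the image does not actually need.
    network = sum(layer_size[lid] for ids in image_layers.values() for lid in ids)

    total = w_op * operational + w_storage * storage + w_network * network
    return {"operational": operational, "storage": storage,
            "network": network, "total": total}


# Hypothetical file sizes: a 100-byte shared base and two 10-byte applications.
per_image = layering_cost(
    {"LA": {"/base": 100, "/a": 10}, "LB": {"/base": 100, "/b": 10}},
    {"A": ["LA"], "B": ["LB"]},
)
shared_base = layering_cost(
    {"L0": {"/base": 100}, "LA": {"/a": 10}, "LB": {"/b": 10}},
    {"A": ["L0", "LA"], "B": ["L0", "LB"]},
)
single_layer = layering_cost(
    {"L": {"/base": 100, "/a": 10, "/b": 10}},
    {"A": ["L"], "B": ["L"]},
)
for label, cost in [("per-image", per_image), ("shared base", shared_base),
                    ("single layer", single_layer)]:
    print(label, cost)
```

Under these assumed definitions, one layer per image duplicates the base in storage, a single merged layer forces each image to pull the other's files, and a shared base layer avoids both at the price of more layers to manage. Different weights would shift which layering scores best.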
The SEI ran time-profiling and deduplication experiments on real sets of images. These experiments measured the effect of the number of files on the time the algorithm needed to deduplicate as well as the algorithm’s ability to deduplicate depending on the number of duplicated files. Test cases included ClearML, a machine-learning platform tool frequently used for experimentation, and Stan’s Robot Shop, an open source container project.
The results? The deduplication algorithm alone can reduce the storage required for container images and the bandwidth required to pull those images by 5–15% for multi-container deployments. When combined with pruning unused files, the deduplication algorithm can reduce container image storage by 10–30%. In either case, 100% of shared files (files used by two or more images) were deduplicated. The CMT can run these algorithms quickly, processing 10 images with 225,000 files in approximately 51 minutes.

Looking Ahead
Any DoD organization can use the CMT to minimize its containerized software images. Minimized container images yield multiple benefits:
- They require less bandwidth to deploy across DDIL networks.
- They incur lower storage and network costs in cloud container repositories.
- They carry fewer software vulnerabilities, because vulnerabilities residing in unused files are removed along with those files.
Contact us today to learn how the CMT can help your team minimize and secure your container images.