Posted in High-Performance Computing
The power and speed of computers have increased exponentially in recent years. Computer architectures, however, are now moving away from single-core and multi-core (homogeneous) central processing units (CPUs) toward many-core (heterogeneous) CPUs. This blog post describes research I've undertaken with my colleagues at the Carnegie Mellon Software Engineering Institute (SEI)--including Jonathan Chu and Scott McMillan of the Emerging Technology Center (ETC) as well as Alex Nicoll, a researcher with the SEI's CERT Division--to create a software library that can exploit the heterogeneous parallel computers of the future and allow developers to create systems that are more efficient in terms of computation and power consumption.
As we look at computing trends in data centers available to developers, the move towards many-core heterogeneous CPUs shows no sign of abating. Most computers, including smartphones and other mobile devices, now contain heterogeneous hardware with multi- and many-core chips. Many existing software libraries, frameworks, and patterns, however, were not developed for large-memory, many-core, heterogeneous computing environments. Since software developers often aren't accustomed to or trained in writing software for many-core architectures, new hardware architectures aren't being used to their full potential.
Complicating matters further, many common software libraries for these environments are designed for efficient, optimal computation rather than for ease of use. Getting the most out of them requires training in parallel programming and parallel algorithms that most software developers haven't received.
Foundations in Moore's Law
Our research approach traces its foundations back to Moore's Law. Many software and systems engineers paraphrase Moore's Law as stating that processor speed doubles every two years. Moore's Law is actually about transistor density: the number of transistors on an integrated circuit doubles roughly every two years (a figure often quoted as 18 months). In recent years, however, CPU manufacturers have focused less on clock speed and more on multi-core and special-purpose cores with deeper memory architectures.
"The performance of individual computer processors increased on the order of 10,000 times over the last two decades of the 20th century without substantial increases in cost and power consumption," noted Samuel Fuller and Lynette Millett in The Future of Computing Performance. Fuller and Millett add, "Future growth in computing performance will have to come from parallelism. Most software developers today think and program by using a sequential programming model to create software for single general-purpose microprocessors."
From Gaming to High-Performance Computing: Graph Analytics for Everyday Users
While heterogeneous, multi-core architectures were once largely seen in gaming systems, the high-performance computing (HPC) community has also migrated to them. These architectures allow for high-level computations, as well as three-dimensional physics simulations. Although HPC is still a specialty field, the architectures seen in HPC systems will soon be widely available to everyday users.
In June of this year, the International Supercomputing Conference released the Top500 supercomputer list. According to an article on the Top500 list posted on AnandTech, "the number of hybrid systems is actually down--from 62 on the last list to 54 now--but on the other hand the continued improvement and increasing flexibility of GPUs and other co-processors, not to mention the fact that now even Intel is involved in this field, means that there's more effort than ever going into developing these hybrid systems."
One phase of our research involves using HPC architectures to simulate future computer architectures and develop software libraries, best practices, and patterns that can be used by a broad community of software developers. Initially, we limited our focus to graph analytics, that is, algorithms that operate on graphs. Graph data typically lacks locality of reference (connected vertices can sit far apart in memory), which makes operations on graphs hard to parallelize.
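To make the locality problem concrete, here is a minimal sketch in Python. It is an illustration only, not code from our library: the 8-vertex graph, the compressed sparse row (CSR) layout, and the helper names `build_csr` and `access_strides` are all assumptions chosen for the example. It compares how far apart consecutive memory accesses land when we visit each vertex's neighbors versus when we scan an array sequentially.

```python
def build_csr(num_vertices, edges):
    """Build CSR arrays (row_offsets, column_indices) for an undirected graph."""
    neighbors = [[] for _ in range(num_vertices)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    row_offsets = [0]
    column_indices = []
    for adj in neighbors:
        column_indices.extend(sorted(adj))
        row_offsets.append(len(column_indices))
    return row_offsets, column_indices


def access_strides(indices):
    """Distance between consecutive positions touched in a per-vertex array."""
    return [abs(b - a) for a, b in zip(indices, indices[1:])]


# Hypothetical graph whose edges connect "distant" vertex IDs.
edges = [(0, 7), (0, 4), (1, 6), (2, 5), (3, 7), (1, 4), (2, 6)]
row_offsets, column_indices = build_csr(8, edges)

# Visiting neighbors in vertex order touches per-vertex data (for example,
# a "visited" flag array) at the positions listed in column_indices.
graph_strides = access_strides(column_indices)
scan_strides = access_strides(list(range(len(column_indices))))

print("sequential scan strides:", scan_strides)   # every stride is 1
print("neighbor access strides:", graph_strides)  # irregular jumps
```

On a graph this small the jumps are modest, but on graphs with millions of vertices the same pattern defeats caches and makes it hard to hand each core a contiguous, independent chunk of work.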
Graph analytics are widely used in government, commerce, and science and can highlight relationships that might otherwise be obscured in the data. One example of a system that can be represented as a graph is a social network, where the vertices are people and the edges are the social relationships that connect them.
For a reference platform, we are relying on the Graph 500, an international benchmark started in 2010 that rates how fast HPC systems test, traverse, and navigate a graph. The Graph 500 is similar to the Top500 referenced above, but it is specifically designed to test graph algorithms because they are fundamentally different from easily parallelizable algorithms. We are using the algorithms defined by the Graph 500 as our starting point.
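The Graph 500 search kernel is a breadth-first search (BFS), and results are reported in traversed edges per second (TEPS). The sketch below shows the idea in Python on a tiny, made-up graph. It is not the benchmark's reference implementation: the function name `bfs_parents` and the sample adjacency lists are assumptions, and it counts adjacency-list visits, which is a simplification of how the benchmark counts edges.

```python
import time
from collections import deque


def bfs_parents(adjacency, root):
    """Breadth-first search from root.

    Returns (parents, edges_traversed), where parents maps each reachable
    vertex to the vertex it was discovered from.
    """
    parents = {root: root}  # by convention here, the root is its own parent
    frontier = deque([root])
    edges_traversed = 0
    while frontier:
        u = frontier.popleft()
        for v in adjacency[u]:
            edges_traversed += 1
            if v not in parents:
                parents[v] = u
                frontier.append(v)
    return parents, edges_traversed


# Hypothetical graph; vertex 5 is not reachable from vertex 0.
adjacency = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [3],
    5: [],
}

start = time.perf_counter()
parents, edges_traversed = bfs_parents(adjacency, 0)
elapsed = time.perf_counter() - start

# Simplified throughput figure in the spirit of TEPS.
teps = edges_traversed / elapsed if elapsed > 0 else float("inf")
print(parents)
print(f"{edges_traversed} edge visits, roughly {teps:.0f} per second")
```

The parent map is what makes the result checkable: following parents from any reached vertex must lead back to the root.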
Identifying and Validating Design Patterns
With the understanding that the development of patterns is primarily a bottom-up endeavor, we initially focused on reviewing patterns developed for homogeneous HPC systems. These patterns are drawn from those developed by ETC researchers, as well as our collaborators in government, academia, and industry.
Validation is a critical process in this phase, and we are using two independent technical validation mechanisms.
The next phase of our work will focus on culling the homogeneous HPC patterns that we audited to develop a library of templates and patterns that software developers, architects, and technology planners will use to effectively access and exploit future computing architectures. As stated previously, better utilization of the hardware means faster computation and possibly lower power consumption.
A Collaborative Approach
In this early phase of our work, we have been collaborating with researchers at Indiana University's Extreme Scale Computing Lab, which developed the Parallel Boost Graph Library. In particular, we are working with Andrew Lumsdaine, who serves on the Graph 500 Executive Committee and is considered a world leader in graph analytics.
Addressing the Challenges Ahead
We recognize that even if we achieve all of our milestones, our research will not yield a silver bullet. Programmers need niche skills to address some of the problems of multiple architectures. Our approach focuses on reducing the time that developers spend fighting the hardware of heterogeneous architectures so that they can spend it on the problem they are actually trying to solve.
This challenge problem aligns with the mission of the Emerging Technology Center (ETC), which is to promote government awareness and knowledge of emerging technologies and their application and to shape and leverage academic and industrial research. We hope our research will enable programmers in government and industry to use the library of templates and design patterns that we develop to produce effective software for future computing systems. Our next step involves releasing this library to our stakeholders in the DoD, and to other users, via an open-source platform to enable them to effectively access and exploit future computing architectures.
While our initial phase of research focused on graph analytics on graphics processing units (GPUs), we plan to develop a library that separates the concerns of graph analysis from the details of the underlying hardware architecture. Future work will continue to focus on graphs but will add new hardware architectures to the mix, such as field-programmable gate arrays (FPGAs) and, potentially, distributed platforms.
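One way to picture this separation of concerns is the Python sketch below. It is a hypothetical design for illustration, not the interface of the library we are building: the names `Backend`, `SerialCpuBackend`, `ThreadedBackend`, and `bfs_levels` are all assumptions, and the threaded backend is only a stand-in for a real accelerator such as a GPU or FPGA. The graph algorithm is written once against a small interface, and each hardware target supplies its own implementation of that interface.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class Backend(ABC):
    """Hypothetical hardware abstraction: how to expand a BFS frontier."""

    @abstractmethod
    def expand(self, adjacency, frontier, visited):
        """Return the set of not-yet-visited neighbors of the frontier."""


class SerialCpuBackend(Backend):
    def expand(self, adjacency, frontier, visited):
        next_frontier = set()
        for u in frontier:
            for v in adjacency[u]:
                if v not in visited:
                    next_frontier.add(v)
        return next_frontier


class ThreadedBackend(Backend):
    """Stand-in for an accelerator: expands frontier vertices concurrently."""

    def __init__(self, workers=2):
        self.workers = workers

    def expand(self, adjacency, frontier, visited):
        def unvisited_neighbors(u):
            return {v for v in adjacency[u] if v not in visited}

        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            results = pool.map(unvisited_neighbors, frontier)
            return set().union(*results)


def bfs_levels(adjacency, root, backend):
    """Hardware-independent BFS: returns {vertex: distance from root}."""
    levels = {root: 0}
    frontier = {root}
    depth = 0
    while frontier:
        depth += 1
        frontier = backend.expand(adjacency, frontier, levels)
        for v in frontier:
            levels[v] = depth
    return levels


# Hypothetical graph used only for this example.
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
serial_levels = bfs_levels(adjacency, 0, SerialCpuBackend())
threaded_levels = bfs_levels(adjacency, 0, ThreadedBackend())
print(serial_levels == threaded_levels)
```

In this arrangement `bfs_levels` never changes when a new platform is added. Only a new `Backend` subclass is needed, which is the kind of division of labor we are aiming for.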
If you are interested in collaborating with us on this research, please leave a comment below or send an email to email@example.com.
For more information about the Emerging Technology Center, please visit
Visit the SEI Digital Library for other publications by Eric.