
Improving the Reliability of Expert Opinion within Early Lifecycle Cost Estimation

Robert Stoddard

As part of our research related to early acquisition lifecycle cost estimation for the Department of Defense (DoD), my colleagues in the SEI's Software Engineering Measurement & Analysis initiative and I began envisioning a potential solution that would rely heavily on expert judgment of future possible program execution scenarios. Before our work on cost estimation, many parametric cost models required domain expert input, but, in our opinion, they did not address alternative scenarios of execution that might occur from Milestone A onward.

Our approach, known as Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE), asks domain experts to provide judgment not only on uncertain cost factors for a nominal program execution scenario, but also for the drivers of cost factors across a set of anticipated scenarios. This blog post describes our efforts to improve the accuracy and reliability of expert judgment within this expanded role of early lifecycle cost estimation.

Our work in cost estimation began two years ago, building upon a review of existing cost estimation and expert judgment research. As an example, we identified an industry consultant, Douglas Hubbard, whose book, How to Measure Anything, presents an approach known as "calibrating your judgment" (my colleague, Dave Zubrow, describes Hubbard's technique in a recent blog post). Hubbard's focus on calibrating expert judgment using "trivial pursuit" exercises led to our team's decision to pursue research into the use of domain-specific reference points to further improve the accuracy and reliability of expert judgment within cost estimation.

Our research on early lifecycle cost estimation for the DoD consists of two tasks:

  1. Development of the QUELCE method, in which the probability of changes occurring in program execution is separated from the final assessment of the effects of such changes on the program cost estimate
  2. Critical thinking and designed experiments that would contribute to current research on expert judgment

We hypothesized that a domain-specific approach to calibration training and development of reference points would be necessary to reduce unwanted variation in judgments rendered by experts participating in the QUELCE method. We decided to take a two-pronged approach to improving expert opinion. The first part of the approach involved data mining of DoD program execution experience. The second part of the approach interviewed DoD experts about cost estimation and DoD program cost experience. One of our goals is to create an online repository of domain reference points that embodies the historical DoD program cost experience.

The repository will include a searchable database of reference points that helps domain experts exercise better judgment during cost estimation. Domain experts will be able to query the reference points by keyword. Search results will show the key reference points in relation to the domain and technology challenge. The domain expert(s) will then review those reference points before formulating judgment for the current cost estimation exercise. At this point in the project, we are mining reference points from DoD and other open-source data examples.
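The keyword query described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `ReferencePoint` fields, the `search` function, and the three sample records are hypothetical placeholders, not the repository's actual schema or real DoD program data.

```python
from dataclasses import dataclass


@dataclass
class ReferencePoint:
    """Hypothetical record layout; the real repository schema is not published here."""
    program: str
    domain: str
    change_driver: str
    summary: str


def search(points, keywords):
    """Return reference points ranked by how many distinct keywords they match."""
    terms = [k.lower() for k in keywords]
    scored = []
    for point in points:
        text = " ".join(
            [point.program, point.domain, point.change_driver, point.summary]
        ).lower()
        hits = sum(1 for term in terms if term in text)
        if hits:
            scored.append((hits, point))
    # Stable sort: records with equal hit counts keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [point for _, point in scored]


# Invented placeholder records, not real program history.
points = [
    ReferencePoint("Program X", "satellite communication", "Obsolescence",
                   "A part became obsolete before fielding."),
    ReferencePoint("Program Y", "ground vehicle", "Funding",
                   "Funding was reduced with little warning."),
    ReferencePoint("Program Z", "satellite communication", "Funding",
                   "Funding cut delayed the satellite payload."),
]

results = search(points, ["satellite", "funding"])
```

A production repository would more likely sit on a full-text search engine; the simple substring match here only shows the shape of the interaction between an expert's query and the stored reference points.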

My colleague, James McCurley, has investigated DoD repositories for raw information that outlined why acquisition programs experience cost and schedule overruns. Our team compiled domain reference points from McCurley's data that identify selected changes associated with cost and schedule overruns. We are categorizing these changes into a set of common change drivers that are rooted in the various sources we have accessed.

One of the first sources we accessed was the U.S. Navy's Probability of Program Success (PoPS) program. The PoPS criteria came from studies of program performance used by the Navy to implement a step-by-step approval process for a program to continue, independent of--but aligned to--the DoD acquisition process. PoPS identified a number of categories of reasons for cost and schedule overruns in government programs.

PoPS was always seen as just one of many sources of programmatic factors that might provide information useful to QUELCE. The PoPS criteria are biased toward programmatic change issues (such as sponsorship, contractor performance, and program office performance) that are of primary concern to DoD sponsors and Program Executive Offices. As we expected when we started this project, we are finding the need to supplement PoPS with more technical change issues, such as those related to system engineering and integration factors.

Many technical change drivers may be seen in the Capability Based Assessment (CBA) activity performed by programs in preparation for the Milestone A decision. The CBA includes the Functional Area Analysis (FAA), Functional Needs Analysis (FNA), and Functional Solution Analysis (FSA). Another early source is the Analysis of Alternatives (AoA). These and other early documents often include information that identifies technical and programmatic uncertainties not captured in the cost estimation process, but which can be incorporated as program change drivers in the QUELCE method. Consequently, many technical change drivers are rooted in artifacts that proposed programs must draft prior to Milestone A.

Examples of change drivers we've identified from various sources include:

  • Interoperability - a program is affected by changes from a dependent program
  • Contractor Performance - a subcontractor must be replaced
  • Obsolescence - a part is made obsolete before a program is operational
  • Technical Performance - either a technology is not ready for use or a technology fails to achieve key performance goals
  • Scope - the source of many changes, including new users, additional delivery targets, and extra platforms, all of which fall outside the realm of "code growth"
  • Funding - funding may be increased or decreased in DoD programs, often with little warning

Applying Our Expert Opinion Approach

In the coming year, our goal is to develop a database with information that supports experts implementing the QUELCE method. We will publish our approach to improving expert judgment and to increasing and structuring experts' involvement in cost estimation using procedures similar to Team Software Process (TSP) scripts. Our goal is to ensure that domain experts have more active involvement with the cost estimation activity. The problem today is that domain expert results are often loosely coupled with the cost estimates. In contrast, QUELCE will facilitate domain experts systematically discussing change drivers and then mapping the change drivers explicitly to the cost driver inputs of traditional cost estimating models and cost estimating relationships (CERs).

In our approach, the domain expert will be prompted at different points throughout QUELCE to access the reference point database. These activities will consist of just-in-time virtual training for calibration within a given domain. For example, a domain expert may be participating in QUELCE to develop a cost estimate for a new communication system that involves satellite technology. If the domain expert has not recently completed virtual calibration training for that domain, he or she may receive a refresher course consisting of a two- to four-hour online exercise.

Our approach to improving expert opinion will help domain experts during the following three different points in the QUELCE method that depend significantly on expert judgment:

  1. Identifying pertinent change drivers. After completing the training, domain experts will be asked to participate in a workshop exercise that anticipates which change drivers will most likely be relevant to a particular program. In the workshop, domain experts will query for communication programs or specific technology names related to particular programs. In the example involving the new communication system above, the results should yield information related to historical communication programs or technologies and domain reference points, explaining why certain aspects went over budget or schedule.
  2. Populating the change driver cause-and-effect matrix. The second judgment point involves a change driver cause-and-effect matrix. The domain expert will evaluate each change driver and rate the probability, on a scale of 0 to 3, that the change driver will cause any other change drivers on the list to switch from a nominal to an off-nominal condition, thereby signaling the danger of cost and schedule overruns. This exercise requires judgment about the relationships between change drivers. The domain expert will get information from querying our repository before rendering this type of judgment. For example, reference points might include historical information about a change driver going off-nominal and subsequently causing three other change drivers to go off-nominal. The reference points therefore give the domain expert a basis to understand the relationships between change drivers and help make those judgments more accurate.
  3. Establishing probabilities for the Bayesian Belief Network (BBN). The BBN models the change drivers as nodes in a quantitative network, including probabilities that state changes in one node will create a state change in another node. Every change driver has a parent table that presents all the possible scenarios resulting from different combinations of its parent change driver states. For example, in our BBN, we have change driver A and change driver B, and both have an influence on change driver C. If change driver A has nominal and off-nominal states and change driver B has nominal and off-nominal states, there are four different combinations of parent change driver states, i.e., scenarios that may affect change driver C.
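The second and third judgment points can be illustrated with a small sketch. It is not the QUELCE tooling: the driver names, the 0-to-3 ratings, and every probability below are invented values, the function names are hypothetical, and the marginal calculation assumes that the parent change drivers A and B are independent of each other, which a real BBN would not require.

```python
from itertools import product

STATES = ("nominal", "off-nominal")

# Hypothetical cause-and-effect matrix: ratings[i][j] is an expert's 0-3
# rating that driver i pushes driver j off-nominal (illustrative values).
drivers = ["Funding", "Scope", "Technical Performance"]
ratings = [
    [0, 2, 1],
    [1, 0, 3],
    [0, 1, 0],
]


def influence_totals(names, matrix):
    """Sum each row to see how strongly each driver acts as a cause."""
    return {name: sum(row) for name, row in zip(names, matrix)}


def parent_scenarios(parents):
    """Enumerate every combination of parent states for one child node."""
    return [
        dict(zip(parents, combo))
        for combo in product(STATES, repeat=len(parents))
    ]


def prob_off_nominal(parent_probs, table):
    """Marginal P(child off-nominal), ASSUMING independent parents.

    parent_probs: {parent: P(parent is off-nominal)}
    table: {(state_of_parent_1, state_of_parent_2, ...):
            P(child off-nominal | those parent states)}
    """
    parents = list(parent_probs)
    total = 0.0
    for combo in product(STATES, repeat=len(parents)):
        weight = 1.0
        for parent, state in zip(parents, combo):
            p = parent_probs[parent]
            weight *= p if state == "off-nominal" else 1.0 - p
        total += weight * table[combo]
    return total


totals = influence_totals(drivers, ratings)
scenarios = parent_scenarios(["A", "B"])  # the four scenarios affecting C

# Invented probabilities for change driver C given its parents A and B.
table_c = {
    ("nominal", "nominal"): 0.1,
    ("nominal", "off-nominal"): 0.4,
    ("off-nominal", "nominal"): 0.5,
    ("off-nominal", "off-nominal"): 0.9,
}
p_c = prob_off_nominal({"A": 0.2, "B": 0.5}, table_c)
```

With two parents of two states each, `parent_scenarios` yields the four combinations described in step 3, and each one needs an expert-supplied probability in the table. The row totals from the matrix give one rough way to see which drivers act most strongly as causes.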

While our work to date has focused on calibrating expert judgment in the DoD cost estimation of program development, our approach could be applied to many situations beyond cost estimation. We envision this approach being used in domains such as portfolio management and strategic planning.


Our research into the QUELCE method for pre-Milestone A cost estimation represents a significant advance by enabling the modeling of uncertain program execution scenarios that are dramatically different from the traditional cost factor inputs of cost estimation models currently employed later in the DoD acquisition lifecycle. By combining proven methods such as scenario planning workshops, cause-effect matrices, BBNs, and Monte Carlo simulation, we have created a novel and practical method for early DoD acquisition lifecycle cost estimation. If you're interested in helping us succeed in these efforts, please let us know by leaving a comment below.

Additional Resources

To read the SEI technical report, Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE) please visit

For more information about Milestone A, please see the Integrated Defense Life Cycle Chart for a picture and references in the "Article Library."
