Demonstrating the Practical Utility and Limitations of ChatGPT Through Case Studies

White Paper
In this study, SEI researchers conducted four case studies using GPT-3.5 to assess the practical utility of large language models such as ChatGPT.

What is the practical utility of large language models (LLMs)? To answer this question, SEI researchers conducted four in-depth case studies. In each case study, they used a version of GPT-3.5 to complete a task based on prompts they provided. The case studies described in this paper span multiple domains and call for vastly different capabilities:

  • data science
  • training and education
  • research
  • strategic planning

For each case study, the researchers presented the unaltered transcripts generated through their interactions with ChatGPT, commented on the modes of interaction with ChatGPT, and noted its strengths and limitations in the context of the specific case study.

The researchers found that ChatGPT contributed to the quality of the products generated and expedited their development. However, ChatGPT did not eliminate the need for human involvement: knowledgeable people were still needed to decompose complex tasks into simpler ones that ChatGPT could accomplish and to verify its outputs.