Papers
arxiv:2501.04976

On the Diagnosis of Flaky Job Failures: Understanding and Prioritizing Failure Categories

Published on Jan 9, 2025
Authors:
,
,

Abstract

The continuous delivery of modern software requires the execution of many automated pipeline jobs. These jobs ensure the frequent release of new software versions while detecting code problems at an early stage. For TELUS, our industrial partner in the telecommunications field, reliable job execution is crucial to minimize wasted time and streamline Continuous Deployment (CD). In this context, flaky job failures are one of the main issues hindering CD. Prior studies proposed techniques based on machine learning to automate the detection of flaky jobs. While valuable, these solutions are insufficient to address the waste associated with the diagnosis of flaky failures, which remain largely unexplored due to the wide range of underlying causes. This study examines 4,511 flaky job failures at TELUS to identify the different categories of flaky failures that we prioritize based on Recency, Frequency, and Monetary (RFM) measures. We identified 46 flaky failure categories that we analyzed using clustering and RFM measures to determine 14 priority categories for future automated diagnosis and repair research. Our findings also provide valuable insights into the evolution and impact of these categories. The identification and prioritization of flaky failure categories using RFM analysis introduce a novel approach that can be used in other contexts.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2501.04976 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2501.04976 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2501.04976 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.