The article introduces a novel structural causal model and algorithm designed for the rapid detection of anomalies’ root causes in threshold-based IT systems. The presented approach leverages causal discovery from historical data and uses a graph traversal technique when encountering new anomalies. The method has been validated to be correct when root causes are not causally related, with an agent-based extension proposed for relaxing this assumption. The results from extensive experiments show the superior performance of this method, even when applied to data from alternative causal models or real IT monitoring data.
Publication date: 9 Feb 2024
Project Page: https://arxiv.org/abs/2402.06500v1
Paper: https://arxiv.org/pdf/2402.06500