RCAPapers

A collection of papers on root cause analysis, diagnosis, and localization in microservice systems, covering invocation chains, multi-dimensional metrics, and machine metrics.

Notes on the papers: https://dreamhomes.top/

Survey

  • [2018 TSE] Fault Analysis and Debugging of Microservice Systems: Industrial Survey, Benchmark System, and Empirical Study. paper

Methods

Note: Methods are categorized by data type.

Metrics, Invocation

  • [2013 SIGMETRICS] Root Cause Detection in a Service-Oriented Architecture [MonitorRank]. paper
  • [2015 IWQoS] A methodology for root-cause analysis in component based systems. paper
  • [2017 TPDS] Failure Diagnosis for Distributed Systems Using Targeted Fault Injection. paper
  • [2018 IWQoS] Root Cause Analysis of Anomalies of Multitier Services in Public Clouds. paper
  • [2018 CCGRID] CloudRanger: Root Cause Identification for Cloud Native Systems. paper
  • [2018 ASE] Delta debugging microservice systems. paper
  • [2019 TSC] Microservices Monitoring with Event Logs and Black Box Execution Tracing. paper
  • [2019 Access] A Real-Time Trace-Level Root-Cause Diagnosis System in Alibaba Datacenters. paper
  • [2020 JSS] Graph-based root cause analysis for service-oriented and microservice architectures. paper
  • [2016 KDD] Ranking Causal Anomalies via Temporal and Dynamical Analysis on Vanishing Correlations. paper
  • [2017 ICDM] Ranking Causal Anomalies by Modeling Local Propagations on Networked Systems. paper
  • [2018 ICST] Localizing Faults in Cloud Systems. paper
  • [2018 IPCCC] FacGraph: Frequent Anomaly Correlation Graph Mining for Root Cause Diagnose in Micro-Service Architecture. paper
  • [2019 ICWS] MS-Rank: Multi-Metric and Self-Adaptive Root Cause Diagnosis for Microservice Applications. paper
  • [2020 Appl. Sci.] A Causality Mining and Knowledge Graph Based Method of Root Cause Diagnosis for Performance Anomaly in Cloud Applications. paper
  • [2020 WWW] AutoMAP: Diagnose Your Microservice-based Web Applications Automatically. paper
  • [2020 IWQoS] Localizing Failure Root Causes in a Microservice through Causality Inference. paper
  • [2021 ICSE] MicroHECL: High-Efficient Root Cause Localization in Large-Scale Microservice Systems. paper
  • [2021 ISSTA] Faster, Deeper, Easier: Crowdsourcing Diagnosis of Microservice Kernel Failure from User Space. paper
  • [2021 SEKE] AAMR: Automated Anomalous Microservice Ranking in Cloud-Native Environment. paper

Metrics, Trace

  • [2017 WWW] Performance Monitoring and Root Cause Analysis for Cloud-hosted Web Applications. paper
  • [2018 ICSOC] Microscope: Pinpoint Performance Issues with Causal Graphs in Micro-service Environments. paper
  • [2019 FSE] Latent Error Prediction and Fault Localization for Microservice Applications by Learning from System Trace Logs. paper
  • [2019 ASE] Root Cause Localization for Unreproducible Builds via Causality Analysis over System Call Tracing. paper
  • [2019 ASPLOS] Seer: Leveraging Big Data to Navigate the Complexity of Performance Debugging in Cloud Microservices. paper
  • [2020 MLArchSys] Sage: Leveraging ML To Diagnose Unpredictable Performance in Cloud Microservices. paper
  • [2020 ISSRE] Unsupervised Detection of Microservice Trace Anomalies through Service-Level Deep Bayesian Networks. paper
  • [2020 ESEC/FSE] Graph-Based Trace Analysis for Microservice Architecture Understanding and Problem Diagnosis. paper
  • [2021 WWW] MicroRank: End-to-End Latency Issue Localization with Extended Spectrum Analysis in Microservice Environments. paper
  • [2021 ICSE] TraceLingo: Trace representation and learning for performance issue diagnosis in cloud services. notes
  • [2021 IWQoS] Practical Root Cause Localization for Microservice Systems via Trace Analysis. paper
  • [2021 ASE] AID: Efficient Prediction of Aggregated Intensity of Dependency in Large-scale Cloud Systems. paper
  • [2022 ICSE] Trace-Log Combined Microservice Anomaly Detection through Graph-based Deep Learning. paper

Metrics, Dependency graph

Note: Graph data includes system state graphs, dependency graphs, and so on.

  • [2013 ICDCS] FChain: Toward Black-box Online Fault Localization for Cloud Systems. paper
  • [2019 ICPADS] ADGS: Anomaly Detection and Localization based on Graph Similarity in Container-based Clouds. paper
  • [2019 VLDB] GRANO: Interactive Graph-based Root Cause Analysis for Cloud-Native Distributed Data Platform. paper
  • [2020 JSS] Graph-based root cause analysis for service-oriented and microservice architectures. paper
  • [2020 NOMS] MicroRCA: Root Cause Localization of Performance Issues in Microservices. paper
  • [2020 ICSOC] Localization of Operational Faults in Cloud Applications by Mining Causal Dependencies in Logs using Golden Signals. paper
  • [2020 SoSE] Graph Based Root Cause Analysis in Cloud Data Center. paper
  • [2021 ICSE] MicroDiag: Fine-grained Performance Diagnosis for Microservice Systems. paper
  • [2021 ASE] Groot: An Event-graph-based Approach for Root Cause Analysis in Industrial Settings. paper

Software/Program faults

  • [2019 ISSTA] DeepFL: integrating multiple fault diagnosis dimensions for deep fault localization. paper
  • [2019 ASE] Root Cause Localization for Unreproducible Builds via Causality Analysis over System Call Tracing. paper
  • [2019 TSE] An Empirical Study of Boosting Spectrum-based Fault Localization via PageRank. paper
  • [2020 AAAI] Control Flow Graph Embedding based on Multi-Instance Decomposition for Bug Localization. paper