Auto-Prov: Revolutionizing Anomaly Detection with LLM-Powered Provenance Graphs

Provenance graphs have long been the backbone of anomaly detection in system logs, but relying on manual rule engineering is like building a house of cards in a windstorm. Enter Auto-Prov, an innovative framework that uses large language models (LLMs) to automate the creation of these graphs. The result? A solid, adaptable system that promises to revolutionize how we interpret complex log data.

A New Era of Provenance

Auto-Prov tackles the limitations of traditional provenance graphs head-on. By embedding system-level functional attributes directly into the graphs, it allows anomaly detectors to learn from enriched data. This isn't just about making the graphs prettier. It's about fostering a deeper understanding of system behavior, enabling the detection of deviations with surgical precision.
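To make the idea concrete, here is a minimal sketch of a provenance graph whose nodes carry a functional attribute. The class names, the `role` field, and the sample entities are illustrative assumptions, not Auto-Prov's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A system entity (process, file, socket) in the provenance graph."""
    name: str
    kind: str  # e.g. "process" or "file"
    role: str  # hypothetical functional attribute, e.g. "web server"


@dataclass
class ProvenanceGraph:
    # Each edge is a (source entity, operation, target entity) triple.
    edges: list = field(default_factory=list)

    def add_edge(self, src: Entity, op: str, dst: Entity) -> None:
        self.edges.append((src, op, dst))

    def features(self):
        """Yield one feature tuple per edge, including the functional roles."""
        for src, op, dst in self.edges:
            yield (src.kind, src.role, op, dst.kind, dst.role)


# Made-up sample entities for illustration only.
nginx = Entity("nginx", "process", "web server")
shadow = Entity("/etc/shadow", "file", "credential store")

graph = ProvenanceGraph()
graph.add_edge(nginx, "read", shadow)
edge_features = list(graph.features())
```

A detector that only sees "process reads file" has less to learn from than one that sees "web server reads credential store", which is the kind of enrichment the paragraph above describes.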

Auto-Prov's secret sauce lies in its ability to cluster unseen log types and extract provenance edges efficiently. It does this through automatically generated rules, an essential feature given the diverse and evolving nature of system logs. Instead of waiting on hand-written rules for every new format, Auto-Prov generates rules that adapt to new log types on the fly.
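A rough sketch of what rule-based edge extraction and log clustering can look like is shown here. The two regex rules are hand-written stand-ins for the automatically generated ones, and the log lines, function names, and token-masking heuristic are all assumptions made for illustration, not Auto-Prov's method.

```python
import re
from collections import defaultdict

# Hypothetical rules: each maps a log pattern to a (subject, operation, object) edge.
# Hand-written here; the article describes such rules as generated automatically.
RULES = [
    (re.compile(r"process (?P<src>\S+) opened file (?P<dst>\S+)"), "open"),
    (re.compile(r"process (?P<src>\S+) connected to (?P<dst>\S+)"), "connect"),
]


def extract_edge(line):
    """Return (src, op, dst) for the first matching rule, else None."""
    for pattern, op in RULES:
        match = pattern.search(line)
        if match:
            return (match.group("src"), op, match.group("dst"))
    return None


def template_of(line):
    """Mask tokens containing digits or slashes so similar lines share a template."""
    return " ".join("<*>" if re.search(r"[\d/]", tok) else tok for tok in line.split())


def cluster_unmatched(lines):
    """Group lines that no rule covers by their masked template."""
    clusters = defaultdict(list)
    for line in lines:
        if extract_edge(line) is None:
            clusters[template_of(line)].append(line)
    return dict(clusters)


# Made-up log lines for illustration.
logs = [
    "process sshd opened file /etc/passwd",
    "process curl connected to 10.0.0.5:443",
    "user 1001 logged in from 10.0.0.7",
    "user 1002 logged in from 10.0.0.9",
]
edges = [edge for edge in map(extract_edge, logs) if edge is not None]
clusters = cluster_unmatched(logs)
```

In this toy run the two login lines match no rule and fall into a single cluster, which is the point where a new rule would need to be generated for that log type.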

Inference and Interpretation

The framework doesn't stop at graph construction. By combining LLM inference with behavior-based estimation, Auto-Prov infers functional context for both known and novel system entities. This dual-pronged approach provides a level of interpretability previously unseen in anomaly detection systems.
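One way to picture the two-pronged idea is a lookup with a fallback, sketched here. The `KNOWN_ROLES` table is only a stand-in for LLM inference (no model is called), and the operation counts, entity names, and cosine-similarity fallback are assumptions chosen for illustration; the article does not say how Auto-Prov performs its behavior-based estimation.

```python
import math
from collections import Counter

# Stand-in for LLM inference: roles for entities the model would recognize.
# Hypothetical; a real system would query a model instead of a dict.
KNOWN_ROLES = {"nginx": "web server", "sshd": "remote login daemon"}

# Observed operation counts per entity (made-up sample data).
BEHAVIOR = {
    "nginx": Counter({"accept": 50, "read": 30, "send": 50}),
    "sshd": Counter({"accept": 5, "fork": 5, "read": 10}),
    "svc-x91": Counter({"accept": 40, "read": 25, "send": 45}),
}


def cosine(a, b):
    """Cosine similarity between two operation-count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def infer_role(entity):
    """Return (role, source): the lookup for known entities, behavior otherwise."""
    if entity in KNOWN_ROLES:
        return KNOWN_ROLES[entity], "llm"
    profile = BEHAVIOR.get(entity)
    if not profile:
        return "unknown", "none"
    # Borrow the role of the known entity whose behavior is most similar.
    nearest = max(KNOWN_ROLES, key=lambda known: cosine(profile, BEHAVIOR[known]))
    return KNOWN_ROLES[nearest], "behavior"


known_result = infer_role("nginx")
novel_result = infer_role("svc-x91")
```

Here the unfamiliar `svc-x91` behaves much like `nginx`, so the fallback labels it a web server and records that the label came from behavior rather than from the lookup.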

When summarizing attacks, Auto-Prov translates detected anomalies into clear, natural-language text. This is a major shift for analysts who need to understand and respond to threats quickly. The days of sifting through raw data for insights are numbered.

Benchmarking Auto-Prov

Auto-Prov has been put through its paces with four state-of-the-art provenance graph-based detectors. The results are telling. Detection performance not only improves consistently, but the system also generalizes across various log formats. Still, show me the inference costs before we talk about the real value here: interpretability that remains stable, even as systems evolve.

Why should you care about Auto-Prov? Because it represents a shift in how we think about and implement anomaly detection. As systems grow more complex, we need tools that can keep pace. Auto-Prov does just that, providing clarity where there was once only noise.
