Watchers
A Watcher is the core instrumentation primitive. It attaches to your pipeline graph and records everything that happens during execution — node inputs, outputs, state transitions, timing, and tool calls.
The Watcher doesn't modify your pipeline's behavior. It's a passive observer that hooks into execution callbacks. Your pipeline runs exactly as it would without ARGUS — the Watcher just records what happened.
from argus import ArgusWatcher
watcher = ArgusWatcher(
    strict=False,        # don't halt on detection
    investigate=True,    # run root cause analysis
    persist_state=True,  # save state for replay
)
watcher.watch(graph)  # graph is your existing pipeline graph
Detectors
Detectors are the analysis engines that examine a trace after execution. ARGUS runs four detection layers, each looking for different categories of failure:
1. Statistical: anomalies in timing, output length, token counts, and other numerical signals
2. Semantic: meaning drift, relevance loss, and hallucination patterns, using embedding similarity and LLM-as-judge
3. Behavioral: unexpected node transitions, infinite loops, skipped steps, and state corruption
4. Structural: schema violations, missing required fields, type mismatches, and contract breaches
Detectors run automatically when you call watcher.finalize(). You can configure sensitivity thresholds and enable/disable individual layers through the configuration.
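To make the idea of a sensitivity threshold concrete, here is a minimal sketch of a statistical check on step timing. It is an illustration only, not ARGUS's implementation: the function name is_anomalous, its parameters, and the z-score approach are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def is_anomalous(baseline, value, threshold=3.0):
    """Hypothetical check: is `value` more than `threshold` standard
    deviations away from the mean of earlier observations?"""
    if len(baseline) < 2:
        return False  # not enough history to estimate spread
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    if sigma == 0:
        return value != mu  # any change from a constant baseline stands out
    return abs(value - mu) / sigma > threshold


# Step durations (ms) from earlier runs, then two new observations.
history = [100, 102, 98, 101, 99, 100, 103, 97, 100]
slow_step = is_anomalous(history, 900)
normal_step = is_anomalous(history, 101)
```

Lowering threshold makes the check more sensitive (more flags, more false positives), and raising it does the opposite. How ARGUS exposes its own thresholds is covered in the configuration reference.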
Traces
A Trace is the complete record of a single pipeline execution. It contains:
- Every node that executed, in order
- Input and output state at each node
- Wall clock and CPU timing per step
- Tool calls and their results
- Detection results from all four layers
- Forensic analysis if failures were detected
# View the latest trace
argus trace --last
# View a specific trace by ID
argus trace abc123
# List all traces
argus trace --list
Traces are stored locally in SQLite by default. See Storage for details on schema and export options.
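As a mental model for what a Trace holds, the contents listed above can be sketched as plain dataclasses. This is not the ARGUS storage schema or API. StepRecord, TraceRecord, and every field name below are hypothetical, chosen only to mirror the list.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class StepRecord:
    """One executed node (hypothetical shape)."""
    node: str
    input_state: dict[str, Any]
    output_state: dict[str, Any]
    wall_ms: float  # wall clock time for the step
    cpu_ms: float   # CPU time for the step
    tool_calls: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class TraceRecord:
    """One pipeline execution (hypothetical shape)."""
    trace_id: str
    steps: list[StepRecord] = field(default_factory=list)
    detections: list[dict[str, Any]] = field(default_factory=list)
    forensics: Optional[dict[str, Any]] = None  # set only if failures were detected

    def node_order(self) -> list[str]:
        """Names of the executed nodes, in execution order."""
        return [step.node for step in self.steps]


trace = TraceRecord(trace_id="abc123")
trace.steps.append(StepRecord("retrieve", {"q": "hi"}, {"docs": 3}, 12.5, 4.0))
trace.steps.append(StepRecord("generate", {"docs": 3}, {"text": "..."}, 840.0, 15.2))
```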
Forensics
When detectors flag a failure, the Forensics engine kicks in. It traces the failure backward through the execution graph to find the root cause — which node, which input, which state transition caused the downstream degradation.
Forensic analysis answers three questions:
- What failed? The specific detection that fired and what it found
- Where did it fail? The node and step in the execution graph
- Why did it fail? The causal chain from the root cause to the observed symptom
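The backward walk behind the "why" question can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Forensics engine itself: the causal_chain function, the upstream mapping, and the degraded set are all hypothetical, and the sketch follows only the first degraded upstream node at each step.

```python
def causal_chain(upstream, degraded, symptom):
    """Walk backward from `symptom` through degraded upstream nodes.

    upstream: dict mapping each node to the nodes that feed into it
    degraded: set of nodes where some detection fired
    Returns the chain ordered from root cause to observed symptom.
    """
    chain = [symptom]
    seen = {symptom}  # guards against cycles in the graph
    current = symptom
    while True:
        candidates = [
            node for node in upstream.get(current, [])
            if node in degraded and node not in seen
        ]
        if not candidates:
            break  # no degraded input left: current is the root cause
        current = candidates[0]
        chain.append(current)
        seen.add(current)
    chain.reverse()
    return chain


upstream = {
    "answer": ["generate"],
    "generate": ["retrieve", "prompt"],
    "retrieve": ["query"],
    "prompt": [],
    "query": [],
}
degraded = {"answer", "generate", "retrieve"}
chain = causal_chain(upstream, degraded, "answer")
```

In this toy graph the first element of chain is the root cause ("where" and "why" start there) and the last is the observed symptom ("what").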
Investigate mode
With investigate=True (the default), forensic analysis runs whenever a detection fires. Set it to "always" to run forensic analysis even when no detections fire, which is useful for catching near-misses.
How They Connect

The flow is linear and deterministic:
- You create a Watcher and attach it to your graph
- Your pipeline runs normally — the Watcher records a Trace
- You call finalize() and the Detectors analyze the trace
- If failures are found, Forensics traces back to the root cause
- You view results via CLI, UI, or programmatic API
