❍ Forensic Observability for AI Agents

Your agent finished.
But did it actually work?

ARGUS catches silent failures and traces root causes before you deploy — so broken pipelines never reach production.

~1 min setup
Argus Live Trace
01 · Get Started

Get started in 4 steps

Install · Instrument · Observe · Debug

01

Install Argus

Install the forensic observer.

$ pip install argus-agents

02

Wrap your graph

Wrap your graph with the watcher harness.

python
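from argus import ArgusWatcher

# `graph` is your existing, uncompiled agent graph; `initial_state` is its input.
watcher = ArgusWatcher()
watcher.watch(graph)       # attach the watcher before compiling
app = graph.compile()
app.invoke(initial_state)  # run the pipeline as usual
watcher.finalize()         # finalize the watcher once the run ends

03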

Login

Authenticate your workspace.

$ argus login

04

Launch replay

Launch the replay dashboard.

$ argus ui

02 · Pipeline
▸ Watch ARGUS catch a failure

Every node, traced.
Every silence, surfaced.

ARGUS watches every step of your agent pipeline during development and testing, catching silent failures before they ever reach production.

extract · 128ms

Data extracted successfully

enrich · 218ms

Entities enriched and structured

summarize · 1.34s

Silent failure: placeholder returned

Root Cause

ARGUS detected this before it degraded downstream

validate · 87ms

Degradation detected in downstream step

respond · --

Output at risk
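
To make the trace above concrete, here is a minimal sketch of root-cause localization in a linear pipeline: the earliest degraded node in execution order is treated as the root cause, and anything degraded after it is treated as fallout. `NodeTrace`, its `degraded` flag, and `find_root_cause` are hypothetical names for illustration, not part of the ARGUS API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeTrace:
    """One traced step. `degraded` is a hypothetical verdict, not an ARGUS field."""
    name: str
    degraded: bool


def find_root_cause(traces: list[NodeTrace]) -> Optional[str]:
    """Return the earliest degraded node in execution order, or None if all are healthy."""
    for trace in traces:
        if trace.degraded:
            return trace.name
    return None


run = [
    NodeTrace("extract", degraded=False),
    NodeTrace("enrich", degraded=False),
    NodeTrace("summarize", degraded=True),  # returned a placeholder
    NodeTrace("validate", degraded=True),   # degraded because its input was bad
]
root = find_root_cause(run)
print(root)
```

This only holds for a straight-line pipeline; a branching graph would need to walk dependencies rather than list order.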

03 · Replay
↺ Replay any run

See the truth.

Re-run from any step. ARGUS reuses everything that's already correct — so you only pay for what changed.

10x
Faster Debugging
Skip what already worked
40%
Lower Cost
Reuses saved state & results
Zero
Wasted Compute
Only re-run what matters
How Replay Works
  • Execute your pipeline
    ARGUS records every node output and state
  • Re-run from any step
    We load saved states up to that point
  • Only downstream runs
    Everything before stays cached and reused
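
The three steps above can be sketched in plain Python. This is an illustration of the idea under simple assumptions (a linear pipeline, states held in memory), not the ARGUS implementation; `run_nodes`, `replay_from`, and the toy nodes are hypothetical names.

```python
from typing import Any, Callable

State = dict[str, Any]
Node = Callable[[State], State]


def run_nodes(nodes: list[Node], state: State) -> list[State]:
    """Run nodes in order and return the state recorded after each one."""
    snapshots: list[State] = []
    for node in nodes:
        state = node(dict(state))  # pass a copy so earlier snapshots stay intact
        snapshots.append(state)
    return snapshots


def replay_from(nodes: list[Node], saved: list[State], start: int, initial: State) -> list[State]:
    """Reuse saved snapshots before `start`, then re-run only nodes[start:]."""
    reused = saved[:start]
    state = reused[-1] if reused else initial
    return reused + run_nodes(nodes[start:], state)


calls: list[str] = []  # records which nodes actually execute


def extract(state: State) -> State:
    calls.append("extract")
    return {**state, "text": "raw document"}


def enrich(state: State) -> State:
    calls.append("enrich")
    return {**state, "entities": ["AI regulation"]}


def summarize_broken(state: State) -> State:
    calls.append("summarize")
    return {**state, "summary": "..."}


def summarize_fixed(state: State) -> State:
    calls.append("summarize")
    return {**state, "summary": f"Summary of {state['text']}"}


first_run = run_nodes([extract, enrich, summarize_broken], {})
calls.clear()
second_run = replay_from([extract, enrich, summarize_fixed], first_run, start=2, initial={})
print(calls)                      # only the replayed node ran
print(second_run[-1]["summary"])
```

Replaying from index 2 skips `extract` and `enrich` entirely, so their cost is paid once no matter how many times the summarizer is fixed and retried.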
Run · 8f9a-22b1 · 2.47s
Replay #2 (fixed parser)
Execution Steps
extract
enrich
summarize
validate
respond
Step 3: summarize (Failed)
{
  "summary": "...",
  "key_points": [],
  "entities": [],
  "confidence": 0.42,
  "note": "placeholder text"
}
Replay: summarize (Replayed)
{
  "summary": "AI regulation is evolving rapidly, with focus on safety and transparency.",
  "key_points": ["safety", "transparency"],
  "entities": ["AI regulation"],
  "confidence": 0.93
}
What we reused

States & outputs from completed steps

extract · enrich · 2.47s saved
What we ran

Only steps after the selected point

summarize · 1.34s compute
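
To compare a failed step with its replay, a field-by-field diff is often enough. The sketch below uses the two `summarize` payloads shown above; `diff_outputs` and the `"<missing>"` marker are hypothetical helpers for illustration, not ARGUS functions.

```python
from typing import Any


def diff_outputs(before: dict[str, Any], after: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Map each differing key to a (before, after) pair; "<missing>" marks an absent key."""
    changes: dict[str, tuple[Any, Any]] = {}
    for key in sorted(before.keys() | after.keys()):
        if key not in before:
            changes[key] = ("<missing>", after[key])
        elif key not in after:
            changes[key] = (before[key], "<missing>")
        elif before[key] != after[key]:
            changes[key] = (before[key], after[key])
    return changes


failed = {
    "summary": "...",
    "key_points": [],
    "entities": [],
    "confidence": 0.42,
    "note": "placeholder text",
}
replayed = {
    "summary": "AI regulation is evolving rapidly, with focus on safety and transparency.",
    "key_points": ["safety", "transparency"],
    "entities": ["AI regulation"],
    "confidence": 0.93,
}

changes = diff_outputs(failed, replayed)
for key, (old, new) in changes.items():
    print(f"{key}: {old!r} -> {new!r}")
```

Every field changes between the two runs here, including `note`, which disappears once the placeholder is gone.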
04 · Features

Built for engineers who ship.

Detect failures, trace root causes, and fix your pipeline, all before a single user is affected.

01

Silent failure detection

Catches issues that look successful but are actually wrong.

02

Root cause analysis

Traces degradation back to the exact step that caused it.

03

Semantic degradation signals

Understands meaning, not just errors.

04

Replay + diff

Re-run, compare, and verify every change.

05 · FAQ

Questions,
answered.

  • A node technically “succeeds” but returns degraded state — empty arrays, placeholder text, collapsed confidence scores, hallucinated tool outputs — that quietly poisons every downstream step.
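
As a rough illustration of what such checks could look like, the sketch below flags three of the symptoms listed above: empty collections, placeholder text, and low confidence. The marker strings, the 0.5 threshold, and the `degradation_signals` name are assumptions chosen for the example, not ARGUS defaults, and a check this simple cannot catch hallucinated tool outputs.

```python
from typing import Any

# Illustrative values only; tune these for your own pipeline.
PLACEHOLDER_VALUES = {"", "...", "n/a", "todo"}


def degradation_signals(output: dict[str, Any], min_confidence: float = 0.5) -> list[str]:
    """Return human-readable warnings for a node output that raised no error."""
    signals: list[str] = []
    for key, value in output.items():
        if isinstance(value, (list, dict)) and not value:
            signals.append(f"{key}: empty collection")
        elif isinstance(value, str):
            text = value.strip().lower()
            if text in PLACEHOLDER_VALUES or "placeholder" in text:
                signals.append(f"{key}: placeholder text")
    confidence = output.get("confidence")
    if isinstance(confidence, (int, float)) and confidence < min_confidence:
        signals.append(f"confidence: {confidence} is below {min_confidence}")
    return signals


degraded = {
    "summary": "...",
    "key_points": [],
    "entities": [],
    "confidence": 0.42,
    "note": "placeholder text",
}
healthy = {
    "summary": "AI regulation is evolving rapidly, with focus on safety and transparency.",
    "key_points": ["safety", "transparency"],
    "entities": ["AI regulation"],
    "confidence": 0.93,
}

degraded_signals = degradation_signals(degraded)
healthy_signals = degradation_signals(healthy)
print(degraded_signals)
print(healthy_signals)
```

A node that returns the first payload exits cleanly, which is why checks like these have to inspect the output itself instead of waiting for an exception.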