Shadow Mode

Test governance controls on live traffic without enforcement — see exactly what would be blocked before flipping the switch

Key Takeaways

1. Shadow mode evaluates governance controls without enforcing them: traffic flows normally while you observe what would be caught
2. Essential for measuring false positive rates before enforcement disrupts production workflows
3. Enables data-driven confidence: promote to enforcement only when detection accuracy meets your threshold
4. CrewCheck supports per-rule shadow mode, so you can test new rules individually without affecting existing enforcement

What Is Shadow Mode?

Shadow mode is a testing configuration where AI governance controls are evaluated against live traffic but not enforced. Requests pass through normally — nothing is blocked, masked, or modified — but the system records what would have happened if enforcement were active.

Think of it as a dress rehearsal for governance controls. You see the full picture — detection rates, false positives, policy impacts, latency overhead — without any risk of disrupting production workflows.

This is critical because governance controls that look perfect in testing often behave differently with real-world data. Customer support messages have different PII patterns than test data. Shadow mode reveals these gaps safely.

Why You Need Shadow Mode

Deploying governance controls directly to enforcement is risky. Here's what shadow mode reveals before you commit:

45%: typical initial false positive rate. New PII rules often flag non-PII content, and shadow mode quantifies this.

3-5x: detection rate variance. Real traffic often has 3-5x more PII than test datasets suggest.

0: production disruptions. Shadow mode never blocks or modifies traffic, so users are not affected.

1-2 weeks: recommended observation period. Enough time to see edge cases and weekday/weekend patterns.

Shadow Mode vs. Enforcement Mode

The two modes differ in what happens after a detection:

Shadow Mode (Observe)

Traffic flows normally — nothing blocked
Detections are logged but not acted upon
False positives don't affect users
Measures detection accuracy safely
Minimal latency impact on responses (measure it; see the metrics below)
Can run indefinitely without risk

Enforcement Mode (Act)

Detected PII is masked before forwarding
Policy violations block or modify requests
False positives may disrupt workflows
Requires high confidence in accuracy
Adds detection latency to request path
Requires monitoring and incident response

The Shadow-to-Enforcement Pipeline

The recommended rollout process for new governance controls follows a graduated pipeline:

Stage 1 — Shadow on sample traffic (10%): Route a small percentage of traffic through the new control in shadow mode. Validate basic functionality and catch obvious issues.

Stage 2 — Shadow on full traffic (100%): Expand to all traffic in shadow mode. Measure detection rates, false positives, and latency across the full range of real-world inputs.

Stage 3 — Enforcement on sample traffic (10%): Once shadow metrics meet your thresholds, enable enforcement on a small percentage. Monitor for user-reported issues.

Stage 4 — Full enforcement (100%): Promote to full enforcement with confidence backed by data from stages 1-3.

Each stage should run for at least a few days to capture edge cases and traffic pattern variations.
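The four stages can be sketched as a small rollout table plus deterministic sampling. This is an illustrative sketch, not CrewCheck's configuration format: the `Stage` class, the stage names, and `mode_for` are hypothetical, and it assumes that requests outside a stage's sample stay in the previous state (control off during shadow sampling, shadow during enforcement sampling).

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    mode: str         # "shadow" or "enforce"
    traffic_pct: int  # share of requests the stage applies to, 0-100


# Hypothetical encoding of the four stages described above.
PIPELINE = [
    Stage("shadow-sample", "shadow", 10),
    Stage("shadow-full", "shadow", 100),
    Stage("enforce-sample", "enforce", 10),
    Stage("enforce-full", "enforce", 100),
]


def bucket(request_id: str) -> int:
    """Map a request ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def mode_for(request_id: str, stage: Stage) -> str:
    """Return the mode the new control runs in for this request.

    Assumption: requests outside the sample fall back to the previous
    state, "off" while sampling shadow and "shadow" while sampling
    enforcement.
    """
    if bucket(request_id) < stage.traffic_pct:
        return stage.mode
    return "off" if stage.mode == "shadow" else "shadow"
```

Hashing the request ID rather than drawing a random number keeps the decision stable: the same request always lands in the same bucket, which makes shadow logs easier to reproduce when investigating a detection.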

Per-Rule Shadow Mode

Tip

CrewCheck supports shadow mode at the individual rule level, not just globally. This means you can have existing rules in enforcement while testing new rules in shadow — simultaneously.

Example: Your Aadhaar masking rule is in enforcement (proven accurate), while a new ABHA ID detection rule runs in shadow mode. The Aadhaar rule actively protects traffic while you validate the ABHA rule's accuracy.

This granular control is essential for continuous improvement — you're always testing the next rule without risking the controls that are already working.
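A minimal sketch of the Aadhaar/ABHA example, assuming each rule carries its own mode. The `Rule` class, the `process` function, and both regular expressions are hypothetical simplifications for illustration (the ABHA pattern assumes a 14-digit 2-4-4-4 layout); they are not CrewCheck's detection logic.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    pattern: re.Pattern
    mode: str  # "enforce" or "shadow"


RULES = [
    # Simplified Aadhaar pattern: 12 digits, optionally grouped in fours.
    # The lookarounds stop it from matching inside a longer hyphenated ID.
    Rule("aadhaar",
         re.compile(r"(?<![\d-])\d{4}[ -]?\d{4}[ -]?\d{4}(?![\d-])"),
         "enforce"),
    # Simplified ABHA pattern (assumed 2-4-4-4 digit layout).
    Rule("abha", re.compile(r"\b\d{2}-\d{4}-\d{4}-\d{4}\b"), "shadow"),
]


def process(text: str, rules=RULES):
    """Return (text_to_forward, shadow_log) for one request."""
    forwarded = text
    shadow_log = []
    for rule in rules:
        if rule.mode == "enforce":
            # Enforced rules change what is forwarded.
            forwarded = rule.pattern.sub(f"[{rule.name.upper()}]", forwarded)
        else:
            # Shadow rules only record what they would have masked.
            # Spans refer to the original text; the raw value is not logged.
            for match in rule.pattern.finditer(text):
                shadow_log.append(
                    {"rule": rule.name, "span": match.span(), "shadow": True}
                )
    return forwarded, shadow_log
```

With these rules, an Aadhaar number in the request is masked before forwarding, while an ABHA ID passes through unchanged and produces one shadow log entry.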

Metrics to Watch in Shadow Mode

Key metrics to monitor during shadow observation before promoting to enforcement:

True positive rate: what percentage of actual PII is correctly detected?
False positive rate: what percentage of flagged items are not actually PII?
Detection volume: how many detections per hour/day? Is this expected?
Latency overhead: how much time does the control add to request processing?
Coverage gaps: are there PII formats or contexts that the rule misses?
Edge cases: any unexpected behavior with multilingual text, code, or structured data?
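The first two metrics can be computed from a manually reviewed sample of shadow detections. The sketch below is one way to do that; `ShadowReview`, the function names, and the promotion thresholds are hypothetical and should be replaced with your own. Note that the false positive rate as defined above (flagged items that are not PII, divided by all flagged items) is the quantity often called the false discovery rate, or 1 minus precision.

```python
from dataclasses import dataclass


@dataclass
class ShadowReview:
    true_positives: int   # flagged and confirmed as PII
    false_positives: int  # flagged but not PII
    false_negatives: int  # PII the rule missed, found by manual review


def true_positive_rate(review: ShadowReview) -> float:
    """Share of actual PII that the rule detected."""
    actual = review.true_positives + review.false_negatives
    return review.true_positives / actual if actual else 0.0


def false_positive_share(review: ShadowReview) -> float:
    """Share of flagged items that are not actually PII."""
    flagged = review.true_positives + review.false_positives
    return review.false_positives / flagged if flagged else 0.0


def ready_to_promote(review: ShadowReview,
                     min_tpr: float = 0.95,
                     max_fp_share: float = 0.02) -> bool:
    """Apply example thresholds (assumed values, not recommendations)."""
    return (true_positive_rate(review) >= min_tpr
            and false_positive_share(review) <= max_fp_share)
```

Counting false negatives requires reviewing traffic the rule did not flag, so the true positive rate is usually estimated from a labeled sample rather than from the shadow log alone.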

How CrewCheck Implements Shadow Mode

CrewCheck's shadow mode operates at the gateway level. When a rule is in shadow mode, the detection pipeline runs normally — extracting candidates, validating formats, scoring context — but the final masking step is skipped.

Instead, the detection result is logged to the audit trail with a 'shadow' flag. The governance dashboard shows shadow detections in a separate view, with metrics comparing what would have been caught versus what actually passed through.

Promoting a rule from shadow to enforcement is a single-click operation in the dashboard. The rule immediately begins masking detected PII, with the same detection logic that was validated during the shadow period.
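The flow described above (extract candidates, validate format, score context, then either mask or log with a shadow flag) can be sketched as follows. Everything here is an assumption made for illustration: the function names, the keyword list, the scores, and the threshold are hypothetical, and the format check is simplified (it omits the checksum a production Aadhaar validator would apply).

```python
import re

CANDIDATE = re.compile(r"(?<!\d)\d{4}[ -]?\d{4}[ -]?\d{4}(?!\d)")
CONTEXT_WORDS = ("aadhaar", "uid", "uidai")  # hypothetical keyword list


def extract_candidates(text: str):
    """Step 1: find every 12-digit candidate."""
    return list(CANDIDATE.finditer(text))


def valid_format(candidate: str) -> bool:
    """Step 2: simplified format check; Aadhaar never starts with 0 or 1."""
    digits = re.sub(r"\D", "", candidate)
    return len(digits) == 12 and digits[0] not in "01"


def context_score(text: str, start: int) -> float:
    """Step 3: score higher when a keyword appears shortly before the match."""
    window = text[max(0, start - 40):start].lower()
    return 0.9 if any(word in window for word in CONTEXT_WORDS) else 0.4


def run_rule(text: str, mode: str, threshold: float = 0.5):
    """Run the pipeline; mask only when mode is "enforce"."""
    audit = []
    out = text
    # Walk matches right to left so masking does not shift earlier spans.
    for match in reversed(extract_candidates(text)):
        if not valid_format(match.group()):
            continue
        score = context_score(text, match.start())
        if score < threshold:
            continue
        audit.append({
            "rule": "aadhaar",
            "span": list(match.span()),
            "score": score,
            "shadow": mode == "shadow",
        })
        if mode == "enforce":
            out = out[:match.start()] + "XXXX XXXX XXXX" + out[match.end():]
    return out, audit
```

Because the mode is consulted only at the final step, switching from "shadow" to "enforce" yields the same detections with the same spans and scores; only the forwarded text and the `shadow` flag change.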

Frequently Asked Questions

How long should I run shadow mode before enforcement?

At minimum 1-2 weeks on full traffic. This captures weekday/weekend patterns, edge cases, and gives you enough data volume for statistically meaningful accuracy metrics. For high-stakes rules, consider 4 weeks.
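One way to judge whether the data volume is "statistically meaningful" is to put a confidence interval around the observed false positive share. The sketch below uses the Wilson score interval, a standard method for binomial proportions; choosing it here is an assumption of this example, not something CrewCheck prescribes, and the function name is hypothetical.

```python
import math


def wilson_interval(false_positives: int, flagged: int, z: float = 1.96):
    """Wilson score interval for false_positives / flagged.

    z = 1.96 corresponds to a 95% confidence level.
    """
    if flagged == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p = false_positives / flagged
    denom = 1 + z * z / flagged
    center = (p + z * z / (2 * flagged)) / denom
    half = z * math.sqrt(
        p * (1 - p) / flagged + z * z / (4 * flagged * flagged)
    ) / denom
    return (max(0.0, center - half), min(1.0, center + half))
```

For example, 5 false positives in 100 reviewed detections gives an interval of roughly 2% to 11%, while the same 5% rate observed over 1,000 detections narrows to roughly 4% to 7%. If the upper bound is still above your threshold, keep the rule in shadow mode and collect more data.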

Does shadow mode add latency?

Minimal in practice. CrewCheck measures shadow-mode latency with the same production methodology as the main gateway: the current production measurement is sub-100ms gateway overhead at P95, reported separately from upstream provider time.

Can I shadow test on production traffic safely?

Yes — that's exactly what shadow mode is for. No traffic is modified, no requests are blocked. The only output is log entries showing what would have happened. There's zero risk to production users.
