Public Benchmark

CrewCheck India Safety Benchmark

Live evaluation results across Indian PII, prompt injection, hallucination controls, DPDP checks, and RBI-aligned safety coverage.

Current production measurement: added gateway overhead is under 100 ms at P95. Total round-trip latency depends on the upstream LLM provider and is reported separately from CrewCheck overhead.
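To make the distinction between gateway overhead and total round-trip latency concrete, here is a minimal sketch of how a P95 overhead figure could be derived from per-request timings. It is not CrewCheck's actual measurement code: the function names, the nearest-rank percentile method, and the sample timings are all assumptions made for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in the range 0-100)."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


def gateway_overhead_ms(total_ms, upstream_ms):
    """Per-request overhead: total round-trip time minus upstream LLM time."""
    if len(total_ms) != len(upstream_ms):
        raise ValueError("timing lists must have the same length")
    return [total - upstream for total, upstream in zip(total_ms, upstream_ms)]


# Hypothetical timings in milliseconds, invented for this example.
total_ms = [412, 530, 389, 1205, 460]
upstream_ms = [380, 471, 351, 1121, 425]

overhead = gateway_overhead_ms(total_ms, upstream_ms)
p95_overhead = percentile(overhead, 95)
p95_total = percentile(total_ms, 95)

print(f"P95 gateway overhead: {p95_overhead} ms")
print(f"P95 total round trip: {p95_total} ms")
```

In this sketch the slowest request is dominated by the upstream provider, so the total round-trip P95 is far above 100 ms while the overhead P95 stays below it. That is why the two figures are worth reporting separately.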

Scores update automatically on every commit.

Latest run

Overall F1 across 242 labeled prompts: waiting for run data
PII flagship score: waiting for run data
Gateway overhead: waiting for run data
Last evaluated: fetching latest run