How to Avoid Storing Aadhaar and PAN in Your Application Logs
Practical techniques to keep Aadhaar and PAN numbers out of application logs, LLM traces, and error reports, with code examples.
Why Aadhaar and PAN End Up in Logs
The most common path: a user pastes their Aadhaar number into a chatbot or form field. The application logs the full request payload for debugging. The log is sent to a third-party log aggregator (Datadog, Grafana Loki, CloudWatch). The log is retained for 90 days. You now have Aadhaar numbers sitting in a third-party log platform, often hosted outside India, without the user's knowledge.
Other common paths: error logs that capture form state when validation fails (the user tried to submit an invalid PAN — logged in full), API request/response logging middleware that captures all headers and bodies, and LLM prompt logs that include user-pasted content.
Pattern 1: Gateway-Level Redaction
The most robust approach: deploy a proxy gateway that scans all outbound traffic before it reaches logging infrastructure. Every log shipper, LLM API call, and analytics event passes through the gateway, which redacts PII using regex matching combined with checksum validation before the data ever reaches a third-party system.
CrewCheck implements this for AI traffic: all prompts are scanned for 40+ Indian PII types (including Aadhaar with Verhoeff validation and PAN with checksum verification) before reaching the LLM, and all responses are scanned on the way back. The audit log stores only the PII type detected and the action taken, never the raw value.
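The idea of an audit log that keeps only the PII type and the action can be sketched as follows. This is a minimal illustration, not CrewCheck's implementation: the `AuditEntry` class, the `redact_outbound` function, and the PAN-only pattern are all hypothetical names and assumptions chosen for the example.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed PAN shape for illustration: 5 letters, 4 digits, 1 letter.
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical audit record: the kind of PII and the action, never the value."""
    pii_type: str
    action: str
    count: int
    detected_at: str


def redact_outbound(payload: str) -> tuple[str, list[AuditEntry]]:
    """Redact PAN-shaped tokens and describe what happened without storing them."""
    redacted, hits = PAN_RE.subn("[REDACTED:PAN]", payload)
    entries = []
    if hits:
        entries.append(
            AuditEntry(
                pii_type="PAN",
                action="redacted",
                count=hits,
                detected_at=datetime.now(timezone.utc).isoformat(),
            )
        )
    return redacted, entries
```

Only the redacted payload should be forwarded downstream. The audit entries can be stored freely because they contain no raw identifier.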
Pattern 2: Structured Logging with Field-Level Redaction
For application logs, switch from unstructured string logging to structured JSON logging. This lets you apply field-level redaction: fields named `aadhaar_number`, `pan_number`, `mobile`, and `email` are automatically masked before the log entry is written.
In Node.js with pino, pass a `redact` option listing the field paths to mask. In Python with structlog, add a custom processor (for example, one you name `censor`) that replaces sensitive values before the event is rendered. The key requirement: you must use structured logging everywhere, because a single unstructured `console.log(requestBody)` bypasses all field-level redaction.
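A field-level redaction step in Python might look like the sketch below. It uses only the standard library. The function follows the `(logger, method_name, event_dict)` call signature that structlog processors use, so it could be placed in a structlog processor chain ahead of the renderer. The field names and the `censor_sensitive_fields` name are assumptions for illustration.

```python
import json

# Hypothetical field names; extend the set to match your own log schema.
SENSITIVE_FIELDS = {"aadhaar_number", "pan_number", "mobile", "email"}
MASK = "[REDACTED]"


def _mask(value):
    """Recursively mask sensitive keys inside nested dicts and lists."""
    if isinstance(value, dict):
        return {
            key: MASK if key in SENSITIVE_FIELDS else _mask(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [_mask(item) for item in value]
    return value


def censor_sensitive_fields(logger, method_name, event_dict):
    """Processor-style callable: returns a copy of the event with sensitive fields masked."""
    return _mask(event_dict)


if __name__ == "__main__":
    event = {"event": "kyc_failed", "user": {"pan_number": "ABCDE1234F", "city": "Pune"}}
    print(json.dumps(censor_sensitive_fields(None, "info", event)))
```

This approach masks by field name only. A PAN that appears inside a free-text field such as `message` is not caught, which is why it pairs with the content-based patterns described in the other sections.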
Pattern 3: Pre-Logging Sanitisation Middleware
Add request/response sanitisation middleware to your web framework. This middleware intercepts HTTP request and response bodies before they are handed to the logging middleware. It applies patterns for Aadhaar (\b[2-9][0-9]{11}\b, with each candidate then checked by Verhoeff validation), PAN (\b[A-Z]{5}[0-9]{4}[A-Z]\b), and UPI IDs (\w+@\w+, which is loose enough to match email addresses as well), and replaces matches with [REDACTED].
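For Python services, one way to apply this kind of sanitisation just before a log line is written is a `logging.Filter` attached to each handler. The sketch below covers only PAN and UPI-shaped strings. The pattern definitions and the `SanitisingFilter` name are assumptions for illustration, and the same `sanitise` function could equally be called from web-framework middleware.

```python
import logging
import re

# Assumed patterns; the UPI pattern is deliberately loose and also hits email-like strings.
PATTERNS = {
    "PAN": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),
    "UPI": re.compile(r"\b[\w.\-]{2,}@[A-Za-z]{2,}\b"),
}


def sanitise(text: str) -> str:
    """Replace every pattern match with a fixed placeholder."""
    for pattern in PATTERNS.values():
        text = pattern.sub("[REDACTED]", text)
    return text


class SanitisingFilter(logging.Filter):
    """Rewrites the fully formatted message so raw arguments never reach a handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = sanitise(record.getMessage())
        record.args = None  # arguments are already merged into msg
        return True  # keep the record, now sanitised
```

Attaching the filter with `handler.addFilter(SanitisingFilter())` on every handler, rather than on a single logger, means records propagated from child loggers are covered too.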
The limitation: regex-only approaches produce both false positives and false negatives. An Aadhaar regex without Verhoeff validation will redact any 12-digit number that starts with 2-9, such as an order ID or tracking number, while a number written with spaces or hyphens slips through untouched. The production-grade approach uses validation-first matching: extract 12-digit sequences, validate each with Verhoeff, and redact only confirmed Aadhaar numbers.
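Validation-first matching can be sketched in Python as below. The tables are the standard Verhoeff multiplication, permutation, and inverse tables. The function names are hypothetical, and `verhoeff_check_digit` is included only so that test fixtures can be generated without using anyone's real number.

```python
import re

# Standard Verhoeff tables: D (dihedral group multiplication), P (permutations), INV (inverses).
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6], [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8], [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2], [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4], [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2], [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0], [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5], [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]
INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9]

# Step 1: cheap candidate extraction (12 digits, first digit 2-9).
CANDIDATE_RE = re.compile(r"\b[2-9][0-9]{11}\b")


def verhoeff_valid(number: str) -> bool:
    """Return True if the digit string, including its check digit, passes Verhoeff."""
    checksum = 0
    for position, char in enumerate(reversed(number)):
        checksum = D[checksum][P[position % 8][int(char)]]
    return checksum == 0


def verhoeff_check_digit(prefix: str) -> str:
    """Compute the check digit for a prefix (used here to build synthetic fixtures)."""
    checksum = 0
    for position, char in enumerate(reversed(prefix), start=1):
        checksum = D[checksum][P[position % 8][int(char)]]
    return str(INV[checksum])


def redact_aadhaar(text: str) -> str:
    """Step 2: redact only the candidates that pass checksum validation."""
    def replace(match):
        candidate = match.group()
        return "[REDACTED:AADHAAR]" if verhoeff_valid(candidate) else candidate
    return CANDIDATE_RE.sub(replace, text)
```

Roughly one in ten random 12-digit numbers will still pass the checksum by chance, so validation reduces false positives without eliminating them.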
Pattern 4: LLM-Specific Log Scrubbing
LLM applications have unique logging challenges. Users often paste sensitive documents into prompts (Aadhaar card photos described in text, bank statements, medical reports). A regex that expects 12 contiguous digits won't catch 'my Aadhaar is XXXX XXXX XXXX' when the user writes the number in groups of four separated by spaces.
Use a combination of pattern matching (with normalisation — strip spaces before applying patterns) and semantic analysis (LLM-assisted PII detection on the prompt itself). CrewCheck implements both: regex patterns with normalisation for known formats, and a lightweight classifier for detecting novel PII patterns in free text.
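The normalisation step can be sketched as follows: match 12-digit numbers that may be grouped in fours by spaces or hyphens, strip the separators, then hand the bare digits to a validator. The `scrub_prompt` name and the pluggable `is_valid` parameter are assumptions for illustration. The default validator accepts every candidate, so a real deployment would pass in a Verhoeff check.

```python
import re

# 12 digits starting with 2-9, optionally grouped in fours by a space or hyphen.
SPACED_AADHAAR_RE = re.compile(r"(?<!\d)[2-9]\d{3}[\s-]?\d{4}[\s-]?\d{4}(?!\d)")


def normalise(candidate: str) -> str:
    """Strip whitespace and hyphens so validators see 12 bare digits."""
    return re.sub(r"[\s-]", "", candidate)


def scrub_prompt(prompt: str, is_valid=lambda digits: True) -> str:
    """Redact Aadhaar-shaped numbers in free text, tolerating common separators.

    is_valid is a hypothetical hook for checksum validation; by default every
    candidate is treated as a match.
    """
    def replace(match):
        digits = normalise(match.group())
        return "[REDACTED:AADHAAR]" if is_valid(digits) else match.group()

    return SPACED_AADHAAR_RE.sub(replace, prompt)
```

Pattern-based normalisation handles only the separators it anticipates. Numbers spelled out in words or split across messages still need the semantic detection layer.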
Audit Your Current Log Exposure
Run a retrospective scan on your existing logs. Search your log platform for: patterns matching Aadhaar format (12 consecutive digits), PAN format (5 letters + 4 digits + 1 letter), and UPI format (text@text). Most teams find PII in logs within 5 minutes of starting this search.
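If your log platform lets you export lines, a retrospective scan can be as simple as the sketch below. It counts candidate matches per type and records line numbers only, so the scan report does not itself become a new copy of the PII. The pattern set and the `scan_log_lines` name are assumptions. Matches are candidates, not confirmed identifiers.

```python
import re
from collections import Counter

# Assumed candidate patterns mirroring the formats described above.
PATTERNS = {
    "AADHAAR_CANDIDATE": re.compile(r"(?<!\d)[2-9]\d{11}(?!\d)"),
    "PAN_CANDIDATE": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),
    "UPI_CANDIDATE": re.compile(r"\b[\w.\-]{2,}@[A-Za-z]{2,}\b"),
}


def scan_log_lines(lines):
    """Count candidate PII per type and note line numbers, never the matched text."""
    counts = Counter()
    locations = {name: [] for name in PATTERNS}
    for line_number, line in enumerate(lines, start=1):
        for name, pattern in PATTERNS.items():
            hits = len(pattern.findall(line))
            if hits:
                counts[name] += hits
                locations[name].append(line_number)
    return counts, locations


if __name__ == "__main__":
    sample = ["GET /health 200", "body={'pan': 'ABCDE1234F'}"]
    totals, where = scan_log_lines(sample)
    print(dict(totals), where)
```

Because the function accepts any iterable of strings, it can be fed an open file handle and will stream through large exports line by line.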
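Once found, rotate any affected credentials (if the log contains API keys), notify your DPO, assess whether breach intimation is required under the DPDP Act (Section 8(6) of the Digital Personal Data Protection Act, 2023 requires a Data Fiduciary to inform the Data Protection Board and affected Data Principals of a personal data breach, and personal data of Indian residents exposed to a third-party log service may qualify), and implement the preventative patterns above.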
Compliance operational checklist
Treat the guidance in this article as an operating control, not only as reference material. The minimum checklist is a data inventory, a stated processing purpose, owner approval, PII detection at the AI boundary, redaction or tokenisation where possible, retention limits, vendor transfer records, and a tested user-rights workflow. This checklist gives engineering and compliance teams a shared language for deciding what must be blocked, what can be allowed in shadow mode, and what needs human review before production release.
For AI systems, the review should include prompts, retrieved context, tool call arguments, model responses, logs, traces, analytics events, exports, and support attachments. Many incidents happen because teams scan only the visible form field while sensitive data moves through background context or observability tooling. CrewCheck's recommended pattern is to place the scanner at the request boundary, record the policy version, and keep audit evidence that shows which identifiers were detected and what action was taken.
A practical rollout starts with representative samples from production-like traffic. Run a DPDP scan, sort findings by identifier sensitivity and blast radius, fix Aadhaar, PAN, financial, health, children's, and precise-location exposure first, then move to consent wording, retention, deletion, and vendor review. Use shadow mode when false positives could disrupt users, and promote to enforcement only after the exceptions have owners and expiry dates.
This page is educational and should be paired with legal review for final policy interpretation. The operational proof should still come from repeatable evidence: scanner results, audit exports, pull-request checks, policy configuration, and a documented owner for the workflow. That combination is what makes the content useful during buyer diligence, board review, regulatory questions, or an incident investigation.