How our PII Guardian protects submitters
An overview of the PII detection layer that masks emails, phone numbers, Turkish national IDs, IBANs, credit cards, API keys, IP addresses, and tracking URLs before storage.
Why PII masking is non-negotiable
When a user submits an AI incident, they often include screenshots, transcripts, or file uploads. These may contain personal data, whether their own or someone else's. Storing raw PII in our database would create both regulatory and ethical risk.
What PII Guardian does
PII Guardian is a deterministic, edge-runtime-safe function that runs **before** any data is written to the database. It detects and masks:
- Email addresses
- Phone numbers (international and TR local)
- Turkish national IDs (TC Kimlik)
- IBANs
- Credit card numbers (with Luhn validation)
- API keys (AWS, Google, GitHub PATs)
- IP addresses
- URLs containing tracking parameters
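To illustrate the Luhn validation mentioned for credit card numbers, here is a minimal sketch of the standard mod-10 checksum. The function name `passesLuhn` and the 13 to 19 digit length bound are assumptions made for this example, not taken from the PII Guardian source. A check like this helps avoid masking arbitrary long digit runs that merely look like card numbers.

```typescript
// Hypothetical helper: returns true when a candidate string passes the Luhn mod-10 check.
function passesLuhn(candidate: string): boolean {
  // Strip the spaces and dashes that commonly separate card number groups.
  const digits = candidate.replace(/[\s-]/g, "");
  // Assumption: accept only 13 to 19 digits, the usual range for payment cards.
  if (!/^\d{13,19}$/.test(digits)) return false;

  let sum = 0;
  let doubleNext = false; // the rightmost digit is not doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // ASCII code of "0" is 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9; // same as adding the two digits of the product
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}
```

With this sketch, the well-known test number `4111 1111 1111 1111` passes, while changing its last digit to `2` makes the check fail.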
The algorithm
Each pattern is a regex or finite-state machine. The masker runs in a single pass, replacing matched substrings with placeholder tags (e.g., `[EMAIL]`, `[PHONE]`, `[TC_KIMLIK]`).
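One way to get single-pass behaviour with regex detectors is to join them into one alternation and let the replacement callback decide which tag to emit. The sketch below assumes three deliberately simplified patterns (email, unspaced Turkish mobile numbers, and 11-digit TC Kimlik candidates); the names `COMBINED` and `maskText` are hypothetical, and the real patterns may differ.

```typescript
// Hypothetical, simplified patterns. Each alternative is exactly one capture group.
const COMBINED = new RegExp(
  [
    "([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,})", // group 1: email
    "((?:\\+90|0)5\\d{9})", // group 2: TR mobile number without separators
    "(\\b[1-9]\\d{10}\\b)", // group 3: 11-digit TC Kimlik candidate (no checksum here)
  ].join("|"),
  "g"
);

// Scans the input once; whichever group matched determines the placeholder tag.
function maskText(input: string): string {
  return input.replace(
    COMBINED,
    (_match: string, email?: string, phone?: string): string => {
      if (email !== undefined) return "[EMAIL]";
      if (phone !== undefined) return "[PHONE]";
      return "[TC_KIMLIK]"; // only group 3 remains
    }
  );
}

const masked = maskText("Contact ali@example.com or 05321234567, ID 10000000146");
// masked === "Contact [EMAIL] or [PHONE], ID [TC_KIMLIK]"
```

Because the engine reports the leftmost match at each position, overlapping detectors never rewrite each other's output, which is harder to guarantee when each pattern runs as its own pass.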
The detection metadata is returned separately so the system can flag submissions that contain high-risk PII.
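As a rough sketch of how such metadata could drive flagging, the example below counts detections by type and marks a submission as high-risk when certain types appear. The `Detection` shape, the `summarize` function, and the choice of which types count as high-risk are all assumptions for illustration; the metadata carries only types and offsets, never the raw values.

```typescript
// Hypothetical detection record: what was found and where, never the raw value.
type PiiType = "EMAIL" | "PHONE" | "TC_KIMLIK" | "IBAN" | "CREDIT_CARD" | "API_KEY";

interface Detection {
  type: PiiType;
  start: number; // offset of the match in the original text
  end: number; // offset just past the match
}

interface RiskSummary {
  counts: Partial<Record<PiiType, number>>;
  highRisk: boolean;
}

// Assumption: which types are "high-risk" is a policy choice made up for this sketch.
const HIGH_RISK: ReadonlyArray<PiiType> = ["TC_KIMLIK", "IBAN", "CREDIT_CARD", "API_KEY"];

// Aggregates detections into per-type counts plus a single high-risk flag.
function summarize(detections: Detection[]): RiskSummary {
  const counts: Partial<Record<PiiType, number>> = {};
  let highRisk = false;
  detections.forEach((d) => {
    counts[d.type] = (counts[d.type] ?? 0) + 1;
    if (HIGH_RISK.indexOf(d.type) !== -1) highRisk = true;
  });
  return { counts, highRisk };
}
```

A submission containing only email addresses would then be stored masked but unflagged, while one containing an IBAN or API key would be stored masked and flagged for follow-up.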
Auditability
PII Guardian is open source under AGPL-3.0. Anyone can audit the patterns, run their own test suite, and propose improvements. We believe trust infrastructure should itself be trustworthy.