PDF redaction & PII verification
Make sure “redacted” really means redacted — and that no PII is hiding in plain text. Drop a file for a free structural scorecard — no signup.
Detects unapplied /Redact annotations, invisible text, and PII candidates (SSN, credit card, email, phone) in PDF text. The checks are a deterministic subset of what can leak, not OCR; matched values are masked.
Botched redactions are a notorious and costly source of leaks: a black box drawn over text, or an Adobe redaction mark that was placed but never applied, leaves the content fully extractable. Preflight’s Redaction & PII module scans each PDF for unapplied /Redact annotations and invisible text, and pattern-matches the extractable text for PII (SSNs, Luhn-validated credit card numbers, email addresses, and phone numbers), reporting each match as a candidate with the value masked. It’s deterministic and privacy-first: nothing sensitive is stored, and for image-only PDFs the PII scan is reported as not-evaluable rather than as a false clean.
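To make the pattern-matching step concrete, here is a minimal Python sketch of candidate detection with Luhn validation and masking. It is an illustration only, not Preflight's implementation: the regular expressions are simplified, US-centric assumptions, and the names `find_pii_candidates`, `luhn_valid`, and `mask` (including the keep-last-four masking rule) are hypothetical.

```python
import re

# Illustrative, US-centric patterns (assumptions, not Preflight's actual rules).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13-19 digits, optional separators
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def mask(value: str, keep: int = 4) -> str:
    """Hide everything except the last `keep` characters (hypothetical rule)."""
    if len(value) <= keep:
        return "*" * len(value)
    return "*" * (len(value) - keep) + value[-keep:]


def find_pii_candidates(text: str) -> list[tuple[str, str]]:
    """Return (kind, masked_value) pairs; the raw value is never returned."""
    findings = []
    for match in SSN_RE.finditer(text):
        findings.append(("ssn", mask(match.group())))
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):  # drop digit runs that fail the checksum
            findings.append(("credit_card", mask(digits)))
    for match in EMAIL_RE.finditer(text):
        findings.append(("email", mask(match.group())))
    for match in PHONE_RE.finditer(text):
        findings.append(("phone", mask(match.group())))
    return findings
```

The Luhn check is what keeps an arbitrary 16-digit run (an order number, say) from being reported as a card number. Everything returned is still only a candidate and needs human review.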
How to verify a redaction
- Apply, don’t just mark — in your editor, apply/flatten redactions so the content is removed, not merely covered by a box.
- Remove hidden text layers — strip invisible OCR/text layers that still carry the sensitive content.
- Scan the extractable text — confirm no SSNs, card numbers, emails, or phone numbers remain selectable.
- Re-validate the final file before release — a black box is not a redaction until the text underneath is gone.
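As a rough illustration of the re-validation step, the Python sketch below looks for pending redaction marks by searching the raw PDF bytes for annotation dictionaries whose subtype is /Redact. It is a crude heuristic under stated assumptions, not how Preflight works: the function name `count_pending_redact_marks` is hypothetical, and annotations stored inside compressed object streams will not be visible to a raw byte search, so a count of zero proves nothing on its own.

```python
import re

# Matches "/Subtype /Redact" or "/Subtype/Redact" in uncompressed PDF objects.
REDACT_ANNOT_RE = re.compile(rb"/Subtype\s*/Redact\b")


def count_pending_redact_marks(pdf_bytes: bytes) -> int:
    """Count redaction annotations that appear in plain (uncompressed) form.

    Heuristic only: objects inside compressed object streams are not seen,
    so treat 0 as "none found", never as "none present".
    """
    return len(REDACT_ANNOT_RE.findall(pdf_bytes))


def read_and_count(path: str) -> int:
    """Convenience wrapper: read a file from disk and run the heuristic."""
    with open(path, "rb") as handle:
        return count_pending_redact_marks(handle.read())
```

Any non-zero count means the file still carries a redaction mark that was placed but not applied, so the content under it should be assumed extractable until the redaction is applied and the file is checked again.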
Frequently asked questions
Why isn’t a black box a redaction?
Drawing a filled rectangle over text leaves the text in the file: fully selectable, searchable, and extractable. Real redaction removes the underlying content rather than just hiding it.
What is an unapplied redaction?
Editors like Acrobat let you mark content for redaction; until you apply it, the content is still present. Preflight flags pending /Redact annotations so a "marked but not applied" file never ships.
Does Preflight detect PII?
Yes — it flags SSN, credit-card (Luhn-validated), email, and phone candidates in the extractable text, with the value masked. These are candidates: verify before acting.
Is my sensitive data safe?
Yes. Matched values are masked in every finding, and Preflight never stores or logs the raw content (tenant-isolated storage, 24-hour purge).
What about scanned (image-only) PDFs?
With no text layer there is nothing to extract, so the PII scan is reported not-evaluable — never a false "clean." Preflight does not OCR.
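The distinction between "clean" and "not-evaluable" can be sketched as a three-state result in Python. This is a hypothetical illustration of the idea, not Preflight's code; `PiiScanStatus` and `classify_pii_scan` are invented names.

```python
from enum import Enum


class PiiScanStatus(Enum):
    CLEAN = "clean"
    CANDIDATES = "candidates"
    NOT_EVALUABLE = "not-evaluable"


def classify_pii_scan(extracted_text: str, candidate_count: int) -> PiiScanStatus:
    """Map a scan outcome to a status without ever inferring 'clean' from no text."""
    if not extracted_text.strip():
        # Nothing to scan (for example an image-only PDF): refuse to call it clean.
        return PiiScanStatus.NOT_EVALUABLE
    if candidate_count > 0:
        return PiiScanStatus.CANDIDATES
    return PiiScanStatus.CLEAN
```

Checking for an empty text layer before looking at the candidate count is the point of the design: zero matches in zero text is reported as not-evaluable, while zero matches in real text is reported as clean.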