Konomic
PDF Redaction

PDF redaction is the permanent removal of sensitive information from a document — not just covering it visually, but deleting it from the underlying file.

Why real redaction matters

The most common redaction mistake is drawing a black rectangle over text. The page looks redacted, but the text is still in the file: anyone can select it, copy it, or pull it out with a text-extraction tool. There have been major legal and government scandals in which classified information leaked because of fake redaction.

True redaction deletes the underlying content bytes and replaces them with a visible redaction mark (typically a black rectangle with "REDACTED" or a reason code).
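To make the difference concrete, here is a minimal Python sketch that works on a toy, uncompressed page content stream. The stream contents and the helper name shown_strings are made up for illustration; real PDFs usually compress their streams and encode text in more ways than this simple pattern handles.

```python
import re

# Toy, uncompressed PDF content stream: the text is drawn first,
# then a filled black rectangle is painted over the same area.
covered = (
    b"BT /F1 12 Tf 72 720 Td (SSN 123-45-6789) Tj ET\n"
    b"0 0 0 rg 70 716 130 14 re f\n"
)

# The same page after a true redaction: the text-showing operator is gone
# and only the black box remains.
redacted = b"0 0 0 rg 70 716 130 14 re f\n"


def shown_strings(content_stream):
    """Return the literal strings passed to the Tj (show text) operator."""
    found = re.findall(rb"\((.*?)\)\s*Tj", content_stream)
    return [raw.decode("latin-1") for raw in found]


print(shown_strings(covered))   # the covered text is still in the stream
print(shown_strings(redacted))  # nothing left to extract
```

With the covered stream the SSN is still returned even though the rectangle hides it on screen; with the redacted stream the list is empty.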

What should be redacted

  • Social Security numbers and tax IDs
  • Credit card and bank account numbers
  • Medical information (under HIPAA)
  • Trade secrets and confidential business info
  • Personally identifiable information (PII) in court filings
  • Attorney-client privileged content
  • Children's names in legal documents
  • Witness identities in sensitive cases

Proper redaction workflow

  1. Make a working copy — never redact the original
  2. Identify everything to redact — don't skim; use search to find every instance
  3. Mark for redaction — select text, images, or regions
  4. Apply redaction — this step actually removes content
  5. Flatten the document — merge layers so redactions can't be undone
  6. Verify — open the redacted copy, try to select the blacked-out text, run OCR — nothing should be recoverable
  7. Strip metadata — file properties, revision history, and XMP data can leak info
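
Steps 2 and 6 lend themselves to automation. The sketch below assumes the page text has already been extracted with a PDF library of your choice. The two patterns and the function names find_sensitive and leftover are illustrative assumptions, not a complete PII detector, so treat a clean result as a sanity check rather than proof.

```python
import re

# Hypothetical patterns for illustration; real documents need broader coverage.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def find_sensitive(pages):
    """Return (page_number, kind, matched_text) for every hit; pages are 1-indexed."""
    hits = []
    for number, text in enumerate(pages, start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                hits.append((number, kind, match.group()))
    return hits


def leftover(redacted_pages, hits):
    """Return the originally matched strings that still appear in the redacted text."""
    joined = "\n".join(redacted_pages)
    return sorted({value for _, _, value in hits if value in joined})
```

Run find_sensitive on the working copy's text to build the redaction list, then run leftover on text extracted from the redacted copy. Any value it returns is an instance that was missed.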

Common redaction pitfalls

  • Black highlighter — just a highlight with black color, text is still there
  • Black rectangle annotation — a shape drawn on top, text still underneath
  • White text on white background — invisible but still extractable
  • Image redaction with transparency — black box might be semi-transparent
  • Forgetting metadata — file properties, comments, form fields
  • Incomplete search — missing one instance of an SSN in a 500-page file

Redaction vs blurring vs pixelation

Blurring or pixelation in images can sometimes be reversed using AI tools or specialized software. For genuine privacy protection, use solid black redaction boxes and verify the underlying data is actually gone.
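
One reason pixelation is weak: it is deterministic, so anyone who can guess the font and layout can pixelate every candidate value the same way and compare. The Python sketch below illustrates the idea with a made-up 3x5 digit font and simple block summing; the font, the cell layout, and the function names are all assumptions for the example, and real attacks on photographs or blurred text are considerably more involved.

```python
from itertools import product

# Hypothetical 3x5 bitmap font for digits, used only for this illustration.
FONT = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "2": ["111", "001", "111", "100", "111"],
    "3": ["111", "001", "111", "001", "111"],
    "4": ["101", "101", "111", "001", "001"],
    "5": ["111", "100", "111", "001", "111"],
    "6": ["111", "100", "111", "101", "111"],
    "7": ["111", "001", "001", "001", "001"],
    "8": ["111", "101", "111", "101", "111"],
    "9": ["111", "101", "111", "001", "111"],
}


def render(digits):
    """Draw a digit string as a 6-row bitmap, each glyph padded to a 4x6 cell."""
    rows = [[] for _ in range(6)]
    for ch in digits:
        glyph = FONT[ch] + ["000"]  # one blank row below the glyph
        for y in range(6):
            rows[y].extend(int(bit) for bit in glyph[y] + "0")  # one blank column
    return rows


def pixelate(image, block=2):
    """Replace each block x block tile with the sum of its pixels (a scaled average)."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(image[y + dy][x + dx] for dy in range(block) for dx in range(block))
            for x in range(0, width, block)
        ]
        for y in range(0, height, block)
    ]


def recover(pixelated, length):
    """Return every digit string of the given length whose pixelation matches."""
    return [
        "".join(candidate)
        for candidate in product("0123456789", repeat=length)
        if pixelate(render("".join(candidate))) == pixelated
    ]


print(recover(pixelate(render("4172")), 4))
```

Here the four-digit value comes back as the only match. With this particular font the digits 0 and 9 happen to pixelate identically, so some values are only narrowed down rather than pinpointed, which is why the reversal works sometimes rather than always. A solid box over deleted content leaves nothing to compare against.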
