Microsoft Word, and specifically the Word .doc file format, is widely used to prepare documents, reports, notes and other formal and informal materials across the commercial, governmental and public sectors.

A second format, PDF, is also used pervasively in the business and government sectors to exchange files, publish to the web and deliver interactive content such as forms and multimedia.

For many years the standard procedure across sectors has been to convert Word documents to PDF, not only for ease of distribution but also to prevent further editing and add a layer of security to the document.

In many cases this process is enough; however, simply saving a document as a PDF does not guarantee security. Additional steps are required to completely remove potentially sensitive, redacted or hidden information.

From the National Security Agency:

“Despite this common use of PDF documents, users who distribute these files often underestimate the possibility that they might contain hidden data or metadata. This document identifies the risks that can be associated with PDF documents and gives guidance that can help users reduce the unintentional release of sensitive information.”

Word Document Sanitization Basic Procedure

  1. Create a copy of the document
  2. Turn off reviewing features and remove associated data
  3. Review and delete sensitive content
  4. Check redacted content and run document inspector
  5. Verify Acrobat conversion settings and convert

See Page 12 of this recommended NSA Document for detailed procedures
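
As a rough supplement to steps 2 and 4 above, the following Python sketch counts tracked-change, hidden-text and comment markup still present in a Word file. It assumes the copy has been saved in the .docx format (a zip package of XML parts, the default since Word 2007) and does not read the older binary .doc format. The function name find_review_residue and the list of tags checked are illustrative choices, not part of the NSA procedure, and a clean result does not replace Word's own Document Inspector.

```python
import re
import zipfile

# WordprocessingML elements that carry revision or hidden-text data.
# w:vanish marks hidden text; this sketch does not check whether a
# w:val="false" attribute switches it off, so treat hits as leads to review.
BODY_TAGS = ("w:ins", "w:del", "w:moveFrom", "w:moveTo", "w:vanish")


def count_tag(tag, xml_text):
    """Count opening tags named exactly `tag` (so w:ins never matches w:insideH)."""
    return len(re.findall(r"<%s[\s>/]" % re.escape(tag), xml_text))


def find_review_residue(source):
    """Return {tag: count} for review markup found in a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    counts = {}
    with zipfile.ZipFile(source) as package:
        body = package.read("word/document.xml").decode("utf-8")
        for tag in BODY_TAGS:
            hits = count_tag(tag, body)
            if hits:
                counts[tag] = hits
        # Comments live in their own part, present only if the file has any.
        if "word/comments.xml" in package.namelist():
            comments = package.read("word/comments.xml").decode("utf-8")
            hits = count_tag("w:comment", comments)
            if hits:
                counts["w:comment"] = hits
    return counts
```

Calling find_review_residue on the working copy (for example a hypothetical report-copy.docx) returns an empty dictionary when none of these markers are found; any non-empty result is a reason to go back through the steps above before converting.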

PDF Sanitization Basic Procedure

  1. Sanitize the Source File
  2. Configure Security Settings
  3. Run Preflight
  4. Run the PDF Optimizer
  5. Run the Examine Document Utility

See Page 19 of this recommended NSA Document for detailed procedures
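
A quick first pass before running the Acrobat tools listed above can be scripted. The sketch below scans the raw bytes of a PDF for name tokens that commonly indicate metadata, scripts, attachments, annotations or form fields. The marker list and the function name scan_pdf_markers are assumptions made for illustration. The scan only sees uncompressed parts of the file, so names stored inside compressed object streams or written with hex escapes will be missed, and it should be read as a hint rather than a verdict.

```python
import re

# PDF name tokens that often point at metadata or non-visible content.
MARKERS = [
    b"/Author", b"/Creator", b"/Producer", b"/Title", b"/Metadata",
    b"/JavaScript", b"/EmbeddedFile", b"/Annots", b"/AcroForm",
]


def scan_pdf_markers(data):
    """Return {marker: count} for marker names found in raw PDF bytes."""
    found = {}
    for marker in MARKERS:
        # Require a PDF delimiter or whitespace after the name so that
        # /Title does not match a longer name such as /TitleFoo.
        pattern = re.escape(marker) + rb"(?=[\s/<>\[\]()])"
        hits = len(re.findall(pattern, data))
        if hits:
            found[marker.decode("ascii")] = hits
    return found
```

To use it, read the file in binary mode and pass the bytes, for example scan_pdf_markers(open(path, "rb").read()) with path pointing at the converted PDF. Running it before and after sanitization gives a rough indication of whether the steps removed what they were meant to remove.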

Additional Documentation

Redacting with Confidence: How to Safely Publish Sanitized Reports Converted From Word 2007 to PDF: http://www.fas.org/sgp/othergov/dod/nsa-redact.pdf

Hidden Data and Metadata in Adobe PDF Files: Publication Risks and Countermeasures: http://www.nsa.gov/ia/_files/app/pdf_risks.pdf

Remove tracked changes and comments from a document: http://office.microsoft.com/en-us/word-help/remove-tracked-changes-and-comments-from-a-document-HA101822263.aspx

Tools

Doc Scrubber: http://download.cnet.com/Doc-Scrubber/3000-2079_4-12599674.html

 

When releasing information to the public in a Word or PDF document, make sure that the file contains only the intended content. Word’s ‘Inspect Document’ feature and Adobe’s ‘Examine Document’ tool supplement document review, but are not intended to wholly replace the redaction process. The sanitization procedures outlined in this post reduce the likelihood of leaving hidden data, metadata and redacted content in the final Word or PDF file.