One Step Ahead
April 28, 2009, Volume 55, No. 31

Another tip in a series provided by the Offices of Information Systems & Computing and Audit, Compliance & Privacy.

Sanitize Word, Excel and PowerPoint Docs Before Publishing

In 2005, the United Nations issued a report on Syria’s suspected involvement in the assassination of Lebanon’s former prime minister, Rafik Hariri. Recipients of some versions could view the tracked editing changes, which revealed that the names of officials allegedly involved in the plot had been deleted, among them the Syrian president’s brother and brother-in-law.

Word, Excel and PowerPoint documents have complex, sophisticated data formats. A single file can contain text, graphics, images, video, audio, tables, metadata and more. This complexity makes them potential vehicles for exposing information unintentionally, especially when sensitive portions were deleted from earlier drafts or when the file still contains sensitive edits or comments.
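To illustrate how deleted text and comments can linger in a file, here is a minimal Python sketch. It assumes the newer .docx format (a zip archive of XML parts), not the older binary .doc format, and the function name find_leftovers is hypothetical. It only inspects the main document body and the comments part, so treat it as a quick check rather than a complete audit.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the XML parts inside a .docx file
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def find_leftovers(path):
    """Return (deletions, comments) still stored in a .docx file.

    Hypothetical helper for illustration: it looks only at
    word/document.xml and word/comments.xml.
    """
    with zipfile.ZipFile(path) as zf:
        doc = ET.fromstring(zf.read("word/document.xml"))
        # Tracked deletions keep the removed text in <w:delText> elements
        deletions = [el.text or "" for el in doc.iter(W + "delText")]

        comments = []
        if "word/comments.xml" in zf.namelist():
            croot = ET.fromstring(zf.read("word/comments.xml"))
            for comment in croot.iter(W + "comment"):
                text = "".join(t.text or "" for t in comment.iter(W + "t"))
                comments.append(text)
    return deletions, comments
```

If either list comes back non-empty, the document still carries material that a recipient could read.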

Saving documents in Adobe’s Portable Document Format (PDF) is a common way to try to sanitize them, but it is difficult to do correctly: Office document metadata can sometimes be preserved as PDF metadata. Metadata is information stored within a document that is not evident just by looking at the document. It may include the name of the author; the names of individuals who have viewed and edited the document; the dates, times and durations of their edits; and the location from which the document was accessed.
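As a rough way to see what metadata a file carries before publishing it, the following Python sketch reads the core properties part of an Office Open XML file (.docx, .xlsx or .pptx). The function name read_core_properties is hypothetical, and the sketch assumes the newer zip-based formats; it does not cover older binary Office files or every place metadata can be stored.

```python
import zipfile
import xml.etree.ElementTree as ET

def read_core_properties(path):
    """Return the core metadata fields of an Office Open XML file as a dict.

    Hypothetical helper for illustration: it reads only docProps/core.xml,
    which typically holds fields such as creator and lastModifiedBy.
    """
    with zipfile.ZipFile(path) as zf:
        if "docProps/core.xml" not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read("docProps/core.xml"))

    props = {}
    for child in root:
        # Drop the "{namespace}" prefix that ElementTree puts on tag names
        name = child.tag.split("}", 1)[-1]
        props[name] = (child.text or "").strip()
    return props
```

Printing the returned dictionary for a draft shows at a glance whether author names or edit dates would travel with the file.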

Common mistakes by authors and reviewers include changing the font color to white, coloring text cells black or electronically pasting black rectangles over redacted text, and then saving as PDF. These steps change only how the text is displayed; the text itself is still stored in the file, so in most cases the redacted text can be recovered from the PDF.
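The same principle can be seen in the source document before it is ever converted to PDF. The Python sketch below is an illustration only: it assumes a .docx file, uses a hypothetical function name (white_text_runs), and checks just one case, text whose font color is explicitly set to white. It does not inspect PDFs, styles or black rectangles.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the XML parts inside a .docx file
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def white_text_runs(path):
    """Return the text of runs whose font color is set directly to white.

    Hypothetical helper for illustration: color applied through styles
    or themes is not detected.
    """
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    hidden = []
    for run in root.iter(W + "r"):
        color = run.find(W + "rPr/" + W + "color")
        if color is not None and color.get(W + "val", "").upper() == "FFFFFF":
            # The text is invisible on a white page but fully present in the file
            hidden.append("".join(t.text or "" for t in run.iter(W + "t")))
    return hidden
```

Any text this returns is invisible on screen yet readable by anyone who opens the file's contents.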

As of this writing, the safest method to electronically sanitize a document is to copy its contents, paste them into a simple text editor such as Windows Notepad or Mac TextEdit, and save the result in plain text format as a .txt file (do not use Rich Text Format, RTF). This method is not always desirable, though, because all formatting, graphics and tables are lost.
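For a batch of files, the plain text approach can be approximated in a script. The Python sketch below is an assumption-laden stand-in for copying and pasting by hand, not a replacement for it: it handles only .docx files, the function names docx_to_plain_text and save_as_txt are hypothetical, and it reads only the main document body (headers, footers, footnotes and text boxes are not handled reliably).

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the XML parts inside a .docx file
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_to_plain_text(path):
    """Return the current body text of a .docx file, one paragraph per line.

    Only <w:t> elements are collected, so text held in tracked
    deletions (<w:delText>) is left out.
    """
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W + "p"):
        paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(paragraphs)

def save_as_txt(docx_path, txt_path):
    """Write the extracted text to a plain .txt file (UTF-8)."""
    with open(txt_path, "w", encoding="utf-8") as out:
        out.write(docx_to_plain_text(docx_path))
```

Text that was hidden only by its color shows up as ordinary text in the output, so read the resulting .txt file before publishing it.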

To sanitize more complex documents, see “How to minimize metadata in Office documents” http://support.microsoft.com/kb/223396 and “Redacting with confidence” www.fas.org/sgp/othergov/dod/nsa-redact.pdf.


To receive weekly One Step Ahead tips via email, send email to listserv@lists.upenn.edu with the following text in the body of the message: sub one-step-ahead <your name>.

For additional tips, see the One Step Ahead link on the Information Security website: www.upenn.edu/computing/security/.

Almanac - April 28, 2009, Volume 55, No. 31