Are your PDFs Actually Redacted? Double Check!
Follow these steps to prevent users from reading sensitive data you thought you had hidden.
PDF is a very common file format for documents, but one thing people may not realise is that the format's complexity can mean some things aren't quite as they seem. Specifically, I've seen numerous "redacted" documents where the user has simply drawn over the text and saved the file. However, under the hood, a PDF page is made up of separate objects, which means that the block or marker you drew may be saved as its own layer (typically an annotation) on top of the text it's covering. The text itself is still in the file. This means anyone who opens the file can simply move the block away and see the text that was underneath!
Drawing over the text that should be redacted:
After saving the file as "Seemingly_Redacted.pdf", and opening it in Acrobat, a user can easily remove the line to expose the text that should be hidden:
Adobe Acrobat Pro (the paid version) supports proper redaction of text, but it can be too expensive to justify, depending on how often you would use its advanced features. In my case, I simply had to send a bank statement as proof of address, and there was no need for the company to see all the transactions on it. Fortunately, the free version of Adobe Acrobat can help with that.
Instead of saving the PDF by either using the "Save" option or "Save As" option, go to "File...Print" in Adobe Acrobat. Set the printer to "Microsoft Print to PDF" (or a similar PDF-related name). If you have comments that you would like appended within the document, you can click the "Summarize Comments" button:
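Click "Print" and it should ask where to save the file. Once you choose a location, it will create a PDF file that is now flattened. That's it! The line or block is now part of the page itself, so it can no longer be moved out of the way to reveal the text. It is still worth trying to select, copy or search for the hidden text in the new file, because some PDF printers may keep the underlying text in the output even though the drawing can no longer be moved. Flattening also means the markup can no longer be edited, so it's advisable to do it once you have the final version you want to send, as opposed to flattening after every edit.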
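If you want to double-check the result from the command line, one option is to try extracting the text and searching for something that should be hidden. The following is a minimal sketch, assuming Poppler's pdftotext is installed; the helper name pdf_leaks_text and the file name in the example are made up for illustration.

```shell
#!/bin/sh
# Sketch only: pdf_leaks_text is a hypothetical helper, not part of any tool.
# It assumes Poppler's pdftotext is available on the PATH.

# Usage: pdf_leaks_text FILE.pdf TERM
# Exit status: 0 = TERM is still extractable as text,
#              1 = TERM was not found in the extracted text,
#              2 = the check itself could not be run.
pdf_leaks_text() {
    command -v pdftotext >/dev/null 2>&1 || {
        echo "pdftotext not found" >&2
        return 2
    }
    # "pdftotext FILE -" writes the extracted text to standard output.
    text=$(pdftotext "$1" - 2>/dev/null) || {
        echo "could not extract text from $1" >&2
        return 2
    }
    # -q: quiet, -i: ignore case, -F: treat TERM as a literal string.
    printf '%s\n' "$text" | grep -qiF -- "$2"
}

# Example: look for an account number that should have been hidden.
# if pdf_leaks_text flattened.pdf "12345678"; then
#     echo "Still readable: do not send this file" >&2
# fi
```

A clean result only means the term was not found as extractable text. It can miss text that is split across lines, and it says nothing about what is visible on the page, so a quick visual check is still a good idea.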
Things to be aware of
Flattening PDFs can cause a decrease in quality but generally should retain a high enough quality for screen usage.
Flattening a PDF file can sometimes lead to errors or inconsistencies in the document, especially if there are complex graphics. Always proofread afterwards to ensure everything looks as expected.
Finally, flattening can also affect OCR and accessibility tools, making the document harder to use with them. Consider your audience and requirements before flattening.
However, for a lot of use cases, the downsides to flattening are outweighed by the pros.
The paid version of Acrobat, and some other PDF tools, offer proper redaction features, so depending on your requirements it may be better value to use one of those. Additionally, PDFs offer other security features such as encryption and password protection, but this post is specifically about ensuring text that's been drawn over stays hidden.
Advanced usage using ImageMagick
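For those who want to automate this, you can use a command-line tool like ImageMagick (it relies on Ghostscript to read PDFs). The following command renders each page as an image at 300 DPI and writes the images back into a PDF, so the original text objects are no longer in the file. You can toy around with the DPI value if you need to: the higher the value, the sharper the result and the bigger the file size. Note that it uses "-alpha remove" rather than "-flatten", because "-flatten" merges every page of a multi-page PDF into a single image. On ImageMagick 7, use "magick" in place of "convert".
convert -density 300 input.pdf -background white -alpha remove -alpha off -quality 100 flattened_output.pdf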
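To run this over several files, the command can be wrapped in a small shell function. This is a sketch under a few assumptions: ImageMagick and Ghostscript are installed, the helper name flatten_pdf and the "flattened" output folder are made up for illustration, and "-alpha remove" is used instead of "-flatten" so that multi-page PDFs keep their separate pages.

```shell
#!/bin/sh
# Sketch only: flatten_pdf is a hypothetical helper name.
# Assumes ImageMagick (with Ghostscript for PDF input) is installed.

# Usage: flatten_pdf INPUT.pdf OUTPUT.pdf [DPI]
flatten_pdf() {
    dpi=${3:-300}   # default to 300 DPI when no third argument is given
    # ImageMagick 7 provides "magick"; version 6 only has "convert".
    if command -v magick >/dev/null 2>&1; then
        im=magick
    else
        im=convert
    fi
    # -density comes before the input so the PDF is rendered at that DPI.
    # -background white -alpha remove paints transparent areas white,
    # page by page, without merging the pages together.
    "$im" -density "$dpi" "$1" -background white -alpha remove -alpha off \
        -quality 100 "$2"
}

# Example: flatten every PDF in the current directory into ./flattened/
# mkdir -p flattened
# for f in *.pdf; do
#     flatten_pdf "$f" "flattened/$f" 300
# done
```

As a rough guide to file size, an A4 page (about 8.27 x 11.69 inches) rendered at 300 DPI comes out at roughly 2480 x 3508 pixels, and halving the DPI cuts the pixel count to a quarter.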
I hope that helps. Feel free to leave any comments or questions below.