FR: Sanitize Document Tool to Preserve Text Content

Forum for the PDF-XChange Editor - Free and Licensed Versions

PHK
User
Posts: 1571
Joined: Tue Nov 24, 2020 4:02 pm

FR: Sanitize Document Tool to Preserve Text Content

Post by PHK »

As it is, the Sanitize Document tool does a great job of reducing file bloat, but at a cost: text content and link actions that point to external files seem to be lost.

There does not seem to be much documentation on what the Sanitize Document tool does and does not do, but my experiments indicate there is no way to avoid removing text content. Yes, I know I can OCR the document to recreate that content, but why not offer an option to preserve text content or, at least, to re-OCR automatically as a final step before the Sanitize Document tool closes? I note that OCRing after sanitizing does not increase file size.

Of secondary interest to me is the cleansing of actions, which removes link actions that open pages in files other than the current one, and maybe others that I have not explored. I note, however, that links to other places within the open document still work. It would be nice to have the option of preserving all link actions, too.

Thank you in advance for considering these suggestions.
All best,

FringePhil
Daniel - PDF-XChange
Site Admin
Posts: 12856
Joined: Wed Jan 03, 2018 6:52 pm

Re: FR: Sanitize Document Tool to Preserve Text Content

Post by Daniel - PDF-XChange »

Hello, PHK

The Sanitize document tool is not designed with data preservation in mind; it is a redaction function, intended to remove potentially sensitive information. Adding an option to prevent removal of content, for a tool which was specifically designed to remove that content to begin with, is a bit counterintuitive when tools to do the same without that removal already exist.

If you are simply looking to reduce file size and do not wish to lose base content accessibility, then you should be using the "save as optimized" function instead. The categories it offers are much better at fine-tuning the output, to ensure a reduction in file size, without losing base content.

Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
[email protected]
PHK
User
Posts: 1571
Joined: Tue Nov 24, 2020 4:02 pm

Re: FR: Sanitize Document Tool to Preserve Text Content

Post by PHK »

TrackerSupp-Daniel wrote: Mon Aug 05, 2024 8:26 pm Hello, PHK

The Sanitize document tool is not designed with data preservation in mind; it is a redaction function, intended to remove potentially sensitive information. Adding an option to prevent removal of content, for a tool which was specifically designed to remove that content to begin with, is a bit counterintuitive when tools to do the same without that removal already exist.

If you are simply looking to reduce file size and do not wish to lose base content accessibility, then you should be using the "save as optimized" function instead. The categories it offers are much better at fine-tuning the output, to ensure a reduction in file size, without losing base content.

Kind regards,
Well, here I go again, using tools to do things they weren't created for. The SD tool seems to reduce file sizes far more than the OF tool; maybe it is more brutal. It certainly does not like graphic images, so one should stick to all-text documents when using the SD tool.

P. S. to my original post above: if I export the file links before I sanitize and then import that mini-file after (which is not too burdensome), I get all my links and their actions, which is good.
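The export-then-import workaround described above can be pictured with a small, self-contained sketch. This is only a toy model, not PDF-XChange Editor's API or file format: the dictionary layout and the names `export_links`, `sanitize`, and `import_links` are hypothetical, and the `sanitize` stand-in simply mimics the behaviour reported in this thread (text content and links to other files are dropped, links within the same document survive).

```python
import json
from copy import deepcopy

# Hypothetical in-memory stand-in for a PDF: each page has text and link actions.

def export_links(doc):
    """Return a JSON string holding every link, keyed by page index."""
    return json.dumps({str(i): page["links"] for i, page in enumerate(doc["pages"])})

def sanitize(doc):
    """Toy stand-in for sanitizing: drops text and links that target other files."""
    clean = deepcopy(doc)
    for page in clean["pages"]:
        page["text"] = ""
        page["links"] = [link for link in page["links"] if link["target_file"] is None]
    return clean

def import_links(doc, exported):
    """Restore the links saved by export_links onto the matching pages."""
    restored = deepcopy(doc)
    for index, links in json.loads(exported).items():
        restored["pages"][int(index)]["links"] = links
    return restored

doc = {"pages": [{"text": "Chapter 1", "links": [
    {"target_file": None, "target_page": 3},            # link inside this document
    {"target_file": "appendix.pdf", "target_page": 1},  # link to another file
]}]}

saved = export_links(doc)                        # step 1: export links first
result = import_links(sanitize(doc), saved)      # steps 2 and 3: sanitize, then re-import
print(len(result["pages"][0]["links"]))
```

In this model the re-import brings back every link, but the text content stays removed, which matches the observation that OCR is still needed afterwards.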

But I appreciate your kicking my FR along and I hope something comes of it.
All best,

FringePhil
Daniel - PDF-XChange
Site Admin
Posts: 12856
Joined: Wed Jan 03, 2018 6:52 pm

FR: Sanitize Document Tool to Preserve Text Content

Post by Daniel - PDF-XChange »

:)