Detailed document statistics

Puffolino
User
Posts: 337
Joined: Wed Feb 09, 2011 1:06 pm

Detailed document statistics

Post by Puffolino »

I like the given statistics of a document (which can be seen before optimizing a document) but would like to get more detailed information.
I'm not sure which of these could be calculated (any single one would be nice):
- pie chart to demonstrate the size distribution
- size for each page (maybe by showing a bar graph to detect "large" pages easily)
- something like a top 20 statistics (for example largest images in the document)

When optimizing a document, it would be interesting if some additional enhancements could be implemented at some point:
- removing only orphaned layers or empty XForms
- intelligent selection of useful optimizing settings, or indicators next to options showing whether optimizing would be useless or effective
- predicting the (theoretical) reduction when selecting an optimizing option
Daniel - PDF-XChange
Site Admin
Posts: 12030
Joined: Wed Jan 03, 2018 6:52 pm

Re: Detailed document statistics

Post by Daniel - PDF-XChange »

Hi, Puffolino

Thank you for the suggestions. I will break down my replies point by point to keep this simple.
  • The pie chart representation may be possible, but in almost all cases a document consists of ~99% fonts, images, and "document overhead", with ~15 other categories making up the remainder. As such, this depiction would not be particularly useful in the vast majority of situations. If you ever do need to check what your document is primarily made up of, the "audit space usage" button in the "save as optimized" function can show you. After some discussion, the Dev team asked me to make a ticket on this, so I created one for placing the "audit space usage" button in the Document Properties window and adding a custom command that can be placed anywhere on the UI:
    RT#5252: Add "Audit space usage" Command in other locations
  • Size per page would not be entirely possible, as data is often not stored as part of the page in question. You could have the same set of stamps or form fields on multiple pages; these items are not actually stored on any one page, but are stored separately, and the pages refer to them. As such, I can say with a good amount of confidence that this feature will not be implemented at this time.
  • As above, the "save as optimized" function offers an "audit space usage" function, which is effectively a "top 20" of the content types your document consists of.
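
To illustrate the shared-object point, here is a minimal Python sketch built on a made-up object table. The object names and byte sizes are hypothetical and are not taken from a real PDF or from PDF-XChange internals; the sketch only shows why a "size per page" figure is ambiguous once pages share resources.

```python
# Toy model of a document: object name -> size in bytes (hypothetical values).
object_sizes = {
    "font_A": 40_000,
    "stamp_img": 120_000,
    "page1_text": 3_000,
    "page2_text": 5_000,
}

# Pages only *reference* objects; the font and the stamp are shared by both pages.
page_refs = {
    1: ["font_A", "stamp_img", "page1_text"],
    2: ["font_A", "stamp_img", "page2_text"],
}


def naive_page_sizes(page_refs, object_sizes):
    """Sum every referenced object per page; shared objects are counted once per page."""
    return {
        page: sum(object_sizes[name] for name in refs)
        for page, refs in page_refs.items()
    }


def exclusive_page_sizes(page_refs, object_sizes):
    """Count only the objects that are referenced by exactly one page."""
    usage = {}
    for refs in page_refs.values():
        for name in set(refs):
            usage[name] = usage.get(name, 0) + 1
    return {
        page: sum(object_sizes[name] for name in refs if usage[name] == 1)
        for page, refs in page_refs.items()
    }


file_total = sum(object_sizes.values())
naive = naive_page_sizes(page_refs, object_sizes)
exclusive = exclusive_page_sizes(page_refs, object_sizes)

print("stored objects total:", file_total)
print("naive per-page sizes:", naive, "sum:", sum(naive.values()))
print("exclusive per-page sizes:", exclusive, "sum:", sum(exclusive.values()))
```

In this toy model the naive per-page sum is far larger than the stored total, while the exclusive sum is far smaller, so neither number answers "how big is this page" in a way that adds up to the file size.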
As for the optimization requests, here is the next point-by-point breakdown.
  • The "discard hidden layer content" option in the "discard user data" category should already do this, but there would certainly be utility in removing empty layers. The Devs have said that this should be doable, and we will look at offering it in the future. Another ticket was created here:
    RT#5253: Save as optimized "remove empty layers"
  • It is funny that you mention this, as we have been discussing the possibility of having an "easy mode" toggle for this function specifically just this past week. So far we have not come to a conclusion on how exactly to handle it, so I cannot make promises, but it is something we are looking into currently. Keep your fingers crossed!
  • Unfortunately predicting the theoretical output size/reduction is something that is simply not possible at the moment. The only way this could be accomplished is by running the entire optimization process in the background, and as I'm sure you are aware, if you have more than a single simple page in the document, this takes a long time.
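
As a rough illustration of why the result is hard to know in advance, the following Python sketch uses the standard `zlib` module as a stand-in for an optimizer. That is an assumption made purely for demonstration; nothing here reflects which compression PDF-XChange actually uses. Two inputs of identical length end up with very different compressed sizes, and the only way the sketch learns those sizes is by running the compression.

```python
import os
import zlib


def compressed_size(data: bytes, level: int = 9) -> int:
    """Return the compressed size in bytes; computing it requires compressing."""
    return len(zlib.compress(data, level))


# Two inputs with exactly the same length but very different content.
repetitive = b"A" * 100_000        # highly redundant data
random_like = os.urandom(100_000)  # behaves like already-compressed data

size_repetitive = compressed_size(repetitive)
size_random = compressed_size(random_like)

print("input length:      ", len(repetitive))
print("repetitive ->      ", size_repetitive, "bytes")
print("random-like ->     ", size_random, "bytes")
```

The input length alone says nothing about the outcome: the redundant input shrinks to a tiny fraction of its size, while the random-like input barely shrinks at all and may even grow slightly because of container overhead.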
I hope that this helped to explain why some of this cannot be done.

Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address have changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Puffolino
User
Posts: 337
Joined: Wed Feb 09, 2011 1:06 pm

Re: Detailed document statistics

Post by Puffolino »

Hi Daniel,

Thanks for the detailed information. I understand how difficult it would be to translate the complex structure of a PDF file into a statistical overview. Still, to me the "Audit space usage" view shows "document property" information rather than any guidance for "optimizing a PDF"...
In my experience, about 1 in 10 optimized PDF files ends up larger than the original, and trying to remove fonts also makes PDFs larger rather than smaller in many cases.

By "Top 20" I meant more detailed information for each category, for instance the size of all (or the largest) images in the document and the pages on which they are located.
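
A minimal Python sketch of the kind of listing meant here, assuming an image inventory is already available. The file names, sizes, and page numbers are hypothetical sample data, and the sketch does not read a real PDF.

```python
import heapq

# Hypothetical image inventory: (name, size in bytes, pages that use the image).
images = [
    ("logo.png", 85_000, [1, 2, 3]),
    ("scan_p4.jpg", 2_400_000, [4]),
    ("chart.png", 310_000, [2]),
    ("photo.jpg", 1_150_000, [3, 5]),
]


def largest_images(images, n=20):
    """Return the n largest images, biggest first, keeping their page lists."""
    return heapq.nlargest(n, images, key=lambda item: item[1])


top = largest_images(images, n=3)
for name, size, pages in top:
    print(f"{name}: {size / 1_000_000:.2f} MB, used on pages {pages}")
```

An image that appears on several pages is listed once together with all of its pages, so the shared size is not counted more than once.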

It is interesting that your team is also discussing ways to improve optimizing. Besides simplifying the process, for example by choosing an optimization level for each section (none, normal, extreme), some sections could get even more advanced functionality. Image compression is one area where the optimizing dialog and the recompress image dialog offer different options, which is not easy to understand.

Anyhow, it looks to me as if posting won't even be necessary, since your team already knows the users' wishes in advance :wink:

Cheers.

Just one more point concerning layers: when extracting pages, all layer information is copied into the new document, which is not ideal.
Daniel - PDF-XChange
Site Admin
Posts: 12030
Joined: Wed Jan 03, 2018 6:52 pm

Re: Detailed document statistics

Post by Daniel - PDF-XChange »

Hi, Puffolino

Thank you for understanding. We will certainly do our best to make this more user friendly.
Regarding optimization sizes, we recently discovered a bug in the compression system we are using which resulted in a balloon effect for specific content items on certain occasions during recompression. We are working on that but do not have an ETA for when it will be resolved just yet.

For the top 20, that seems more reasonable, although I am unsure how useful it would actually be in practice. I will pass the idea along and let you know if a ticket is made. I have a hunch that this will be one of those requests that is far too much work for almost no benefit to us or the vast majority of users.

As for the layers item you mentioned, I have just done some further testing, and it seems that not only extracting pages is affected: inserting pages and moving pages through the thumbnails pane are affected as well. I have reported this directly to the Dev team so that it can be addressed.

Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD
