What determines whether or not a comment will be indexed by

The PDF-XChange Viewer for End Users
+++ FREE +++

Athena
User
Posts: 17
Joined: Fri Mar 27, 2009 6:44 pm

What determines whether or not a comment will be indexed by

Post by Athena »

I have a number of documents that were initially generated from images of manuscripts or other historical documents, to which I have added a comment box near the top of the page (as a kind of post-it). All of my documents are also tagged with metadata, which makes it easy to find a given item when needed.

I've recently noticed that when I run a Windows Search for documents matching specific criteria, the text of the comment box sometimes appears in the hit list and sometimes does not. When it does appear, the text itself also seems to be indexed and is therefore searchable. I'm mystified as to what the difference is between the documents that display this data and those that do not. What's the secret?

Thanks.

Athena
Last edited by Athena on Mon Mar 12, 2012 9:52 pm, edited 1 time in total.
Paul - PDF-XChange
Site Admin
Posts: 7445
Joined: Wed Mar 25, 2009 10:37 pm

Re: What determines whether or not a comment will be indexed

Post by Paul - PDF-XChange »

Hi Athena,

The original documents from images: have they had OCR run on them? That would account for the text from the body showing up in your searches. Could it be as simple as some of your documents not having had OCR run on them, or otherwise not containing any text data to find?

Or do you mean specifically that the metadata is not consistent in the search results?
Best regards

Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
Athena
User
Posts: 17
Joined: Fri Mar 27, 2009 6:44 pm

Re: What determines whether or not a comment will be indexed

Post by Athena »

None of these documents have been OCRed, at least not to my knowledge. And it's not content from the document that is (sometimes) being read by Windows; it's the text in comment boxes that I've added with PDF-XChange.

I don't have any problem with the metadata in any of the documents. I thought I did at first because the search results looked "blank" for some but when I looked at the details pane, I realized that the subject and keywords were coming through fine for both categories of documents; it was just that some also included the annotations.

I just noticed this a couple of days ago when I added some freshly annotated documents to a folder and then had to search for one. I was surprised to see the annotations in the search results... and then realized that it wasn't happening consistently.

Am I allowed to attach a screenshot here or perhaps send it to you?

Update: I just noticed that in one case, some sort of attempt at OCR appears to have taken place. I highlighted a sentence (now that I think about it, that wouldn't have been possible without it being converted to text) and gibberish is displayed in the hit list.


Athena
Last edited by Athena on Mon Mar 12, 2012 10:12 pm, edited 1 time in total.
Paul - PDF-XChange
Site Admin
Posts: 7445
Joined: Wed Mar 25, 2009 10:37 pm

Re: What determines whether or not a comment will be indexed

Post by Paul - PDF-XChange »

Hi Athena,

Sure, you can send screenshots. Just put them in a zip archive and attach them to the post.

:-)
Best regards

Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
Athena
User
Posts: 17
Joined: Fri Mar 27, 2009 6:44 pm

Re: What determines whether or not a comment will be indexed

Post by Athena »

Okay, here you go.

I've attached two files that appear to be identical: US Census schedules, each with a comment attached. As shown in the screenshot, Windows Search displays the content of one text box but not the other. How are these two documents different?

Thanks
Athena
Ivan - Tracker Software
Site Admin
Posts: 3603
Joined: Thu Jul 08, 2004 10:36 pm

Re: What determines whether or not a comment will be indexed

Post by Ivan - Tracker Software »

If you right-click the file "B Hudson - 1920.pdf", select "Properties..." from the menu, and switch to the "Previous Versions" tab of the Properties dialog, do you see any records of previous versions of this file?
PDF-XChange Co Ltd. (Project Director)

When attaching files to any message, please ensure they are archived in .ZIP, .RAR or .7z format, or they will not be posted. Thanks.