Searching for text in a PDF which includes #
-
Searcher
- User
- Posts: 6
- Joined: Sun Nov 20, 2016 10:23 pm
Searching for text in a PDF which includes #
I have been using XChange viewer for some time and have always found it very useful. Kudos to the developers.
Just now, though, I needed to search several PDF documents for a string such as "lot #24" and it simply does not work. I have a printed copy of one of the documents and I know the string is part of at least one document; in fact, I am using it as a test sample, since similar strings were not found at all by XChange Viewer.
Is this possible, and what would I need to do to accomplish it?
My PDF XChange Viewer is version 1.5 build 318.1 running under Win 10 64-bit
TIA for any help
-
Will - Tracker Supp
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
Re: Searching for text in a PDF which includes #
Hi Searcher,
Thanks for the post. If the text is there in the document, the Viewer should find it without you needing to do anything special. Can you please send a sample that we can take a look at here?
Thanks,
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com
-
Willy Van Nuffel
- User
- Posts: 2829
- Joined: Wed Jan 18, 2006 12:10 pm
Re: Searching for text in a PDF which includes #
I suppose that Searcher is using the Edit > Search feature of PDF-XChange Viewer to search for lot #24 in the content of multiple PDFs.
A few things to verify here:
- be sure to enter the correct path where you would like to search
(in the "Search PDF" pane at the right, at "Where would you like to search")
- be sure to search in the "content" of your files
(in the "Search PDF" pane, click Options and verify that "Include Pages Content" is checked/activated)
- be sure to also search in sub-folders
(in the "Search PDF" pane, Options, check/activate "Look In Sub-Folders")
I did a test by putting "lot #24" in one of my PDFs and then using the Search in PDF-XChange Viewer.
There was no problem finding the correct document. So, this should work correctly (in version 2.5 build 318.1).
Best regards.
-
Searcher
- User
- Posts: 6
- Joined: Sun Nov 20, 2016 10:23 pm
Re: Searching for text in a PDF which includes #
It will be a bit difficult without breaching some confidentiality or privacy constraints.
The text I am searching is a set of strata council minutes, and I am not sure how I would remove the identifying parts of the file at all, let alone without altering things enough to mask the issue.
But I will attach a screenshot
HTH
-
Patrick-Tracker Supp
- Site Admin
- Posts: 1645
- Joined: Thu Mar 27, 2014 6:14 pm
Re: Searching for text in a PDF which includes #
Hi Searcher
Thanks for posting this. Are you able to search any text in this document? Is this a scanned document? If so, there are no text objects to search. You will first need to OCR the document via Document > OCR Pages.
This places an invisible text layer on top of the scanned image, allowing you to search and place text-dependent annotations.
I hope this helps!
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Cheers,
Patrick Charest
Tracker Support North America
-
Searcher
- User
- Posts: 6
- Joined: Sun Nov 20, 2016 10:23 pm
Re: Searching for text in a PDF which includes #
Patrick-Tracker Supp wrote: Thanks for posting this. Are you able to search any text in this document? Is this a scanned document? If so there are no text objects to search.
Yes, I can search for and find the numeric part of the text or any other text string I tried.
It is my understanding that these were text documents converted to PDF and not just graphics files.
Patrick-Tracker Supp wrote: You will first need to OCR the document via Document> OCR pages. This places an invisible text layer on top of the scanned image allowing you to search and place text dependent annotations.
I had found this feature and tried it but, aside from not really being quite sure of what to expect, it did not let me find those #'s.
-
Will - Tracker Supp
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
Re: Searching for text in a PDF which includes #
Hi Searcher,
Please post the PDF itself, otherwise I'm afraid that we cannot help as it would appear specific to this document.
Also, please be aware that the Viewer has been discontinued and replaced by the Editor, so it's unlikely that we will fix any issues with the Viewer except, possibly, those that we deem critical. The Editor can be downloaded here:
https://www.pdf-xchange.com/PDFXVE6.zip
Thanks,
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com
-
Searcher
- User
- Posts: 6
- Joined: Sun Nov 20, 2016 10:23 pm
Re: Searching for text in a PDF which includes #
As I have explained, I cannot post the full document. I also understand that this makes it impossible for you to track down the issue.
Using the 'Document Properties' feature in both the viewer and the editor you referred to does not show any application used to create the document. The only bit of information is that it claims to follow PDF 1.4.
If there were a way to remove all identifying information, I assume I could send the document, but that, of course, might alter it sufficiently to not show the problem.
However, while the viewer allowed me to at least find the numeric part of the strings I was looking for, the editor does not even find that much, using either Ctrl-F or Ctrl-Shift-F.
Any reason for this? What am I missing, or is the free version unable to search?
-
Willy Van Nuffel
- User
- Posts: 2829
- Joined: Wed Jan 18, 2006 12:10 pm
Re: Searching for text in a PDF which includes #
Hello,
I have made three different PDFs for you (included in the attached ZIP file), for testing.
They are:
- a first one : Lot number 24 - text.pdf with purely the text "lot #24"
- a second one : Lot number 24 - image.pdf with only an image of "lot #24"
- a third one : Lot number 24 - ocred.pdf with both the image and the text itself
If you test these in PDF-XChange Viewer 2.5.318.1 by using the Edit > Search feature, with the settings I mentioned in an earlier post in this topic, you should normally find the first and the third one.
The reason that it should not find the second one is that it only contains an image, no text.
Can you (Searcher) confirm this?
If yes, then we could conclude that PDF-XChange Viewer is working correctly, and that the reason for not finding the text in "your" files is that they are probably PDFs with scanned text where OCR may have been applied, but not 100% correctly. You could check this in PDF-XChange Editor via the Content pane.
Best regards.
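For readers who prefer a scripted check of the three cases above (text only, image only, image plus OCR text), here is a minimal Python sketch. It is only a rough heuristic, not how the Viewer or Editor actually search: it scans the raw file for content streams, tries to inflate them, and looks for text-showing operators (BT ... Tj/TJ ... ET). The function name `has_text_objects` is hypothetical, and the sketch assumes streams are either uncompressed or Flate-compressed; other filters, object streams, or encrypted files are not handled, and binary image data can occasionally trigger a false positive.

```python
import re
import zlib

# Capture everything between "stream" and "endstream" (assumption: simple, unencrypted PDF).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A text object: BT ... (Tj or TJ) ... ET, i.e. a text-showing operator inside a text block.
TEXT_OP_RE = re.compile(rb"\bBT\b.*?\b(?:Tj|TJ)\b.*?\bET\b", re.DOTALL)


def has_text_objects(pdf_bytes: bytes) -> bool:
    """Heuristically report whether any stream in the PDF contains text operators."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that follow the compressed data.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate-compressed (or another filter): inspect the bytes as they are.
            data = raw
        if TEXT_OP_RE.search(data):
            return True
    return False
```

To try it on a file, read the file in binary mode and pass the bytes in, for example `has_text_objects(open(path, "rb").read())`. A result of False suggests an image-only document that would need OCR before any search can work; a result of True only says that some text objects exist, not that the OCR recognised a character such as "#" correctly.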
-
Searcher
- User
- Posts: 6
- Joined: Sun Nov 20, 2016 10:23 pm
Re: Searching for text in a PDF which includes #
As expected, both the editor and the viewer find either 24 or #24 in the 2 expected documents.
With the original file, the viewer will find other text, such as 'lot', but the editor will not. Since the viewer can find strings, it seems to me that at least part of the content is 'text' rather than images.
However, I cannot understand why the viewer can find strings in documents which the editor cannot.
How would I access this feature: "PDF-XChange Editor and verifying this via the Content pane"?
-
Will - Tracker Supp
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
Re: Searching for text in a PDF which includes #
Hi Searcher,
Is it possible to redact any sensitive information using the Redaction Tool (Document --> Redaction)? If there's lots of it, then that might be too time consuming. If so, is there any other document that you can provide that you can reproduce the problem with?
Alternatively, if possible, you can always send the document to support@pdf-xchange.com, where it will not be publicly available. The document will be held in the strictest confidence and will only be shared internally (and only when required). If that's still not possible, I understand.
Unfortunately you're right - without some kind of a sample we cannot troubleshoot the problem, as it would appear specific to the document(s) you're working with.
Cheers,
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com
-
Willy Van Nuffel
- User
- Posts: 2829
- Joined: Wed Jan 18, 2006 12:10 pm
Re: Searching for text in a PDF which includes #
Remaining question:
How would I access this feature:"PDF-XChange Editor and verifying this via the Content pane."
Answer:
In PDF-XChange Editor:
- click the "View"-menu
- click "Other panes"
- click/activate "Content"
The "Content" pane should now appear at the left of the window.
Each page will be preceded by a little white triangle.
Click this triangle to open the page content.
Now, you can look for "Text:" fields.
PDF-XChange Viewer and Editor should be able to find the content in these fields.
-
Searcher
- User
- Posts: 6
- Joined: Sun Nov 20, 2016 10:23 pm
Re: Searching for text in a PDF which includes #
Thank you for explaining the usage; RTFM is not my forte, especially for features I am not aware exist and/or currently am not looking for.
In this case, however, your instructions provided the answer to the puzzle. Following your directions, the information shows that each page is indeed an image.
This, however, raises another question for me, since with the viewer I can search for text and find some of it, except of course for those '#', while in the editor I cannot find any of the strings I can find with the viewer.
Evidently, though I was unaware it could do so, the viewer seems to do the OCR silently and behind the scenes, i.e. without my intervention, while the editor apparently does not.
Can I, and if so, how, enable this same functionality for the editor?
TIA
-
Patrick-Tracker Supp
- Site Admin
- Posts: 1645
- Joined: Thu Mar 27, 2014 6:14 pm
Re: Searching for text in a PDF which includes #
Hi Searcher,
You can tell the Editor to automatically OCR scanned documents when you initiate the scan through the Editor itself. Under the "Image Insertion Options" button within the Scan Properties, you will find the "Image Post-processing" section. There, you can choose the option to run OCR.

I suspect the search settings within the Editor may be stopping you from finding some information. I would recommend taking a look under the Search pane > Options to see whether something there is causing the issue.

As we have all said already, it would be a lot easier to tell you more precisely what is happening if we could see the document. Of course, we completely understand that this particular file contains private information - I hope whatever suggestions have or will be made help you to solve this.
HTH!
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Cheers,
Patrick Charest
Tracker Support North America
-
Willy Van Nuffel
- User
- Posts: 2829
- Joined: Wed Jan 18, 2006 12:10 pm
Re: Searching for text in a PDF which includes #
Hello,
To my knowledge, there is nothing in the Viewer or in the Editor that will do the OCR "without" your intervention, just by opening a PDF or by searching in it.
So, when you open the same document in the Viewer or in the Editor, they both should see exactly the same content.
The only thing I can still think of that would make the difference between finding and not finding something from "lot #24" is that this text is somewhere in the "Document Info", and that the Search Options in the Viewer are set to look into this info while those in the Editor are not (see Patrick's last screenshot showing the relevant "Options").
N.B.: the "Document Info" can be found via the File-menu > Document Properties.
Regards.
-
Will - Tracker Supp
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
Re: Searching for text in a PDF which includes #
Thanks Willy - you're absolutely right: neither the Viewer nor the Editor is capable of automatically OCRing a document when it has been opened or is being searched. The only way that the Editor can automatically OCR a document is when scanning, as per Patrick's instructions, but the Viewer did not have this feature.
@Searcher:
Searcher wrote: This however raises another question for me since with the viewer I can search for text and find some of it, except of course for those '#', while in the editor, I cannot find any of the strings I can find with the viewer.
I'm afraid that we really cannot say what the difference is at this point. If you're able to provide a document that doesn't contain privileged information, we'd be happy to take a look and assist further. But, until then, I'm afraid that there's nothing more we can do.
Thanks,
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com