Is there any way to get Windows Search (I am using Windows 7 64-bit) to also index scanned .pdf files?
I read at http://technet.microsoft.com/en-us/libr ... S.10).aspx that this is possible with TIFF files, but nowadays nearly all scanned images are .pdf files, and the faxes we receive are also in .pdf format.
So it would be nice if the (as I understand it) already present OCR engine also worked on scanned .pdf files.
Sincerely
Bruno
iFilter + OCR on Windows 7
-
Paul - PDF-XChange
- Site Admin
- Posts: 7444
- Joined: Wed Mar 25, 2009 10:37 pm
Re: iFilter + OCR on Windows 7
Hi SABESOFT,
PDF-XChange Viewer already has, through its shell extensions, the ability to search the contents of scanned/OCR documents.
hth
Best regards
Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
-
SABESOFT
- User
- Posts: 31
- Joined: Fri Oct 16, 2009 9:57 am
Re: iFilter + OCR on Windows 7
And how can I check which words are extracted and indexed? I ask, of course, because it doesn't look like it works on my PC (checked with a really good-looking scanned file).
Or is it necessary to activate TIFF indexing first?
-
Paul - PDF-XChange
- Site Admin
- Posts: 7444
- Joined: Wed Mar 25, 2009 10:37 pm
Re: iFilter + OCR on Windows 7
Hi SABESOFT,
Our shell extension's integration with the iFilter searches PDF files, not TIFF. It is a PDF file you are searching, right? It should be as simple as using Windows Search on a folder. If that folder has been indexed, it will find the words. If not, click the "File Content" button that should be available in the search results. Then, even without the folder being indexed, it will find the string if it is present.
Try saving our manual to a folder and then searching that folder for the string "Tracker Software"; it should find it.
Best regards
Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
-
SABESOFT
- User
- Posts: 31
- Joined: Fri Oct 16, 2009 9:57 am
Re: iFilter + OCR on Windows 7
Hi Paul,
I understand that a full-text search of a .pdf file like your manual works fine. Files like these also work well here on my system. But my original question was about .pdf files that come from a scanner or a fax and contain ONLY graphics.
As an example, I am uploading an invoice of good quality for which indexing does not work.
Bruno
-
Paul - PDF-XChange
- Site Admin
- Posts: 7444
- Joined: Wed Mar 25, 2009 10:37 pm
Re: iFilter + OCR on Windows 7
Hi SABESOFT,
My apologies. I missed the key part, where you mention image-based documents. In this case there is no text data available to search, so nothing will be found.
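As a rough way to check whether a given PDF contains any text data at all, here is a minimal Python sketch. It is only a heuristic and not something PDF-XChange or Windows Search does: it assumes the content streams are either uncompressed or Flate-compressed, and the function name `has_text_layer` is made up for illustration. It looks for the PDF text-showing operators `Tj`/`TJ` inside a stream that also contains a `BT` (begin text) marker.

```python
import re
import zlib

# Heuristic patterns, not a real PDF parser.
# A text-showing operator follows a string ")" / hex string ">" / array "]".
_TEXT_OP = re.compile(rb"[)>\]]\s*T[jJ]\b")
# Raw bytes between the "stream" and "endstream" keywords.
_STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Return True if any stream appears to draw text (hypothetical helper)."""
    for match in _STREAM.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that
            # usually sit between the data and "endstream".
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Assumption: the stream is uncompressed (or uses a filter
            # this sketch does not handle), so inspect it as-is.
            data = raw
        if b"BT" in data and _TEXT_OP.search(data):
            return True
    return False
```

To try it on a file, read the file in binary mode and pass the bytes to `has_text_layer`. A result of False on a scanned invoice is consistent with the document being image only; encrypted files or streams using other filters can also produce False, so treat the answer as a hint rather than proof.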
If you add text to the document metadata, it will be found. Open the document properties (Ctrl+D) --> Additional Metadata, where you will find fields for Title, Description, Keywords and more. This can be useful for quickly finding the documents you need based on this metadata.
If you do need to search the content of the file, you will need to run OCR (Optical Character Recognition) on the document to add the text data to be searched. We are developing an OCR solution for PDF-XChange and hope to have it ready early in the new year.
Does that help?
Best regards
Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
-
SABESOFT
- User
- Posts: 31
- Joined: Fri Oct 16, 2009 9:57 am
Re: iFilter + OCR on Windows 7
Of course the answer helps; now I know not to try to make something work that cannot work.
You say that you are working on implementing OCR in your product. Of course that is OK for me, but isn't it easier for you (and your customers) to use the one already present in Windows 7? I do not need a full-featured OCR; for my purposes, an OCR that works in the background for indexing only is enough.
Bruno