iFilter + OCR on Windows 7


SABESOFT
User
Posts: 31
Joined: Fri Oct 16, 2009 9:57 am

iFilter + OCR on Windows 7

Post by SABESOFT »

Is there any way to make Windows Search (I am using Windows 7 64-bit) also index scanned .pdf files?
I read at http://technet.microsoft.com/en-us/libr ... S.10).aspx that this is possible with TIFF files, but nearly all scanned images are now .pdf files, and the faxes we receive are also in .pdf format.
So it would be nice if the OCR engine that is (as I understood) already present also worked on scanned .pdf files.

Sincerely
Bruno
Paul - PDF-XChange
Site Admin
Posts: 7444
Joined: Wed Mar 25, 2009 10:37 pm

Re: iFilter + OCR on Windows 7

Post by Paul - PDF-XChange »

Hi SABESOFT,

PDF-XChange Viewer already has, through its shell extensions, the ability to search the contents of scanned/OCR documents.

hth
Best regards

Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
SABESOFT
User
Posts: 31
Joined: Fri Oct 16, 2009 9:57 am

Re: iFilter + OCR on Windows 7

Post by SABESOFT »

And how can I check which words are extracted and indexed? I ask, of course, because it doesn't look like it works on my PC (checked with a really good-looking scanned file).
Or is it necessary to activate TIFF indexing first?
Paul - PDF-XChange
Site Admin
Posts: 7444
Joined: Wed Mar 25, 2009 10:37 pm

Re: iFilter + OCR on Windows 7

Post by Paul - PDF-XChange »

Hi SABESOFT,

Our shell extensions' integration into the iFilter will search PDF files, not TIFF. It is a PDF file you are searching, right? It should be as simple as using Windows Search on a folder. If that folder has been indexed, it will find the words. If not, hit the "File Content" button that should be available in the search results. Then, even without indexing the folder, it will find the string if present.

Try saving our manual to a folder, then searching that folder for the string "Tracker Software"; it should find it.

:-)
Best regards

Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
SABESOFT
User
Posts: 31
Joined: Fri Oct 16, 2009 9:57 am

Re: iFilter + OCR on Windows 7

Post by SABESOFT »

Hi Paul,

I understand that a full-text search of a .pdf file like your manual works fine. Files like these also work well here on my system. But my original question was about .pdf files that come from a scanner or a fax and contain ONLY graphics.
As an example, I am uploading an invoice of good quality for which indexing does not work.

Bruno
Paul - PDF-XChange
Site Admin
Posts: 7444
Joined: Wed Mar 25, 2009 10:37 pm

Re: iFilter + OCR on Windows 7

Post by Paul - PDF-XChange »

Hi SABESOFT,

My apologies. I missed the key part where you mention image-based documents. In this case there is no text data available to search, so nothing will be found.

If you add text to the document metadata, it will be found. Open your document properties (CTRL+D) --> Additional Metadata and you will find places to add a title, description, keywords and more. This can be useful for quickly finding the documents you need based on this metadata.

If you do need to search on the content of the file, you will need to apply OCR (Optical Character Recognition) to the document to add the text data to be searched. We are developing an OCR solution for PDF-XChange and hope to have it ready early in the new year.
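
To get a rough idea of whether a given PDF is image-only (and so has nothing for the indexer to find), a quick check is possible outside the Viewer. The following is a minimal Python sketch, not a PDF-XChange feature. It assumes that a page with searchable text references a /Font resource, and it only looks for that marker in the raw bytes and inside Flate-compressed streams. The function name `looks_image_only` is hypothetical, and the heuristic can be wrong for encrypted or unusual files.

```python
import re
import zlib


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Heuristic: return True if no /Font resource is visible in the PDF.

    Assumption: searchable text (including an OCR text layer) needs a font,
    so a file without any /Font entry most likely contains only images.
    """
    # Uncompressed dictionaries can be checked directly.
    if b"/Font" in pdf_bytes:
        return False

    # Newer PDFs may keep dictionaries inside Flate-compressed streams,
    # so try to inflate every stream and look again.
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", pdf_bytes, re.DOTALL):
        try:
            # decompressobj tolerates trailing bytes such as the final newline.
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Not Flate data (for example a JPEG image stream); skip it.
            continue
        if b"/Font" in data:
            return False

    return True
```

Reading a file with `open(path, "rb").read()` and passing the bytes to the function is enough to try it. A result of True suggests the document needs OCR before a content search can find anything.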

Does that help?
Best regards

Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
SABESOFT
User
Posts: 31
Joined: Fri Oct 16, 2009 9:57 am

Re: iFilter + OCR on Windows 7

Post by SABESOFT »

Of course the answer helps; now I know not to try to make something work that cannot work.
You say that you are working on implementing OCR in your product. Of course that is OK for me, but is it not easier for you (and your customers) to use the one present in Windows 7? I do not need a fully featured OCR; for my use, an OCR that works in the background for indexing only is enough.
Bruno