Hi all,
Below, once more, is the successful despeckle result that I got with the "Downsampling while Printing to PDF" method that Stefan suggested:
David.P wrote: ↑Wed May 14, 2014 2:09 pm
PDF document before, not OCR'able:
After printing to PDF with PDF-XChange
(downsample from 300dpi b/w to 200dpi grey in order to get blurring):
Then finally, after OCR:

Now I just realized that there is an "Enhance Scanned Pages" function in PDF-XChange Editor, which should do something similar, but possibly even better and more easily.
However, in my case this function, particularly the "Descreen" option, doesn't seem to do anything to the problematic pixelated text (that is almost impossible to OCR).
After applying "Descreen" this way, the text remains exactly the same as it was before, still containing the white dots:

Am I doing it wrong? If not, I would suggest improving the Enhance Scanned Pages feature so that it can handle this kind of problematic text, which scanners and fax machines often produce.
One way to do so would be to apply a median filter to the image, which can produce results like the ones discussed further above:
David.P wrote: ↑Tue May 13, 2014 10:58 am
I have some large pdf files here that have been generated elsewhere by scanning paper documents. Unfortunately, the text looks like this:

With all of the OCR tools that I own, such text is not recognisable.
However, after running a Median Filter (aka "despeckle") over the image (using IrfanView), the image looks like this:
... and is perfectly converted to text for example by Ad*be Acr*bat ClearScan:
Now the problem is: how can I batch-despeckle those PDF documents, each several hundred pages long, so that I can OCR them afterwards with the tool of my choice?
I am attaching the example document that I have used in the above.
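To illustrate what I mean by a median filter, here is a minimal sketch in plain Python. It is only my own illustration of the general technique, not how IrfanView or PDF-XChange Editor implement it. The function name median_filter, the radius parameter and the tiny test image are made up for this example. Each pixel is replaced by the median of its neighbourhood, so an isolated white dot inside black text gets outvoted by the surrounding black pixels:

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Return a filtered copy of a 2D grayscale image (a list of rows).

    Each pixel becomes the median of its (2*radius+1) x (2*radius+1)
    neighbourhood. Pixels at the border use only the neighbours that exist.
    median_low is used so the result is always an existing pixel value.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(median_low(window))
        result.append(row)
    return result


# Hypothetical example: a black (0) area with one white (255) speckle.
speckled = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
cleaned = median_filter(speckled)
```

For real scanned pages one would of course use an imaging library rather than nested Python lists (Pillow, for example, has ImageFilter.MedianFilter), but the principle is the same.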
Thanks very much for considering adding a Median-like filter engine to the Enhance Scanned Pages function of PDF-XChange Editor!
Best regards
David
--
PS: I believe that this thread could be moved to the PDF-XChange Editor Forum because it actually deals with features of PDF-XChange Editor.