OCR results from images in pdf

Post by **ArminS** » Wed Apr 27, 2016 11:59 am

Hello

As you can see in the image, the OCR results are not that good here. Is the text too small? When running OCR, I selected English and German as the languages and high quality. Slightly bigger text on a white background had better results, but they were not good either.

(To make the text visible, I moved the new layer to the bottom and removed the text formatting.)

Wed Apr 27, 2016 12:24 pm

Hello ArminS,

Indeed there were some issues with the OCR engine in 317.0. Please update to 317.1 where this should have been resolved.

Regards,
Stefan

Post by **ArminS** » Wed Apr 27, 2016 12:51 pm

Ok thanks.

Wed Apr 27, 2016 2:00 pm

Post by **ArminS** » Fri Jun 17, 2016 8:05 am

I finally tested the new version 317.1 of PDF-XChange. It is much better than before. The right picture still contains about 3 mistakes per line, and the 1:1 sized picture of the "About" has really weird results.
Left example uses the settings: High quality, language English only.
Right example uses the settings: High quality, language German only.

Post by **Will - Tracker Supp** » Fri Jun 17, 2016 9:10 pm

Hi ArminS,

Please try using Medium accuracy - as counter-intuitive as it is, Medium often produces better results.

Cheers,

Post by **ArminS** » Mon Jun 20, 2016 10:01 am

Indeed, thanks. When I use the same purple image, the recognized text has almost no mistakes.

Mon Jun 20, 2016 10:55 am

Glad to hear that, ArminS.

When you use "Medium", the OCR tool relies more on dictionaries, and when you use "High", it tries to recognize each letter on its own. So for normal text, Medium does indeed give better results. For unusual strings (e.g. license keys or ID numbers made of letter and number combinations), High might be better.

Regards,
Stefan
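
For anyone wondering why the dictionary approach helps with normal text but can hurt with keys and IDs, here is a toy Python sketch of the idea. It is not how the PDF-XChange OCR engine works internally; the word list, the `dictionary_correct` and `letter_by_letter` helpers, and the 0.7 cutoff are all made up for illustration, and the matching is done with the standard library's `difflib`.

```python
import difflib

# Tiny hypothetical word list; a real engine would use a full language dictionary.
DICTIONARY = ["license", "version", "results", "quality", "medium"]


def dictionary_correct(raw_word, cutoff=0.7):
    """Snap a raw OCR guess to the closest dictionary word, if one is similar enough.

    Stands in for a "Medium"-style mode that leans on dictionaries (assumption).
    """
    matches = difflib.get_close_matches(raw_word.lower(), DICTIONARY, n=1, cutoff=cutoff)
    return matches[0] if matches else raw_word


def letter_by_letter(raw_word):
    """Keep each recognized character as-is.

    Stands in for a "High"-style mode that trusts every letter on its own (assumption).
    """
    return raw_word


if __name__ == "__main__":
    # "1" misread in place of "l" or "i" in ordinary words, plus two key-like strings.
    for raw in ["resu1ts", "qua1ity", "A1B2-C3D4", "MED1UM"]:
        print(raw, "| dictionary:", dictionary_correct(raw), "| per letter:", letter_by_letter(raw))
```

In this sketch the misread words get pulled back to real words, `A1B2-C3D4` is left alone because nothing in the word list is close to it, and a key that happens to look like a word (`MED1UM`) gets "corrected" into something wrong, which is the case where trusting each letter is safer.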
