Discussion for the End User use of OCR in PDF-XChange Editor and Viewer
I really like PDF-XChange Editor as a PDF editor, a set of utilities and an OCR tool. Moreover, ABBYY does not have a tool to batch add an invisible text layer to PDF documents (only batch recognition with full merging of all layers of source documents). However, the recognition quality of the engine used by Tracker Software is noticeably worse than the one currently used by ABBYY.
https://drive.google.com/drive/folders/1CjVs87-ppL9gbG-OUD9yNyLLJZFMDM1T
Do you plan to improve the recognition algorithms used in your products in the near future?
Last edited by Jensen Head on Sun Dec 14, 2025 10:29 am, edited 1 time in total.
I took a look at your samples, and the file you provided does show some incorrect recognition. However, I got a perfect result using our Enhanced OCR (ABBYY based):
image.png
So can you please make sure that you were using the Enhanced OCR, and share the settings you tried in there?
I got the above result with these settings:
image1.png
Kind regards,
Stefan
I re-converted the scanned page to PDF and ran OCR with the following settings:
2021-11-08_16-38-07.png
The text copied from the resulting document differs from the document obtained in ABBYY FineReader by only a few spaces (the ABBYY document had extra spaces at the end of lines). I am at a loss to guess the reason for the low quality last time. I may have chosen the wrong set of languages. Or, as you suggested, the Enhanced OCR mode may have been disabled. Be that as it may, I am grateful for your help. The question is closed.
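For anyone who wants to repeat this kind of check, here is a minimal sketch of comparing the text copied from two OCR results while ignoring trailing spaces. It uses only Python's standard `difflib`. The function names and the sample strings are made up for illustration; neither product provides this script.

```python
import difflib


def normalize(text: str) -> list[str]:
    """Split into lines and drop trailing whitespace, so that
    end-of-line spaces are not reported as differences."""
    return [line.rstrip() for line in text.splitlines()]


def ocr_diff(text_a: str, text_b: str) -> list[str]:
    """Return unified-diff lines between two OCR text dumps.
    An empty list means the texts match after normalization."""
    return list(
        difflib.unified_diff(
            normalize(text_a),
            normalize(text_b),
            fromfile="pdf-xchange",   # labels only, purely illustrative
            tofile="finereader",
            lineterm="",
        )
    )


# Hypothetical sample text, not taken from the attached files.
pdfx_text = "Windows Update\nPrint Spooler\n"
fr_text = "Windows Update  \nPrint Spooler \n"

for line in ocr_diff(pdfx_text, fr_text):
    print(line)
```

If nothing is printed, the two results differ only in trailing whitespace.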
Last edited by Jensen Head on Tue Nov 09, 2021 8:28 am, edited 1 time in total.
Glad to hear that you now managed to get almost identical results!
We would also consider this closed, but if you have any other questions - you can always start a new topic!
ABBYY states on its "ABBYY FineReader Engine. The most comprehensive OCR SDK for software developers. Integrate AI-powered OCR features into your applications" page that the latest OCR engine version available for third-party applications is ABBYY FineReader Engine 12. I assume that the latest versions of PDF-XChange use FineReader Engine 12 in the "Enhanced" mode. Which version of FineReader Engine is used in FineReader PDF 16, and what are the significant differences for the user (if any)?
We use the FineReader version our license agreement with ABBYY allows.
I am not aware of what version they use in their own products - but it is slightly newer than the one we have access to.
Given that the Enhanced OCR is embedded in our own software, the differences will come down to recognition rate (we create our own UI, so UI differences between e.g. FineReader 15 and 16 are not relevant to a comparison with our EOCR). There would likely be improvements in some languages, but these are usually the less frequently used ones; European languages are usually quite good already.
Jensen Head wrote (Mon Nov 08, 2021 1:49 pm):
The text copied from the resulting document differs from the document obtained in ABBYY FineReader by only a few spaces (the ABBYY document had extra spaces at the end of lines).
I tested OCR on a screenshot of the Windows 10 services list and concluded that FR (Build 16.0.14.6564; Part # 1435.8) recognizes such images better than PDF-X (10.3.1, Build 387). The most common OCR error in PDF-X is spaces inserted in the middle of words:
2024-07-20_18-41-02.png
However, PDF-X recognizes some lines better than FR, which is unexpected. And these are not random errors. There are enough of them to be considered statistically significant. You could even say that in some situations PDF-X's OCR is more precise than FR's OCR. I hope they will never know about it =)
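A rough way to put numbers on this is to check each engine's output line by line against a hand-typed reference and count how many lines are exact, how many differ only in spacing (the "spaces inside words" case), and how many have other errors. The sketch below assumes you have such a reference text. The function names and sample lines are hypothetical, and the tally is only a count, not a significance test.

```python
from collections import Counter


def classify_line(ocr_line: str, truth_line: str) -> str:
    """Label one OCR line relative to the reference line."""
    if ocr_line == truth_line:
        return "exact"
    # Same characters once all spaces are removed: a pure spacing error,
    # e.g. a space inserted in the middle of a word.
    if ocr_line.replace(" ", "") == truth_line.replace(" ", ""):
        return "spacing"
    return "other"


def tally(ocr_lines: list[str], truth_lines: list[str]) -> Counter:
    """Count error classes over line pairs (lines are paired by position)."""
    return Counter(
        classify_line(ocr, truth) for ocr, truth in zip(ocr_lines, truth_lines)
    )


# Hypothetical sample: a hand-typed reference and one engine's output.
truth = ["Background Intelligent Transfer Service", "Windows Update", "Print Spooler"]
engine_output = ["Backgro und Intelligent Transfer Service", "Windows Update", "Print Spoo1er"]

print(tally(engine_output, truth))
```

Running the same tally on both engines' output for the same page gives comparable per-engine counts. Note that pairing lines by position only works if both outputs have the same line breaks as the reference.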
____________________ Another example
PDF-XChange 10.5.2, build 395 (Enhanced OCR)
PDFXEdit (2025-05-16 10-26-23).png
FR16:
%pn (2025-05-16 10-34-16).png
Map.pdf
Last edited by Jensen Head on Fri May 16, 2025 7:40 am, edited 1 time in total.
The extra spaces are probably due to the different version of the FR engine used. The better recognition is likely due to some magic our devs are doing: while the core of the OCR engine is based on FR, there are still some tweaks they can make on our side!