Hello,
I’m using PDF-XChange Editor’s (10.7.3) Advanced OCR (FineReader) for German texts with many proper names and encounter the following issue:
Problem
Advanced OCR applies dictionary-based correction that changes clearly recognized characters. For example, “l” in proper names is often converted to “i”.
Workaround
When I set the recognition language to Latin or Spanish, letters are recognized correctly, but German umlauts (ä, ü, ö) and “ß” are lost.
Standard OCR recognizes all characters correctly on high-quality images but doesn’t offer “Fine Page Content.”
Questions
Can dictionary-based correction be disabled in Advanced mode?
Is there an alternative workaround?
Can “Fine Page Content” preserve the original page as a layer that can be toggled on later?
Best regards
Mike
OCR Advanced (FineReader): Disable Dictionary-Based Correction
-
MikeTomsen
- User
- Posts: 2
- Joined: Sun Oct 19, 2025 9:02 am
-
Daniel - PDF-XChange
- Site Admin
- Posts: 12520
- Joined: Wed Jan 03, 2018 6:52 pm
Re: OCR Advanced (FineReader): Disable Dictionary-Based Correction
Hello, MikeTomsen
There is no way to disable dictionary correction; however, there may be a workaround:
Does the same issue happen if you enable *both* the German language and a Latin language? When you do so, the order in which you select them acts as a priority system, but allows both recognition functions to work in tandem.
And finally, no, Fine Page Content cannot currently preserve the original page content in that way. I can check with the Dev team to see if such a feature is even possible, as I think this may be the first time I have seen such a suggestion.
[A quick update: in the meantime, you can use the "Overlay pages" tool on the Organize tab, specifying the original document (before saving the OCR output) to be added as a new layer, and then hide that layer manually. This should accomplish what you need with a few quick extra steps.]
Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
-
MikeTomsen
- User
- Posts: 2
- Joined: Sun Oct 19, 2025 9:02 am
Re: OCR Advanced (FineReader): Disable Dictionary-Based Correction
Hello Daniel,
Thank you for the feedback and your advice.
Unfortunately, changing the order of the languages doesn't solve the problem.
Only when I deselect German and select another language that uses Latin letters does dictionary correction stop. The characters are then recognized correctly, except for the German characters ä, ü, ö, and ß, which are now missing.
Apparently, the original FineReader software offers the option to create a custom dictionary and deactivate "Dictionary" in the options so that only the language's alphabet is used.
But my problem is very specific, as I have a text with mostly proper nouns. Therefore, I'm using your suggestion with the layers:
Layer 1: Original pages (I run standard OCR on this, and all letters and words are recognized correctly)
Layer 2: "Enhanced - Fine Page Content" OCR. This significantly improves readability, and I can live with the low error rate due to the dictionary corrections.
The main thing is that I can find the correct entries using the Layer 1 text search.
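For anyone who wants to see which words the dictionary correction actually altered, the text of the two layers can be compared outside the editor. Below is a minimal sketch in Python, assuming the text of both layers has already been copied or exported to plain strings. The function name `changed_words` and the sample strings are made up for illustration, and nothing here calls a PDF-XChange or FineReader API.

```python
import difflib


def changed_words(standard_text: str, advanced_text: str) -> list[tuple[str, str]]:
    """Return (standard, advanced) word pairs that differ between two OCR outputs."""
    std_words = standard_text.split()
    adv_words = advanced_text.split()
    # Compare the two outputs word by word rather than character by character.
    matcher = difflib.SequenceMatcher(a=std_words, b=adv_words, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Report only one-to-one replacements; inserted or dropped words are skipped.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            pairs.extend(zip(std_words[i1:i2], adv_words[j1:j2]))
    return pairs


# Hypothetical sample strings, not real OCR output.
standard = "Herr Wilhelm Müller traf Frau Hilde Weiß"
advanced = "Herr Wiiheim Müller traf Frau Hiide Weiß"

for before, after in changed_words(standard, advanced):
    print(f"{before} -> {after}")
```

The resulting list of pairs could serve as a checklist of proper names to verify by hand in the Fine Page Content layer.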
Best regards,
Mike
-
Daniel - PDF-XChange
- Site Admin
- Posts: 12520
- Joined: Wed Jan 03, 2018 6:52 pm
Re: OCR Advanced (FineReader): Disable Dictionary-Based Correction
Hello, MikeTomsen
I am glad to hear you have a "working" solution, even if it is not ideal.
Would you perhaps be able to share the original files, and a screenshot of the problematic OCR settings configuration with us here, so we can investigate on this end and see where we can make improvements to the process?
Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com