Odd character spacing after OCR
Posted: Sun Jan 07, 2024 8:21 pm
by cakeandcustard74
Hi all,
Something strange happens to some words after running OCR, typically two- to four-letter words: the letters end up with unequal spacing between them. Is there any way to fix this?
Re: Odd character spacing after OCR
Posted: Mon Jan 08, 2024 9:58 am
by Dimitar - PDF-XChange
Hello cakeandcustard74,
Welcome to our forum.
Could you please send us one of the files you are having this problem with as well as a screenshot of how the entire page looks on your end?
Also, please let us know whether you are using the Standard or Enhanced OCR, and which build of the Editor is currently installed on your end (you can check the version under Help -> About inside the Editor).
Regards.
Re: Odd character spacing after OCR
Posted: Mon Jan 08, 2024 2:26 pm
by cakeandcustard74
Hi Dimitar,
I’m using Enhanced OCR, and the build is 383, version 10.1.3. Here’s a PDF with just a couple of pages I ran OCR on (I haven’t proofread or edited it, so there are some typos), and here’s a screenshot of one of the pages. The majority of words are fine, except for short words such as ‘who’ and ‘the’, which have unequal spacing between the letters.
Thus spoke Zarathustra.pdf
IMG_2628.png
Re: Odd character spacing after OCR
Posted: Mon Jan 08, 2024 3:27 pm
by Paul - PDF-XChange
Hi, cakeandcustard74
The sample you provided already has OCR applied, so the spacing is already there. Do you have a "Pre-OCR" version we can look at?
Kind regards,
Paul - Tracker Supp
Re: Odd character spacing after OCR
Posted: Mon Jan 08, 2024 4:20 pm
by cakeandcustard74
Hi Paul,
Here’s the original PDF with no OCR. In the original, the words I’ve underlined in the above screenshot are spaced normally; it’s only after OCR that they get a bit weird. I’ve also tried using OCR on other pages, and the issue still occurs with short words, but as I said in my other reply, the majority of words are spaced fine.
Thus spoke Zarathustra original.pdf
Re: Odd character spacing after OCR
Posted: Mon Jan 08, 2024 4:51 pm
by Paul - PDF-XChange
Hi, cakeandcustard74
I see the same when I run Enhanced OCR (EOCR) on it here. I am reaching out to the OCR specialist to get his thoughts and will post here what we find.
Kind regards,
Paul - Tracker Supp
Re: Odd character spacing after OCR
Posted: Mon Jan 08, 2024 7:28 pm
by cakeandcustard74
Hi Paul,
Ah, thank you - I look forward to seeing what you find.
Re: Odd character spacing after OCR
Posted: Tue Jan 09, 2024 10:13 pm
by Paul - PDF-XChange
Hi, cakeandcustard74
The devs have to do some in-depth investigation on this, and it may take some time. I have raised a ticket for this so we can keep track of the progress. While it is for internal use only, should you refer to RT#6740: OCR character spacing issue here, any support staff member should be able to get you a status report.
I hope that helps.
Kind regards,
Paul - Tracker Supp
Re: Odd character spacing after OCR
Posted: Tue Jan 09, 2024 11:39 pm
by cakeandcustard74
Hi Paul,
Thank you for raising the issue with the devs; I’ll wait a few days and ask about it then.
Odd character spacing after OCR
Posted: Wed Jan 10, 2024 12:04 am
by Daniel - PDF-XChange