I am getting quite poor results from OCR.
I want to convert a scanned PDF document into an editable PDF without changing anything at all, just an editable reproduction. However, the results are consistently inconsistent, and any handwritten notes are converted to garbage.
If anyone has any ideas, I would be most grateful. I have 50 installs waiting on the back of this.
The top image is the original; the others are results I don't want.
OCR Poor results
-
- User
- Posts: 5
- Joined: Thu Jun 06, 2024 2:01 pm
OCR Poor results
-
- Site Admin
- Posts: 2268
- Joined: Mon Jan 15, 2018 9:01 am
Re: OCR Poor results
Hello Crookie,
Welcome to our Forum.
The OCR tool is not designed to recognize handwriting, but if you could give us a copy of the original document we will see what can be adjusted to get better results.
Regards.
-
- User
- Posts: 5
- Joined: Thu Jun 06, 2024 2:01 pm
Re: OCR Poor results
This isn't just one document; this is only a sample I've been given.
We will be talking thousands, and it doesn't look like PDF-XChange is up to it.
-
- Site Admin
- Posts: 19913
- Joined: Mon Jan 12, 2009 8:07 am
Re: OCR Poor results
Hello Crookie,
Unfortunately, the ABBYY FineReader engine that our Enhanced OCR uses is focused on other types of text, and handwriting recognition is not its strength. Tesseract (the engine behind our standard OCR) might handle such text slightly better, so please give that one a try as well. Unfortunately, we cannot really improve those OCR engines on our end, so if you have thousands of handwritten documents to OCR, we might not be able to fully help!
Kind regards,
Stefan
-
- User
- Posts: 558
- Joined: Sat Dec 16, 2023 11:09 am
Re: OCR Poor results
Hi,
This one is not handwritten, but I also get poor OCR results with this file. I assumed the source was of quite poor quality, but the Microsoft Snipping Tool gave very good results. The difference is not even close.
PDFXCE
Language : French
Accuracy : Auto
Output : Fine Page content
Text extracted from Microsoft Snipping Tool text actions
Thanks for improving,
Major Stylus topics
- RemoveAnnotationsWithEraser T#6903
- MiniPopupMenuOnTextSelection T#6894
- AbnormalSpikes forum.pdf-xchange.com/viewtopic.php?p=179935&hilit=spikes#p179935
- ForceEraserPreview forum.pdf-xchange.com/viewtopic.php?t=42380
-
- User
- Posts: 1372
- Joined: Mon Nov 15, 2021 8:38 pm
Re: OCR Poor results
Maybe you could add an option to switch between Tesseract and ABBYY in the standard menu?
My wishlist https://forum.pdf-xchange.com/viewtopic.php?p=187394#p187394
Disable SPACE page navigation, fix kb shortcut for highlighting advanced search tool search field, bookmarks with numbers, toolbar small icon size, AltGr/Ctrl+Alt keyboard issues
-
- Site Admin
- Posts: 7370
- Joined: Wed Mar 25, 2009 10:37 pm
Re: OCR Poor results
Hi MedBooster,
Regarding the sample we were sent, the issue is that the original really is poor, even for human eyes!
I am afraid that switching engines "on the fly", so to speak, has been rejected. The engine choice will remain in the settings as it is.
Best regards
Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com