Delete and rerun OCR to get rid of residual nonsense text
Posted: Sat Nov 30, 2024 7:46 am
Deleting and rerunning the OCR of a PDF; is that possible?
As you can see, even after using these settings I get nonsense when I try to copy text:
"estrogen therapy" becomes "snb\stro\\nbt\\r\py"
I suspect this is because of a poorly performed OCR.
How do you get rid of the residue from previously performed OCRs?
Even with a new document and high accuracy checked, the text I copy still looks nonsensical.
In the Content pane the text also appears as strange characters. We have discussed this in the past; maybe it has something to do with unsupported fonts.
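To illustrate the unsupported-font idea, here is a minimal Python sketch of how a page can look right on screen and still copy as nonsense. It assumes the cause is a damaged code-to-Unicode table in the font, which is only one possible explanation and may not be what happened in this file. All names and the mapping values below are made up for illustration and do not come from any real PDF library or from the actual document.

```python
# Hypothetical illustration only: these names and tables are invented,
# they are not taken from a real PDF library or from the file in question.

# A PDF page stores glyph codes, not characters.
glyph_codes = [1, 2, 3, 4, 5, 6, 1, 7]  # intended to spell "estrogen"

# What the font draws for each code (what you see on screen).
glyph_shapes = {1: "e", 2: "s", 3: "t", 4: "r", 5: "o", 6: "g", 7: "n"}

# The table used for copy/search. A healthy table matches the shapes.
good_to_unicode = dict(glyph_shapes)

# An assumed damaged table: every code points at the wrong character.
bad_to_unicode = {1: "s", 2: "n", 3: "b", 4: "\\", 5: "s", 6: "t", 7: "r"}


def render(codes):
    """Return what the viewer displays for the given glyph codes."""
    return "".join(glyph_shapes[c] for c in codes)


def copy_text(codes, to_unicode):
    """Return what lands on the clipboard for the given glyph codes."""
    return "".join(to_unicode.get(c, "?") for c in codes)


if __name__ == "__main__":
    print("displayed:", render(glyph_codes))
    print("copied (good table):", copy_text(glyph_codes, good_to_unicode))
    print("copied (bad table): ", copy_text(glyph_codes, bad_to_unicode))
```

If this is the cause, rerunning OCR would only help if the old text layer and its font are actually replaced rather than reused.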
viewtopic.php?t=43671&hilit=nonsense
Even with "fine page content" it produces the very same text output?
I get seemingly identical nonsensical characters when I try to copy text.
It looks a bit worse when the font size and type are changed, which is why I prefer using "searchable image" anyway.
Similar post
viewtopic.php?p=178873&hilit=REMOVE+OCR#p178873