Hello,
I wonder whether there are any special system requirements for scanners that create PDF documents. We have an interesting phenomenon here: some scanned documents work fine with annotations and text functions, while others are "rejected" by certain text functions. The function GetAllText seems to be especially "picky".
The "Find" command sometimes works even when GetAllText fails.
Attached are two sample PDFs: one (scanned with a Canon scanner) works, while the other does not even allow highlighting.
It is clear to me that not all scanned documents can be processed correctly, but I need to know which ones can, why, and what the limitations are, so that we can inform our customers.
Thanks,
Anton.
Processing scanned PDF documents
-
- User
- Posts: 22
- Joined: Thu Apr 02, 2009 9:55 am
-
- User
- Posts: 664
- Joined: Tue Nov 14, 2006 12:23 pm
Re: Processing scanned PDF documents
Hi Anton,
Scanning by itself only produces an image of each page, with no selectable text.
I suspect that when you scan to your Canon_DR_2580C.pdf, the scanner software also runs an OCR (Optical Character Recognition) program, which adds selectable text to those PDFs. Text functions need that text to work with, which would explain why they fail on a file that contains only page images.
At this time we do not offer an OCR solution. We have been working on our own OCR library for the past 4+ years, but it is not yet available as a commercial release.
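If it helps when sorting customer files, the distinction above can be approximated outside the SDK. The following is only a rough Python sketch, not part of our product: the function names are made up for illustration, and it relies on the assumption that a page with real text references a font resource (`/Font`), while an image-only scan usually does not. It looks in the raw file and in any stream that plain zlib can inflate, so encrypted files or unusual filters can fool it.

```python
import re
import zlib

# Matches the body of every "stream ... endstream" section in the file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def _searchable_chunks(pdf_bytes):
    """Yield the raw file, then every stream that zlib can inflate."""
    yield pdf_bytes
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates trailing bytes after the zlib data.
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Not Flate-compressed (for example a JPEG page image): skip it.
            continue


def has_font_reference(pdf_bytes):
    """True if a /Font name appears anywhere we are able to look."""
    return any(b"/Font" in chunk for chunk in _searchable_chunks(pdf_bytes))


def looks_image_only(pdf_bytes):
    """Heuristic: no font reference found, so probably no text to extract."""
    return not has_font_reference(pdf_bytes)
```

To try it, read a file with `open(path, "rb").read()` and pass the bytes to `looks_image_only`. A `True` result suggests the document would need OCR before text extraction or highlighting can work; a `False` result only means a font is referenced somewhere, not that every page has usable text.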