OCR - documents

PDF-XChange Viewer SDK for Developer's
(ActiveX and Simple DLL Versions)


joeitaliano
User
Posts: 89
Joined: Wed Dec 29, 2010 8:50 am

Post by joeitaliano »

hello

i just bought your SDK for the viewer as i believe it is the greatest SDK to handle PDF files that i have ever seen

what i want to do is open a PDF file and OCR part of a page by drawing a rectangle around the area of the image i am interested in.

the PDF i have is a legal document that is in effect a scanned tiff image, so OCR is the only way to get the text out of it.

once i have decoded it i can then save its text content into my database application.

i know that your OCR is not available yet and that it might be some time before it is, however could you give me some idea as to when it will be available, and whether that will be for viewer V3.

i can use some other OCR SDK to do this part, however i would prefer to support you

can you guide me in the right direction please

many thanks and keep up the great work

joe
John - Tracker Supp
Site Admin
Posts: 5223
Joined: Tue Jun 29, 2004 10:34 am

Re: OCR - documents

Post by John - Tracker Supp »

Hi Joe,

We have been working on an OCR SDK for some time now and we are making progress, but I have to say I believe the reality is that we will not have a viable solution until later this year. We do have a beta SDK in testing now, and whilst some results are encouraging it does show we still have some work to do, so for practical reasons, if your need is within the next 3-6 months I would recommend you seek an alternative.
If posting files to this forum - you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded - thank you.

Best regards
Tracker Support
http://www.tracker-software.com
joeitaliano
User
Posts: 89
Joined: Wed Dec 29, 2010 8:50 am

Re: OCR - documents

Post by joeitaliano »

hi john

not a problem.

any idea how i can create a rectangle around the image i want to ocr?

is there anything in the sdk i can use for this?

thanks
John - Tracker Supp
Site Admin
Posts: 5223
Joined: Tue Jun 29, 2004 10:34 am

Re: OCR - documents

Post by John - Tracker Supp »

Hi Joe,

I think the rectangle question is best answered when you settle on the SDK you will use and will become apparent then ...
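Whichever OCR SDK is chosen, a rectangle drawn on screen usually has to be mapped from rendered-image pixels to page coordinates before it can be matched against the page. The following is a minimal sketch of that mapping in Python, not something taken from the Viewer SDK. It assumes an unrotated page whose origin is at the bottom-left corner, with no crop-box offset, rendered at a known DPI; the function name `pixel_rect_to_pdf_rect` and its parameters are hypothetical.

```python
def pixel_rect_to_pdf_rect(left, top, right, bottom, dpi, page_height_pt):
    """Map a pixel rectangle (origin top-left, y down) to PDF points
    (origin bottom-left, y up). Assumes no page rotation or crop offset."""
    # One PDF point is 1/72 inch, so pixels * 72 / dpi gives points.
    xs = sorted((left * 72.0 / dpi, right * 72.0 / dpi))
    # Flip the y axis: pixel rows count down from the top of the page.
    ys = sorted((page_height_pt - top * 72.0 / dpi,
                 page_height_pt - bottom * 72.0 / dpi))
    # Return (x_min, y_min, x_max, y_max) in points.
    return (xs[0], ys[0], xs[1], ys[1])


if __name__ == "__main__":
    # A US Letter page is 612 x 792 points; the image was rendered at 150 DPI.
    rect = pixel_rect_to_pdf_rect(150, 300, 450, 600, dpi=150, page_height_pt=792)
    print(rect)  # (72.0, 504.0, 216.0, 648.0)
```

Sorting the corners means the rectangle can be dragged in any direction and still come out normalised. The same scale factor, applied in reverse, would map OCR word boxes back onto the screen.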

As far as actual SDKs are concerned, that is far more difficult as there are several good ones out there. The cost, I suspect, is going to be the determining factor, hence the reason we embarked on the development of our own. They are all very expensive if any good, and whilst that may not be such an issue if your installation volumes are lowish (provided they are large enough to justify the initial purchase), almost all require royalties, so the sums can be huge if your distribution needs amount to any volume ...
There are the usual suspects like Scansoft, Abbyy and LeadTools etc - but possibly SimplOcr might be your best bet cost wise ...

Do note we have no development experience with any of these, so this is not a recommendation of any kind ...

Good luck, and we would be interested in your thoughts once you have investigated.

Best regards
Tracker Support
http://www.tracker-software.com
joeitaliano
User
Posts: 89
Joined: Wed Dec 29, 2010 8:50 am

Re: OCR - documents

Post by joeitaliano »

hi john

the guys at adobe are a bunch of very smart people but i am sure your guys are even smarter

i have acrobat 9 pro and i have just discovered that it has an ocr text recognition option that converts my pdf file, which is not editable, into an editable document.

i have no idea how it does it but the nice thing is that i can now highlight the text from the original pdf (after conversion) and copy and paste the data to my database application.

now this is really cool

do you know how they did this as this is what i would like to make the viewer do

if i send you a pdf before and after can your guys look at it

thanks

joe
joeitaliano
User
Posts: 89
Joined: Wed Dec 29, 2010 8:50 am

Re: OCR - documents

Post by joeitaliano »

john

just to clarify as i think i may have used the wrong words in my last post

the document i have is a pdf but it is a bitmap image of the document, so i am not able to copy and paste the text from the pdf to another document unless it is first converted to text

the adobe ocr option does this, but it leaves the same pdf on the screen and then allows me to copy and paste the pdf text.

joe
John - Tracker Supp
Site Admin
Posts: 5223
Joined: Tue Jun 29, 2004 10:34 am

Re: OCR - documents

Post by John - Tracker Supp »

Hi Joe,

Thanks for the kind words - you smoothie :) Always appreciated !

The document is being OCR'd and then the text is embedded as a kind of layer in the background, in a new file containing both the image and the text. This is what Adobe are doing. Our PDF Tools SDK allows you to embed the text, but as indicated previously we do not offer the OCR element at this time.

We have quite a few developers using our products in their apps to embed the OCR'd text, having first used an alternative product to OCR the image and generate the required text for subsequent embedding. There are quite a few posts in the PDF-Tools SDK forum related to this and I would suggest you search or scan that forum for more detailed info on the subject, e.g.:

https://forum.pdf-xchange.com/ ... =44&t=7000
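To illustrate the "text layer in the background" idea in general terms: in PDF, text can be drawn with text rendering mode 3 (`3 Tr`), which neither fills nor strokes the glyphs, so it is invisible but still selectable and searchable on top of the scanned image. The sketch below is plain Python that only builds the content-stream operators for such a layer. It is not the PDF-Tools SDK API. The function names, the word tuple layout `(text, x, y, font_size)` and the font resource name `F1` are assumptions for illustration, and the font must already exist in the page resources.

```python
def escape_pdf_string(text):
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(words, font_name="F1"):
    """Build content-stream operators that place each OCR'd word as
    invisible text. `words` holds (text, x, y, font_size) tuples in
    PDF points; `font_name` is an assumed page font resource."""
    ops = ["BT", "3 Tr"]  # begin text; rendering mode 3 = invisible
    for text, x, y, size in words:
        ops.append(f"/{font_name} {size:g} Tf")      # select font and size
        ops.append(f"1 0 0 1 {x:g} {y:g} Tm")        # position the word
        ops.append(f"({escape_pdf_string(text)}) Tj")  # show the text
    ops.append("ET")  # end text
    return "\n".join(ops)


if __name__ == "__main__":
    # Hypothetical OCR output for two words on one line.
    print(invisible_text_ops([("Hello", 72, 700, 12), ("(world)", 110, 700, 12)]))
```

The resulting operators would be appended to the page content after the image is drawn. This sketch only handles simple single-byte text; real OCR output would also need proper font encoding and scaling so that each word matches the width of its image region.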

HTH

Best regards
Tracker Support
http://www.tracker-software.com