Zonal OCR and Other Questions To Get Me Started

PDF-X OCR SDK is a new product from us, intended to complement our existing PDF and Imaging Tools and to provide the developer with an expanding set of professional tools for Optical Character Recognition tasks.

Moderators: PDF-XChange Support, Daniel - PDF-XChange, Chris - PDF-XChange, Sean - PDF-XChange, Vasyl - PDF-XChange, Stefan - PDF-XChange

aitchisj
User
Posts: 47
Joined: Mon Apr 04, 2011 4:44 am

Post by aitchisj »

Hi There,

My company has purchased your PDF XChange Viewer SDK for use within our application to allow clients to save data they have entered into PDF Forms. It has been a great success and I can't say enough how happy I am that we chose this product.

I've been tasked with finding an OCR control that we can put into our application. After trying to get hold of some of the major players (ABBYY, Nuance) and getting no response (they can't seem to be bothered with little ol' me), I realized that Tracker Software has recently released an OCR module that we might be able to use. Also, competitor OCR modules seem to be priced ridiculously with obscene royalties, but don't get me started. Tracker Software, on the other hand, is at the opposite end of the spectrum, and I would love nothing more than to make the Tracker Software OCR module a part of our software. You guys ROCK!

I've been trying to do some homework before posting to this forum, but I have to admit I'm stuck and thought I'd ask a few things rather than try to piece the puzzle together in my head. Questions:
  • I work for a Health care company. We need the OCR to be able to recognize medical terminology. I'm not sure how your language files work exactly, but I assume it's some sort of dictionary? Am I able to customize the language to get it to recognize weird words like, "Ophthalmological"? I thought I had read something on the forum here about some sort of tool that can add languages (and presumably customize languages)?
  • We need to be able to do Zonal OCR. I need to be able to visually select an area on the screen and say "OCR that region." I don't know where to start with this. With the PDF XChange PRO SDK (which includes the OCR module), is there a viewer that will support that type of behaviour? Can it be done with the XChange Viewer ActiveX control somehow? Any pointers in the right direction with this would be greatly appreciated.
  • I'm pretty sure this is the case but want to confirm: the OCR SDK will work fine on Windows XP?
Part of the struggle I'm having is figuring out the new SDK which I'm not familiar with: PDF XChange PRO, and the associated OCR module. I see some executable examples for OCR in the download, but I'm confused if there is a different viewer for the XChange PRO SDK than there is for the XChange Viewer SDK?

I work with PowerBuilder 11.5, win32. Unfortunately you don't have code examples showing how to do things with PowerBuilder, which is fine, but it's a bit of a struggle on my side to look at other examples and try to figure out how to make PowerBuilder do the same things, so every little bit of information you can provide to point me in the right direction is a great help.

Thanks in advance and have a great day,
-John
Walter-Tracker Supp
User
Posts: 381
Joined: Mon Jun 13, 2011 5:10 pm

Post by Walter-Tracker Supp »

Hi John,

Thanks for your inquiry. A proper detailed response will take a bit of time so give me a few moments to write something up. In the meantime, the SDK does in fact allow you to do zonal OCR.

Please check back in the next half hour or so and I will have a detailed answer to all of your questions up.

-Walter

Post by Walter-Tracker Supp »

1. The short answer is that at the moment we do not provide a custom dictionary function, although this is something we are seriously considering for the future. However, we have developed and tested OCR for many scenarios, including technical terminology that is not explicitly in the internal dictionary. OCR accuracy is much more dependent on input (scan) quality than on the words themselves. In other words, we use a fairly expansive dictionary to guide recognition, in a somewhat stochastic / probabilistic way, but it does not seriously constrain it (as a strict spell-checker might).

I would recommend testing some of your input documents with the SDK examples (e.g. OCRtestapp.exe in the examples/bin directory of the PRO SDK Bundle installation), or with our free viewer's OCR capability. No OCR is perfect but I think you will find that with good input image quality our OCR results are very good.

2. Zonal OCR is a significant feature of our SDK; you can even design templates graphically with the template designer (or specify input regions programmatically). We provide C++ source code (in a Visual Studio project) for this, and I have just noticed that we need to include a compiled example, so I will ensure this is put into the next build. If you email [email protected] I can send you one in the meantime if you are not able to compile it with Visual Studio. You could design a single-page template and apply it to all pages of a document, or you could design a multi-page document template and apply it to multiple documents. The OCR SDK documentation contains more details on this, but essentially you want to use the functions related to "Fields", like OCR_GetFields() and OCR_LoadTemplate()/OCR_SaveTemplate(). The compiled template designer is in the examples/bin directory of your PDF X-Change PRO SDK bundle installation.

3. Yes, it works fine with Windows XP.

The PRO Bundle includes the ActiveX and Simple Viewer SDKs. For pricing, you'd have to talk to sales ([email protected] or phone us with the number on the website), as I'm not familiar with that aspect of the business and there may be upgrade pricing available.

As for support, we provide timely support for all of our products, but we do not explicitly support PowerBuilder. Integrating into PowerBuilder will be up to you, but we will do our best to help as much as we can without promising PowerBuilder-specific support.

Post by aitchisj »

Thanks for the prompt reply, this is helpful.

For Zonal OCR, is there a way that I can visually select a region on the screen and get it to OCR only that region? Is there functionality in the viewer to support that type of behaviour? For example, draw a box on the screen representing the region that I can pull from the Viewer and push through as a parameter into the OCR?

No worries about PowerBuilder support -- I will make do as best as I can. I've managed to do it with your XChange Viewer so hopefully it won't be too much of a headache with the OCR.

Thanks,
-John

Post by Walter-Tracker Supp »

Yes, you can do this visually, but it is not a feature of the viewer. There is an example project in the SDK distribution called "OCRTDesigner". You will have to compile it yourself; however, I will ensure that a compiled version becomes available shortly.

Post by Walter-Tracker Supp »

Here is the template designer binary for your convenience.

You will need to ensure that ocrtools.dll and pxcview.dll (part of the PRO SDK bundle) are in the executable's directory before running it. If you are using a trial version of ocrtools.dll, you will be notified that you can only OCR template fields from the first two pages of any PDF.

The sample input files are a template file (*.pxt) and a PDF. The template designer works with .pxt templates as the primary document, and the PDF load functionality lets you load a PDF "underneath" the template. This allows you to preview what the same template may look like with different input PDFs.

Keep in mind that this application is not an end-user product but instead is a quick and dirty sample mainly intended to show how to use the SDK (and also for convenience in designing templates).

You can use the OCR test function to test OCR from any template field, however you must set up the OCR options using the settings dialog first. The language data directory is that which contains the "ocrdats" subdirectory, so that if you have your language files in z:\ocrstuff\ocrdats\*.dat you would point the language data directory at "z:\ocrstuff".
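The directory convention above is easy to get wrong, so it can help to check it before running any OCR. A minimal sketch of such a check, in Python purely for illustration: `check_language_dir` is a hypothetical helper, not part of the SDK, and the default file name is the English language file named elsewhere in this thread.

```python
from pathlib import Path


def check_language_dir(data_dir, lang_file="eng_pxvocr.dat"):
    """Return a list of problems found with the language data directory.

    Hypothetical helper, not part of the OCR SDK. It assumes the layout
    described above: <data_dir>/ocrdats/<language file>.
    """
    problems = []
    root = Path(data_dir)
    if root.name.lower() == "ocrdats":
        # Common mistake: pointing at the subdirectory instead of its parent.
        problems.append("point at the parent of 'ocrdats', not at 'ocrdats' itself")
    dats = root / "ocrdats"
    if not dats.is_dir():
        problems.append(f"missing directory: {dats}")
    elif not (dats / lang_file).is_file():
        problems.append(f"missing language file: {dats / lang_file}")
    return problems
```

An empty list means the layout matches the convention; anything else names the first thing to fix.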

Post by aitchisj »

Thanks a lot for this. I will take some more time to investigate it and then probably have a few more questions as I try to fill in the blanks on my side of the fence.
-John

Post by Walter-Tracker Supp »

By all means - don't hesitate, that's what we are here for.

-Walter

Post by aitchisj »

I know you have said that you don't explicitly support PowerBuilder, but I've hit a snag and thought I'd ask. I've followed through your VB6 OCR SDK Example and tried to mimic what I saw with it as closely as I could in PowerBuilder. I am able to initialize the OCR and load a PDF, but as soon as I try to "OCR_MakeSearchable" I am returned an "unspecified internal error" according to the error codes in the help manual.

I am attaching a sample application that I have compiled, along with the runtime DLLs needed for PowerBuilder application deployment. The attachment contains everything except for ocrtools.dll and the OCRLanguages folder, which would have made the upload too large. Here is the code that runs in the application:

Source & Binary Of Sample
OCR.7z
PXO_Options

type pxo_options from structure
	long		lang
	long		regionmode
	string		whitelist
	string		blacklist
	string		datapath
	long		imageflags
	long		raster_dpi
	long		accmode
end type
PowerBuilder doesn't allow user-defined enumerations, so I had to pass LONGs instead for "RegionMode" and "ImageProcessingFlags". I experimented with the datatype in case LONG was wrong. Changing the datatype to INTEGER crashed it :( . LONG is the only datatype that seems to work properly when passing PXO_Options to the OCR_MakeSearchable method.


Main Code

long			 	ll_document
long				ll_result
long				ll_pagelist
PXO_Options		lstr_options
string				ls_key
string				ls_code

SetNull(ls_key)
SetNull(ls_code)
SetNull(ll_pagelist)

lstr_options.blackList = ""
lstr_options.whiteList = ""
lstr_options.ImageFlags = 1 // rotate up to 45 degrees for OCRing
lstr_options.lang = 0 // english
lstr_options.raster_dpi = 300 // dots per inch
lstr_options.RegionMode = 1 // OCR_Auto
lstr_options.DataPath = "OCRLanguages\"  // ocr languages dir
lstr_options.accMode = 0 // not used, supposed to be 0


ll_result = OCR_Init(ll_document,ls_key,ls_code)
messagebox("OCR Init","result of OCR_Init: " + string(ll_result))

ll_result = OCR_LoadA(ll_document,"sample_pages.pdf")
messagebox("Load PDF","result of OCR_LoadA: " + string(ll_result))

ll_result = OCR_MakeSearchable(ll_document, lstr_options, ll_pagelist)
if ll_result <> 0 then
	
	choose case ll_result
	case -2113263849
		messagebox("OCR Failure","Unspecified internal error")
	case -2113732591 // according to DSErrorLookup this is an error in the images core
		messagebox("OCR Failure","Internal error in images core")
	case else
		messagebox("OCR Failure","Error code: " + string(ll_result))
	end choose
	
	halt

end if

ll_result = OCR_SaveA(ll_document,"output.pdf")

OCR_Delete(ll_document)


Can anyone suggest what might be going wrong here?
Thanks in advance for any help,
-John
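The negative values in the code above are signed 32-bit views of the returned status codes. Documentation and lookup tools often show such codes in unsigned hexadecimal, so converting them can make comparison easier. A minimal sketch of the conversion, in Python for illustration (`to_hex32` is a hypothetical helper name; the two inputs are the codes from the code above):

```python
def to_hex32(code):
    """Format a signed 32-bit status code as unsigned hexadecimal."""
    # Masking with 0xFFFFFFFF reinterprets the two's-complement value as unsigned.
    return f"0x{code & 0xFFFFFFFF:08X}"


print(to_hex32(-2113263849))  # 0x820A2717
print(to_hex32(-2113732591))  # 0x82030011
```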

Post by Walter-Tracker Supp »

The first thing I would check is the data directory. You have specified a relative path, and while there's nothing wrong with that in theory, I would recommend that you specify the full path (e.g. "z:\ocr_project\build\OCRLanguages\") just to eliminate this as a possibility. I'm also not sure how PowerBuilder handles string literals, but make sure the backslash is not being interpreted as an escape character (i.e., that you don't need to specify "z:\\ocrproject\\build\\....").

Please let us know how this works. If it doesn't help, we can start looking for other possibilities, but that would be the very first thing I would try. Also make sure that the appropriate language file is installed in "OCRLanguages\ocrdats", i.e. "OCRLanguages\ocrdats\eng_pxvocr.dat".

Also, what version of the DLL are you using? Right-click it in Windows Explorer and select Properties to view it.

Post by aitchisj »

Walter,

Thanks for your reply.

I have tried fiddling with the data directory. I put the OCRLanguages in C:\ and changed code to say lstr_options.DataPath = "C:\OCRLanguages\" so that it is an absolute rather than relative path and I still encounter the same issue. I have also tried fiddling with the slashes in the path, switching them to forward slashes, removing the trailing slash, etc., and I still get "Unspecified internal error". I have ensured that the appropriate language is installed as expected in the ocrdats subfolder.

I'm using version 1.0.8.0 of the ocrtools.dll.

Post by Walter-Tracker Supp »

Can you attach the sample PDF or email it privately to [email protected]? If it is a document you cannot share, can you try a different document to see if the problem persists?

Also, you may wish to download the latest version of the DLL (1.0.9.0).

-Walter

Post by aitchisj »

I'm using the sample_pages.pdf that I pulled out of the examples in the PDF XChange PRO 4 install folder:

PDF-XChange PRO 4 SDK\Examples\OcrSDKExamples\CExamples\test-input\sample_pages.pdf

I bundled the sample_pages.pdf with the code sample that I attached in an earlier post. I also downloaded version 1.0.9.0 of the DLL and still get the same error. I have tried other documents to see if that's the problem, but I still get the same error. I would expect it to work with the file you guys include in your examples, and it seems to work fine when I run the example binaries.

I'm at a loss. If I can't get past this problem, I may have to go back to the drawing board and I really don't want to! :(

Post by Walter-Tracker Supp »

The second argument to OCR_MakeSearchable() should be a pointer to the PXO_Options structure - maybe that is the issue here? It looks like you are passing by value, though I would expect a compilation error in that case. I'm not at all familiar with PowerBuilder.

Post by aitchisj »

Yes, I've declared the OCR_MakeSearchable method to have PXO_Options as a reference parameter:

FUNCTION long OCR_MakeSearchable(long document, ref PXO_Options options, long pageList) LIBRARY "ocrtools.dll" ALIAS FOR "OCR_MakeSearchable;ansi"

The parameter in the method signature determines how the value is passed into the method; I cannot declare a pointer in my main code, as PowerBuilder doesn't work that way. PowerBuilder is similar to Visual Basic in how parameters are set up and passed into external methods, which is why I was following the VB example as closely as possible.

Here's the VB code I was trying to copy:

    Dim res As Long
    Dim doc As Long

    Call OCR_Init(doc, "", "")
    
    res = OCR_LoadA(doc, tbInput.Text)
    If IS_DS_FAILED(res) Then
        MsgBox "Failed to load pdf file"
        Exit Sub
    End If
    
    SetCallback doc
    
    Dim options As PXO_Options
    options.blackList = ""
    options.whiteList = ""
    options.ImageFlags = OCR_Image_Autorotate
    options.lang = PXO_English
    options.raster_dpi = 300
    options.RegionMode = OCR_Auto
    options.DataPath = StrConv(tbData.Text, vbUnicode)
    options.accMode = 0
    
    res = OCR_MakeSearchable(doc, options, 0)
    
    If IS_DS_FAILED(res) Then
        MsgBox "Make searchable failed " & Str(res)
    Else
        OCR_SaveA doc, tbOutput.Text
    End If
    
    OCR_Delete doc
One thing I'm wondering about is what the "SetCallback doc" command is doing? That's one thing I don't do in my code.

Post by Walter-Tracker Supp »

The callback function just lets you pass a pointer to a function that will be executed at various stages during processing to provide feedback. It's not necessary for the basic functionality although you won't be able to provide, e.g., an accurate job status indicator.

I really can't imagine what is going wrong here but I will look at your code again to see if there is anything I've missed. You might try omitting the pagelist argument as it is optional (has a default NULL value).

You are absolutely positive that you have the following directory and file?

C:\OCRLanguages\ocrdats\eng_pxvocr.dat?

By far the most common cause of "OCR_ERR_INTERNAL" is a missing or incorrectly specified language directory or file. We should probably return a more specific error code for this case to make it easier to diagnose, I realize...

Perhaps you can also try setting blacklist and whitelist to NULL rather than an empty string? These fields are supposed to be a special Microsoft type (BSTR) that is not quite a normal null-terminated string, in that it is prefixed with a string length indicator. It is still a pointer, like a normal string, but some internal functions expect the BSTR string length data to be properly arranged in memory. So I'm not sure how it would behave with a string passed in this fashion, but setting it to NULL is acceptable. The same type mismatch may also cause the data path to be incorrectly specified.
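To make the BSTR layout concrete, here is a minimal sketch in Python (for illustration only; `bstr_bytes` is a hypothetical helper, not an SDK function). It builds the bytes that sit in memory for a BSTR, following the standard Windows layout: a 4-byte length prefix counting bytes, the UTF-16LE characters, then a 2-byte null terminator.

```python
import struct


def bstr_bytes(text):
    """Build the in-memory byte layout of a BSTR holding `text`.

    Layout: 4-byte little-endian length in bytes (terminator excluded),
    then the UTF-16LE characters, then a 2-byte null terminator.
    """
    data = text.encode("utf-16-le")
    return struct.pack("<I", len(data)) + data + b"\x00\x00"


layout = bstr_bytes("abc")
print(layout.hex(" "))  # 06 00 00 00 61 00 62 00 63 00 00 00
```

A BSTR pointer refers to the first character (offset 4 in this layout), not to the length prefix, which is why a plain string pointer can look valid to the caller while the bytes just before it hold no meaningful length.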

Post by Walter-Tracker Supp »

I have transcribed your code to the C++ equivalents and tested it and had no problems whatsoever (using the sample_pages.pdf you attached, not our copy) so I'm at a loss, other than the suggestions I have made.

I would think it most likely has something to do with the language file / directory not being found properly, but I'm at a loss for why that might be in this case, except for maybe an issue with converting from string literals in your code to the expected BSTR type or a mistake in the path you've placed it in.

The last possibility might be a corrupt eng_pxvocr.dat; you can download it from our website to make sure, or try one of the other default languages distributed with the SDK (French, German, Spanish). These will still work on the sample document, though they may make more mistakes.

https://www.pdf-xchange.com/pdfxocr ... nguage_ext



-Walter

Post by aitchisj »

I've double-checked to ensure that the file and directory exist as you asked:
C:\OCRLanguages\ocrdats\eng_pxvocr.dat

It's definitely there. I've also tried making the blacklist/whitelist parameters null to see if that changes anything and it does not.

I can definitely agree that it would be handy to have a more specific error code in the circumstance that it cannot find the specified language or directory, rather than returning "unspecified error" because it's not clear whether that's the problem or if some other "unspecified" issue has occurred.

I appreciate your continued help with this. I will investigate the BSTR type that you mentioned to see if something perhaps isn't lining up in that instance. Please let me know if you have any other suggestions and I will continue to hammer on this until it either works or simply won't work. :?

Thanks,
John

Post by Walter-Tracker Supp »

I wish I could think of something else to suggest, but as far as I can see you're doing everything right, unless there's some PowerBuilder idiosyncrasy that I cannot recognize. I do not suspect a bug on our side, given that the input you are providing matches well-tested input.

I'd not give up on the language angle. OCR_ERR_INTERNAL is returned only in a couple of types of cases.

The first are errors with initialization (in practice, usually a language file or directory problem). I agree we should have a separate error code for this type, but for now, it is what it is.

The second are unpredictable errors in recognition itself (which would tend to come from garbled or bizarre input). In this case you're using a well-tested sample input with simple structure and nothing confusing in it, so this is extremely unlikely.

This leaves an initialization error of some kind related to the input you are passing, and most likely language and path, but perhaps also whitelist/blacklist or even the options structure as a whole. Please do check the file to ensure it is not truncated or corrupted somehow. It should be ~3040 kB (eng_pxvocr.dat). Download a new one from the link provided a couple of posts up, or try one of the other languages, to be sure.

Post by aitchisj »

Great News! This appears to be my problem:

FUNCTION long OCR_MakeSearchable(long document, ref PXO_Options options, long pageList) LIBRARY "ocrtools.dll" ALIAS FOR "OCR_MakeSearchable;ansi"
The ;ansi on the end of my external function declaration caused the issue. I've gotten into the habit of declaring functions that way because lots of times that's how you do it; however, in this case it's wrong. According to the PowerBuilder help manual, ";ansi is required if the function passes a string as an argument or returns a string that uses ANSI encoding." Remove it and it makes the document OCR Searchable as expected. :D I am investigating my other function declarations to see if they might have similar problems, although changing this one seemed to do the trick.

Phew! Past this hurdle, hopefully the rest goes smoother!
Thanks a lot for your help and suggestions.
-John 8)

Post by Walter-Tracker Supp »

Ah, great news. I would not have identified that from your code, although in retrospect it makes sense. Most strings passed into and returned from our functions are UTF-16 Unicode (MS VC++ type LPWSTR or BSTR). So what probably happened is that the ANSI string you passed for the language directory was interpreted as Unicode (whatever garbage that ends up being), and the attempt to initialize the language from that nonsense directory name then failed.
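The effect is easy to reproduce outside the SDK. A minimal sketch in Python (for illustration; `misread_ansi_as_utf16` is a hypothetical helper, and cp1252 is assumed as the ANSI code page) that encodes a path one byte per character and then reads those bytes back two at a time, as a UTF-16 consumer would:

```python
def misread_ansi_as_utf16(text):
    """Show what a UTF-16 reader sees when handed ANSI (cp1252) bytes."""
    ansi = text.encode("cp1252")            # one byte per character
    even = ansi[: len(ansi) - len(ansi) % 2]  # UTF-16 consumes bytes in pairs
    return even.decode("utf-16-le", errors="replace")


path = "C:\\OCRLanguages\\"
garbled = misread_ansi_as_utf16(path)
print(len(path), len(garbled))  # 16 8
print(repr(garbled))
```

Each pair of ANSI bytes collapses into a single unrelated character, so the 16-character path becomes 8 characters that name no real directory. A real reader would also keep going until it found a two-byte null, so the actual garbage could run past the end of the original string.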

Glad you sorted it out.

-Walter