Hi,
I have used PDF-XChange and Tesseract for OCR in my project, and it works fine. My only concern is that the font size of the OCR'd text added to the text layer of the PDF file is too small. When I press Ctrl+F and search for a keyword, it highlights the text, but the highlight is so small that it looks like a dot and is easily missed. Shouldn't the font size be calculated automatically? It works as expected in your End User control. See the attached image for more clarity.
Here is the code:
' nID is not declared in the snippet as posted; the next line assumes the "op.document.OCRPages" operation ID.
Dim nID As Integer = Inst1.Str2ID("op.document.OCRPages", False)
Dim Op As PDFXEdit.IOperation = Inst1.CreateOp(nID)
Dim input As PDFXEdit.ICabNode = Op.Params.Root("Input")
' Resolve the input and output paths through the default file system.
Dim fsInst As PDFXEdit.IAFS_Inst = CType(Inst1.GetExtension("AFS"), PDFXEdit.IAFS_Inst)
Dim impPath As PDFXEdit.IAFS_Name = fsInst.DefaultFileSys.StringToName(OpenFileDialog1.FileName)
Dim stroutputpath As String = System.IO.Path.GetDirectoryName(OpenFileDialog1.FileName) & "\" & System.IO.Path.GetFileNameWithoutExtension(OpenFileDialog1.FileName) & DateTime.Now.ToString("MMddyyymmss") & ".pdf"
Dim outPath As PDFXEdit.IAFS_Name = fsInst.DefaultFileSys.StringToName(stroutputpath)
' Open the source document and pass it to the operation as its input.
Dim pxcInst As PDFXEdit.IPXC_Inst = CType(Inst1.GetExtension("PXC"), PDFXEdit.IPXC_Inst)
Dim resDoc As PDFXEdit.IPXC_Document = pxcInst.OpenDocumentFrom(impPath, Nothing)
input.v = resDoc
Dim options As PDFXEdit.ICabNode = Op.Params.Root("Options")
options("OutputType").v = 0
options("OCRNoTextPagesOnly").v = True
Try
    Op.Do()
Catch ex As Exception
    ' Close the document and stop if the OCR operation fails.
    resDoc.Close()
    Exit Sub
End Try
' Save the OCR'd document to the timestamped output path.
resDoc.WriteTo(outPath)
resDoc.Close()
Please guide.
Size of OCRd text
Moderators: PDF-XChange Support, Daniel - PDF-XChange, Chris - PDF-XChange, Sean - PDF-XChange, Paul - PDF-XChange, Vasyl - PDF-XChange, Ivan - Tracker Software, Stefan - PDF-XChange
Forum rules
DO NOT post your license/serial key, or your activation code - these forums, and all posts within, are public and we will be forced to immediately deactivate your license.
When experiencing some errors, use the IAUX_Inst::FormatHRESULT method to see their description and include it in your post along with the error code.
-
charuvasudev
- User
- Posts: 16
- Joined: Wed Feb 11, 2009 5:48 am
Size of OCRd text
-
charuvasudev
- User
- Posts: 16
- Joined: Wed Feb 11, 2009 5:48 am
Re: Size of OCRd text
Hi,
Any update on my issue?
-
Daniel - PDF-XChange
- Site Admin
- Posts: 12608
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Size of OCRd text
Hello, charuvasudev
You posted this during the holidays and we have a bit of a backlog, but the Dev team is working through it. I cannot promise when they will be able to respond, but we will get back to you once they have taken a look.
Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
-
Vasyl - PDF-XChange
- Site Admin
- Posts: 2476
- Joined: Thu Jun 30, 2005 4:11 pm
Re: Size of OCRd text
Hi, charuvasudev.
Maybe the difference is because the Editor EU uses the newer "op.document.OCRPages2" operation internally, instead of the much older and now-obsolete "op.document.OCRPages", which you use. I recommend switching to the new version with the "SkipPagesWithText" parameter, which is the equivalent of "OCRNoTextPagesOnly" from the older version.
HTH.
PDF-XChange Co. LTD (Project Developer)
Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
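The switch suggested above might look roughly like the following. This is a minimal sketch, not a confirmed implementation: it reuses only the calls already shown in this thread (Str2ID, CreateOp, Params.Root) plus the "SkipPagesWithText" option named in the reply. The Sub name and its parameters inst and doc are hypothetical, and any other options that "op.document.OCRPages2" may accept are left at their defaults because the thread does not list them.

```vbnet
' Sketch only: inst is assumed to be an initialized PDFXEdit.IPXV_Inst and
' doc an already opened document (both are hypothetical parameter names).
Public Sub RunOcrPages2(inst As PDFXEdit.IPXV_Inst, doc As PDFXEdit.IPXC_Document)
    ' Look up the newer operation instead of "op.document.OCRPages".
    Dim nID As Integer = inst.Str2ID("op.document.OCRPages2", False)
    Dim op As PDFXEdit.IOperation = inst.CreateOp(nID)

    ' Pass the document as the operation input, as in the original code.
    Dim inputNode As PDFXEdit.ICabNode = op.Params.Root("Input")
    inputNode.v = doc

    ' "SkipPagesWithText" takes the place of "OCRNoTextPagesOnly".
    Dim options As PDFXEdit.ICabNode = op.Params.Root("Options")
    options("SkipPagesWithText").v = True

    op.Do()
End Sub
```

The "OutputType" option from the older operation is deliberately not set here, since the thread does not confirm that the newer operation uses the same name.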
-
Anushka
- User
- Posts: 13
- Joined: Thu Nov 27, 2025 7:33 am
Re: Size of OCRd text
Hello Support Team,
As per the quoted suggestion, I switched to the newer operation "op.document.OCRPages2" with the "SkipPagesWithText" parameter. It was working fine, but now this operation suddenly throws an exception when I call Op.Do(). I tried moving back to the older operation "op.document.OCRPages" with the "OCRNoTextPagesOnly" parameter. The exception does not occur, but the OCR is not done properly: after the operation, searching finds no text and no text can be selected.
Vasyl - PDF-XChange wrote: ↑Thu Jan 08, 2026 10:35 pm
Hi, charuvasudev.
Maybe the difference is because the Editor EU uses the newer "op.document.OCRPages2" operation internally, instead of the much older and now-obsolete "op.document.OCRPages", which you use. I recommend starting to use the new version with the "SkipPagesWithText" parameter, as equivalent of "OCRNoTextPagesOnly" from the older version.
HTH.
Below is the code I am trying, for reference:
Code: Select all
Dim nID As Integer = pdfctrl.Inst.Str2ID("op.document.OCRPages", False)
'Dim nID As Integer = pdfctrl.Inst.Str2ID("op.document.OCRPages2", False)
Dim Op As PDFXEdit.IOperation = pdfctrl.Inst.CreateOp(nID)
Dim input As PDFXEdit.ICabNode = Op.Params.Root("Input")
Dim clbk As AuthCallback = New AuthCallback()
Dim doc As PDFXEdit.IPXV_Document = pdfctrl.Doc
input.v = doc
Dim options As PDFXEdit.ICabNode = Op.Params.Root("Options")
options("OutputType").v = 0
options("OCRNoTextPagesOnly").v = False
'options("SkipPagesWithText").v = False
Thank You.
-
Anushka
- User
- Posts: 13
- Joined: Thu Nov 27, 2025 7:33 am
Re: Size of OCRd text
Hello Support Team,
Anushka wrote: ↑Tue Mar 31, 2026 12:04 pm
Hello Support Team,
As per the quoted suggestion, I switched to the newer operation "op.document.OCRPages2" with the "SkipPagesWithText" parameter. It was working fine, but now this operation suddenly throws an exception when I call Op.Do(). I tried moving back to the older operation "op.document.OCRPages" with the "OCRNoTextPagesOnly" parameter. The exception does not occur, but the OCR is not done properly: after the operation, searching finds no text and no text can be selected.
Below is the code I am trying, for reference:
Kindly help me identify what can be wrong here.
Code: Select all
Dim nID As Integer = pdfctrl.Inst.Str2ID("op.document.OCRPages", False)
'Dim nID As Integer = pdfctrl.Inst.Str2ID("op.document.OCRPages2", False)
Dim Op As PDFXEdit.IOperation = pdfctrl.Inst.CreateOp(nID)
Dim input As PDFXEdit.ICabNode = Op.Params.Root("Input")
Dim clbk As AuthCallback = New AuthCallback()
Dim doc As PDFXEdit.IPXV_Document = pdfctrl.Doc
input.v = doc
Dim options As PDFXEdit.ICabNode = Op.Params.Root("Options")
options("OutputType").v = 0
options("OCRNoTextPagesOnly").v = False
'options("SkipPagesWithText").v = False
Thank You.
Related to the above query, I want to add that the new operation "op.document.OCRPages2" with the "SkipPagesWithText" parameter throws an exception stating "Error HRESULT E_FAIL has been returned from a call to a COM component." I tried it even with the latest version of the SDK (10.8.4.409), yet the exception persists. Also, where can I get the latest DLLs of the plugins?
Thank You.
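The forum rules ask for the IAUX_Inst::FormatHRESULT description to be posted along with the error code, which would say more than the generic E_FAIL message. A rough sketch of how that could be captured follows. It assumes that FormatHRESULT takes the HRESULT value and returns a description string, and that the "AUX" extension is obtained through GetExtension in the same way as "AFS" and "PXC" earlier in this thread. The Sub name and its parameters are hypothetical.

```vbnet
' Sketch only: inst is assumed to be an initialized PDFXEdit.IPXV_Inst and
' op an operation that has already been configured (hypothetical parameters).
Public Sub RunWithErrorReport(inst As PDFXEdit.IPXV_Inst, op As PDFXEdit.IOperation)
    Try
        op.Do()
    Catch ex As System.Runtime.InteropServices.COMException
        ' Assumption: the "AUX" extension exposes IAUX_Inst.
        Dim auxInst As PDFXEdit.IAUX_Inst = CType(inst.GetExtension("AUX"), PDFXEdit.IAUX_Inst)
        ' Assumption: FormatHRESULT accepts the HRESULT and returns its description.
        Dim description As String = auxInst.FormatHRESULT(ex.ErrorCode)
        ' Print the code in hex together with the description for the forum post.
        System.Console.WriteLine("OCR failed: 0x{0:X8} {1}", ex.ErrorCode, description)
    End Try
End Sub
```

Catching COMException rather than the general Exception type keeps the HRESULT available through ErrorCode.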
-
Anushka
- User
- Posts: 13
- Joined: Thu Nov 27, 2025 7:33 am
Re: Size of OCRd text
Hello, any update on this?
Anushka wrote: ↑Wed Apr 01, 2026 6:58 am
Hello Support Team,
Related to the above query, I want to add that the new operation "op.document.OCRPages2" with the "SkipPagesWithText" parameter throws an exception stating "Error HRESULT E_FAIL has been returned from a call to a COM component." I tried it even with the latest version of the SDK (10.8.4.409), yet the exception persists. Also, where can I get the latest DLLs of the plugins?
Thank You.
-
Daniel - PDF-XChange
- Site Admin
- Posts: 12608
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Size of OCRd text
Hello, Anushka
I will have to ask for some patience. The Dev team was informed when your first post went up, but we have a very small team and they are quite busy at the moment. When they are able to reply, they will come back here.
Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com