Hi,
I have used PDF-XChange and Tesseract for OCR in my project, and it works fine. My only concern is that the font size of the OCRed text added to the text layer of the PDF file is too small. When I press Ctrl+F and search for a keyword, the match is highlighted, but the highlight is so small that it looks like a dot and is easily missed. Shouldn't the font size be calculated automatically? It works as expected in your end-user control. See the attached image for more clarity.
Here is the code:
' nID is assumed to be the ID of the "op.document.OCRPages" operation, obtained earlier.
Dim Op As PDFXEdit.IOperation = Inst1.CreateOp(nID)
Dim input As PDFXEdit.ICabNode = Op.Params.Root("Input")
' Resolve the input file and build a timestamped output path next to it.
Dim fsInst As PDFXEdit.IAFS_Inst = CType(Inst1.GetExtension("AFS"), PDFXEdit.IAFS_Inst)
Dim impPath As PDFXEdit.IAFS_Name = fsInst.DefaultFileSys.StringToName(OpenFileDialog1.FileName)
Dim stroutputpath As String = System.IO.Path.GetDirectoryName(OpenFileDialog1.FileName) & "\" & System.IO.Path.GetFileNameWithoutExtension(OpenFileDialog1.FileName) & DateTime.Now.ToString("MMddyyymmss") & ".pdf"
Dim outPath As PDFXEdit.IAFS_Name = fsInst.DefaultFileSys.StringToName(stroutputpath)
' Open the source document and pass it to the operation as its input.
Dim pxcInst As PDFXEdit.IPXC_Inst = CType(Inst1.GetExtension("PXC"), PDFXEdit.IPXC_Inst)
Dim resDoc As PDFXEdit.IPXC_Document = pxcInst.OpenDocumentFrom(impPath, Nothing)
input.v = resDoc
' OCR only the pages that do not already contain text.
Dim options As PDFXEdit.ICabNode = Op.Params.Root("Options")
options("OutputType").v = 0
options("OCRNoTextPagesOnly").v = True
Try
    Op.Do()
Catch ex As Exception
    ' OCR failed: release the document and stop.
    resDoc.Close()
    Exit Sub
End Try
' Save the OCRed document to the new path, then release it.
resDoc.WriteTo(outPath)
resDoc.Close()
Please guide.
Size of OCRd text
Moderators: PDF-XChange Support, Daniel - PDF-XChange, Chris - PDF-XChange, Sean - PDF-XChange, Paul - PDF-XChange, Vasyl - PDF-XChange, Ivan - Tracker Software, Stefan - PDF-XChange
Forum rules
DO NOT post your license/serial key, or your activation code - these forums, and all posts within, are public and we will be forced to immediately deactivate your license.
When experiencing some errors, use the IAUX_Inst::FormatHRESULT method to see their description and include it in your post along with the error code.
-
charuvasudev
- User
- Posts: 16
- Joined: Wed Feb 11, 2009 5:48 am
Size of OCRd text
-
charuvasudev
- User
- Posts: 16
- Joined: Wed Feb 11, 2009 5:48 am
Re: Size of OCRd text
Hi,
Any update on my issue?
-
Daniel - PDF-XChange
- Site Admin
- Posts: 12516
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Size of OCRd text
Hello, charuvasudev
You posted this during the holidays, and we have a bit of a backlog, but the Dev team is working through things. I cannot promise when they will be able to respond, but we will get back to you once they have taken a look.
Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
-
Vasyl - PDF-XChange
- Site Admin
- Posts: 2476
- Joined: Thu Jun 30, 2005 4:11 pm
Re: Size of OCRd text
Hi, charuvasudev.
The difference may be because the Editor EU uses the newer "op.document.OCRPages2" operation internally, instead of the much older and now-obsolete "op.document.OCRPages" that you use. I recommend switching to the new version, with the "SkipPagesWithText" parameter as the equivalent of "OCRNoTextPagesOnly" from the older version.
HTH.
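For readers following the recommendation above, here is a minimal sketch of what the switch might look like. It is not taken from the SDK documentation: it assumes that "op.document.OCRPages2" accepts the same "Input" and "Options" layout as the older operation, and the helper name RunOcrPages2 is hypothetical. Only the operation name and the "SkipPagesWithText" parameter come from this thread; check any other option names against the SDK reference before relying on them.

```vbnet
' Sketch only. Assumption: "op.document.OCRPages2" takes the same
' Input/Options layout as the older "op.document.OCRPages" operation.
' RunOcrPages2 is a hypothetical helper name used for illustration.
Sub RunOcrPages2(inst As PDFXEdit.IPXV_Inst, doc As PDFXEdit.IPXC_Document)
    ' Look up the ID of the newer OCR operation by its string name.
    Dim nID As Integer = inst.Str2ID("op.document.OCRPages2", False)
    Dim op As PDFXEdit.IOperation = inst.CreateOp(nID)

    ' Pass the already-opened document as the operation input.
    Dim inputNode As PDFXEdit.ICabNode = op.Params.Root("Input")
    inputNode.v = doc

    ' "SkipPagesWithText" replaces "OCRNoTextPagesOnly" from the old operation.
    Dim options As PDFXEdit.ICabNode = op.Params.Root("Options")
    options("SkipPagesWithText").v = True

    op.Do()
End Sub
```

In the original code this would be called as RunOcrPages2(Inst1, resDoc) in place of the CreateOp/Options/Do section, keeping the existing Try/Catch and the WriteTo and Close calls around it.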
PDF-XChange Co. LTD (Project Developer)
Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.