Hi, I'm trying to get word and character information from a PDF using the PDF-XChange Viewer ActiveX control (demo version) in VB.NET.
To get the word data I'm using the following code:
AxCoPDFXCview1.OpenDocument("") ' open a document ....
Dim vDataOut as Object = Nothing
AxCoPDFXCview1.DoVerb("Documents[0].Pages[0].Text.Words[0].Count", "Get", Nothing, vDataOut)
The above code works fine for the Count, String, Offset and Length properties, but when I try to get the Quads information I always get Nothing as the returned value. Am I doing it the right way?
I get the same result for chars.
I also tried the following code (for chars), and it always returns 0:
Dim spPageChars As PDFXCviewAxLib.IPDFXCsmartp = Nothing
Dim spChar As PDFXCviewAxLib.IPDFXCsmartp = Nothing
AxCoPDFXCview1.DoVerb("Documents[0].Pages[0].Text.Chars", ".SP", Nothing, spPageChars)
spPageChars.GetItemPointByIndex(0, spChar)
Dim vDataOut As Object = Nothing
spChar.GetProperty("Quad", vDataOut, 0)
The same occurs for every page and every word/char in the document.
Can you please give me some help on this? Is there something wrong in my code?
Thanks a lot.
Fabrizio
problem getting chars Quad in .NET
-
- User
- Posts: 33
- Joined: Fri Jul 02, 2010 1:58 pm
Re: problem getting chars Quad in .NET
fabrizio wrote:In order to get words data I'm using the following code:
AxCoPDFXCview1.OpenDocument("") ' open a document ....
Dim vDataOut as Object = Nothing
AxCoPDFXCview1.DoVerb("Documents[0].Pages[0].Text.Words[0].Count", "Get", Nothing, vDataOut)
Sorry, the last line above should clearly have been:
AxCoPDFXCview1.DoVerb("Documents[0].Pages[0].Text.Words.Count", "Get", Nothing, vDataOut)
-
- User
- Posts: 664
- Joined: Tue Nov 14, 2006 12:23 pm
Re: problem getting chars Quad in .NET
Hi Fabrizio,
According to the documentation, you need to request "Quad.value" rather than "Quad". Please try that.
HTH.
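Applied to the char code from the first post, that suggestion might look roughly like the sketch below. It only reuses calls that already appear in this thread (DoVerb with ".SP", GetItemPointByIndex, GetProperty). The assumptions are that the value comes back as an array of Double, that the property name is accepted as "Quad.Value", and that DoVerb takes a trailing flags argument of 0. PrintFirstCharQuad is just a made-up name for illustration.

```vbnet
' Sketch only, untested against the control. AxCoPDFXCview1 is the ActiveX
' control on the form, with a document already open.
Private Sub PrintFirstCharQuad()
    Dim vDataIn As Object = Nothing
    Dim vDataOut As Object = Nothing

    ' Get a smart pointer to the chars collection of the first page.
    AxCoPDFXCview1.DoVerb("Documents[0].Pages[0].Text.Chars", ".SP", vDataIn, vDataOut, 0)
    Dim spPageChars As PDFXCviewAxLib.IPDFXCsmartp = CType(vDataOut, PDFXCviewAxLib.IPDFXCsmartp)

    ' Get the first char of the page.
    Dim spChar As PDFXCviewAxLib.IPDFXCsmartp = Nothing
    spPageChars.GetItemPointByIndex(0, spChar)

    ' Ask for "Quad.Value" instead of "Quad".
    vDataOut = Nothing
    spChar.GetProperty("Quad.Value", vDataOut, 0)

    ' Assumption: the result is a Double() of coordinates.
    Dim quad As Double() = TryCast(vDataOut, Double())
    If quad Is Nothing Then
        Console.WriteLine("No quad data returned")
    Else
        For Each coordinate As Double In quad
            Console.WriteLine(coordinate)
        Next
    End If

    ' Release the COM objects once finished with them.
    System.Runtime.InteropServices.Marshal.ReleaseComObject(spChar)
    System.Runtime.InteropServices.Marshal.ReleaseComObject(spPageChars)
End Sub
```

If the cast fails, inspecting vDataOut.GetType() in the debugger should show what the control actually returns.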
-
- User
- Posts: 61
- Joined: Tue Jun 12, 2007 9:21 pm
Re: problem getting chars Quad in .NET
Here is a function I use, which Tracker helped me get working. Hopefully it will help with your own routine. PDFXView is the name I gave the ActiveX control on the form that contains this routine.
' Needs Imports System.Text (StringBuilder) and Imports System.Drawing (Rectangle).
Private Function GetTextInRectangle(ByVal sRect() As String) As String
    Dim sb As New StringBuilder
    ' sRect is read as left, top, right, bottom and converted to x, y, width, height.
    Dim rect As New Rectangle(CInt(sRect(0)), CInt(sRect(1)), CInt(CDbl(sRect(2)) - CDbl(sRect(0))), CInt(CDbl(sRect(3)) - CDbl(sRect(1))))
    Dim nActiveDocID As Integer
    PDFXView.GetActiveDocument(nActiveDocID)
    If nActiveDocID < 0 Then
        MessageBox.Show("No open document")
        Return Nothing
    End If
    Dim spWords As PDFXCviewAxLib.IPDFXCsmartp
    Dim vDataOut As Object = Nothing
    Dim vDataIn As Object = Nothing
    ' CurrentPage() is a helper defined elsewhere on the form (not shown here).
    Dim curPage As Integer = CurrentPage()
    ' Get a smart pointer to the words collection of the current page.
    PDFXView.DoVerb("Documents[#" & nActiveDocID & "].Pages[" & curPage & "].Text.Words", ".SP", vDataIn, vDataOut, 0)
    spWords = CType(vDataOut, PDFXCviewAxLib.IPDFXCsmartp)
    vDataOut = Nothing
    spWords.GetProperty("Count", vDataOut, 0)
    If TypeOf vDataOut Is Integer Then
        Dim iWords As Integer = CInt(vDataOut)
        For i As Integer = 0 To iWords - 1
            Dim spWord As PDFXCviewAxLib.IPDFXCsmartp = Nothing
            spWords.GetItemPointByIndex(i, spWord)
            ' Note "Quads.Value", not just "Quads".
            spWord.GetProperty("Quads.Value", vDataOut, 0)
            Dim quad As Double() = CType(vDataOut, Double())
            Dim wordRect As New Rectangle(CInt(quad(0)), CInt(quad(1)), CInt(quad(2) - quad(0)), CInt(quad(5) - quad(1)))
            ' Keep the word only if its rectangle overlaps the requested one.
            If wordRect.IntersectsWith(rect) Then
                spWord.GetProperty("String", vDataOut, 0)
                sb.Append(vDataOut.ToString)
            End If
            System.Runtime.InteropServices.Marshal.ReleaseComObject(spWord)
        Next
    End If
    System.Runtime.InteropServices.Marshal.ReleaseComObject(spWords)
    Console.WriteLine(sb.ToString)
    Return sb.ToString
End Function
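The function above builds each word rectangle from quad indices 0, 1, 2 and 5, which depends on the corner order and axis direction of the returned values. A more defensive option might be to take the min and max over all four corners, as in the sketch below. It assumes the quad is eight doubles laid out as x0, y0, x1, y1, x2, y2, x3, y3, which is a guess based on the indices used above rather than something confirmed in this thread. QuadHelpers and QuadBounds are hypothetical names.

```vbnet
Imports System
Imports System.Drawing

Module QuadHelpers
    ' Returns the axis-aligned bounding box of a quad.
    ' Assumption: quad holds four corner points as x0, y0, x1, y1, x2, y2, x3, y3.
    Public Function QuadBounds(ByVal quad As Double()) As RectangleF
        If quad Is Nothing OrElse quad.Length < 8 Then
            Throw New ArgumentException("Expected at least 8 coordinates.", "quad")
        End If

        ' Start from the first corner.
        Dim minX As Double = quad(0)
        Dim maxX As Double = quad(0)
        Dim minY As Double = quad(1)
        Dim maxY As Double = quad(1)

        ' Fold in the remaining three corners (x at even indices, y at odd).
        For i As Integer = 2 To 6 Step 2
            minX = Math.Min(minX, quad(i))
            maxX = Math.Max(maxX, quad(i))
            minY = Math.Min(minY, quad(i + 1))
            maxY = Math.Max(maxY, quad(i + 1))
        Next

        ' Width and height are never negative, whatever the corner order.
        Return New RectangleF(CSng(minX), CSng(minY), CSng(maxX - minX), CSng(maxY - minY))
    End Function
End Module
```

In the loop above this could replace the wordRect line with something like Dim wordRect As RectangleF = QuadHelpers.QuadBounds(quad), with rect declared as a RectangleF as well so that IntersectsWith still applies. Using RectangleF also avoids rounding the coordinates to integers.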
-
- User
- Posts: 33
- Joined: Fri Jul 02, 2010 1:58 pm
Re: problem getting chars Quad in .NET
Thank you for the help and for the code. Using Quad.value was the right way to return quad coordinates.
Thanks, Fabrizio