Text Location and Contents

PDF-XChange Viewer SDK for Developer's
(ActiveX and Simple DLL Versions)


tdonovan
User
Posts: 61
Joined: Tue Jun 12, 2007 9:21 pm

Text Location and Contents

Post by tdonovan »

I have been working with the viewer to determine what text lies within an area of the screen and where that text is located. I have it working, but I am not 100% confident that I have dealt with all of the issues. Rotation of the page and rotation of the text both seem to affect the results.

It seems that your highlight feature has to deal with all of these issues. Is there a sample that shows what it does, or are there routines it uses that we can use as well?

Thanks
Vasyl - PDF-XChange
Site Admin
Posts: 2445
Joined: Thu Jun 30, 2005 4:11 pm

Re: Text Location and Contents

Post by Vasyl - PDF-XChange »

Hi, Tom.
You can obtain a quad (four points on the PDF page forming a parallelogram, relative to the page's MediaBox) for each character on the page.
See [Reference/NamedItems/Objects/Documents/Item/Pages/Item/Text/Chars...]:

Code:

ctrl.GetDocumentProperty(docId, "Pages[<pageIndex>].Text.Chars[<charIndex>].Quad.Value", out dataOut, 0); // dataOut will contain 8 numbers
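
Once you have the 8 numbers, a small helper can turn them into corner points and a bounding box. The following is only a sketch: quadToPoints and quadBounds are hypothetical names, not SDK functions, and the ordering x0, y0, x1, y1, x2, y2, x3, y3 is an assumption that should be checked against the reference.

```javascript
// Hypothetical helper (not part of the SDK): groups 8 numbers into 4 corner points.
// ASSUMPTION: values are ordered x0, y0, x1, y1, x2, y2, x3, y3.
function quadToPoints(values) {
  if (values.length !== 8) {
    throw new Error("expected 8 numbers, got " + values.length);
  }
  const points = [];
  for (let i = 0; i < 8; i += 2) {
    points.push({ x: values[i], y: values[i + 1] });
  }
  return points;
}

// Axis-aligned bounding box of the quad, in the same page units as the input.
function quadBounds(points) {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  return {
    left: Math.min(...xs),
    right: Math.max(...xs),
    bottom: Math.min(...ys),
    top: Math.max(...ys),
  };
}
```

For rotated text the quad is a slanted parallelogram, so the bounding box will be larger than the glyph area itself.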
Alternatively, you can obtain composite quads for a range of characters on the page:

Code:

ctrl.DoDocumentVerb(docId, "Pages[<pageIndex>].Text", "GetQuads", dataIn(<Start>, <Len>), out dataOut, 0); // dataOut will contain some quads
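
To decide which of the returned quads fall inside an area of interest, something like the following could be used. It is a rough sketch under two assumptions that the reference should confirm: that the quads have been copied into a flat list with 8 numbers per quad, and that x values sit at even positions and y values at odd positions. splitQuads, boundsOf and overlaps are hypothetical names, and the rectangle must be in the same page coordinates as the quads.

```javascript
// Hypothetical helper: splits a flat list of numbers into quads of 8 values each.
function splitQuads(values) {
  if (values.length % 8 !== 0) {
    throw new Error("length must be a multiple of 8");
  }
  const quads = [];
  for (let i = 0; i < values.length; i += 8) {
    quads.push(values.slice(i, i + 8));
  }
  return quads;
}

// Axis-aligned bounds of one quad.
// ASSUMPTION: x values at even indexes, y values at odd indexes.
function boundsOf(quad) {
  const xs = quad.filter((_, i) => i % 2 === 0);
  const ys = quad.filter((_, i) => i % 2 === 1);
  return {
    left: Math.min(...xs),
    right: Math.max(...xs),
    bottom: Math.min(...ys),
    top: Math.max(...ys),
  };
}

// True when the quad's bounding box overlaps the rectangle (touching edges count).
function overlaps(quad, rect) {
  const b = boundsOf(quad);
  return (
    b.left <= rect.right &&
    b.right >= rect.left &&
    b.bottom <= rect.top &&
    b.top >= rect.bottom
  );
}
```

This tests the bounding box rather than the parallelogram itself, so for rotated text it can report an overlap where the actual quad only comes close to the area.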
The quad does not include the page rotation, document view rotation, etc. You can use these coordinates directly, for example when creating new annotations.
Currently there is one way to obtain coordinates that include the actual rotation, scaling, and offset (in points, not in pixels):

Code:

// <pageX>, <pageY>: coordinates of a point on the page, relative to the page's MediaBox
// The verbatim string (@"...") lets the script span several lines.
string js = @"
m = (new Matrix2D).fromRotated(this, 0);
a = new Array;
a[0] = <pageX>;
a[1] = <pageY>;
... // other points
b = m.transform(a);
b;";
string res;
ctrl.RunJavaScript(js, out res, ...); // res will contain the new coordinates, separated by commas
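
The comma-separated result then has to be split back into points. A minimal sketch of that step, assuming the string is a plain list of numbers in x, y order (the same order as the input array); parsePoints is a hypothetical name, not an SDK function:

```javascript
// Hypothetical helper: parses "x0,y0,x1,y1,..." into [{x, y}, ...].
// ASSUMPTION: values are comma-separated and alternate x, y.
function parsePoints(res) {
  const tokens = res.split(",").map((s) => s.trim());
  if (tokens.length % 2 !== 0) {
    throw new Error("expected an even number of values");
  }
  const nums = tokens.map((t) => {
    const n = Number(t);
    if (t === "" || !Number.isFinite(n)) {
      throw new Error("not a number: '" + t + "'");
    }
    return n;
  });
  const points = [];
  for (let i = 0; i < nums.length; i += 2) {
    points.push({ x: nums[i], y: nums[i + 1] });
  }
  return points;
}
```

The parsed values are still in points, not pixels, as noted above.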
HTH
PDF-XChange Co. LTD (Project Developer)

tdonovan
User
Posts: 61
Joined: Tue Jun 12, 2007 9:21 pm

Re: Text Location and Contents

Post by tdonovan »

The note that the coordinates are relative to the media box helped a great deal. Thank you.
Stefan - PDF-XChange
Site Admin
Posts: 19885
Joined: Mon Jan 12, 2009 8:07 am

Re: Text Location and Contents

Post by Stefan - PDF-XChange »

:)