Get text from region?

PDF-XChange Viewer SDK for Developer's
(ActiveX and Simple DLL Versions)

cew
User
Posts: 213
Joined: Tue Feb 01, 2011 8:14 am

Get text from region?

Post by cew »

Hi,

I have a multi-page document that I want to split into a separate document for each page.
I know that a header is always at the same location on each page, and I want to use this header as the new document's filename.


Is it possible to get the text (in my case the header) for a specified region?

Best regards
cew
Stefan - PDF-XChange
Site Admin
Posts: 19913
Joined: Mon Jan 12, 2009 8:07 am

Re: Get text from region?

Post by Stefan - PDF-XChange »

Hello cew,

I am afraid that your options for extracting text with the Viewer are limited. However, please have a look at the Get method for text.
In pseudocode:

Code:

...
// Gets first 10 characters on the page:
DoVerb("Documents[0].Pages[0].Text", "Get", DataIn(0, 10), DataOut, 0);
...
This should work for your needs.

Best,
Stefan
cew
User
Posts: 213
Joined: Tue Feb 01, 2011 8:14 am

Re: Get text from region?

Post by cew »

Sorry, that's not what I need.

I don't know how many characters precede the text I am looking for, or how long that text will be.
All I know is the geometric position where it is located.

I need something like this:

Code:

DoVerb("Documents[0].Pages[0].Text", "GetTextInRegion", DataIn(100, 100, 200, 120), DataOut, 0);
where DataIn(100, 100, 200, 120) holds the upper-left and lower-right coordinates of the region's rectangle, in pixels or a similar unit.
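
To make the units concrete: PDF page coordinates are normally expressed in points (1/72 inch) measured from the bottom-left corner of the page, while a pixel rectangle is usually measured from the top-left. A minimal sketch of the conversion, assuming an unrotated page whose lower-left corner sits at (0, 0). The names pixelRectToPdfRect, dpi and pageHeightPts are hypothetical and not part of any SDK:

```javascript
// Hypothetical helper: converts a pixel rectangle measured from the top-left
// corner of a rendered page into PDF points measured from the bottom-left.
// Assumes an unrotated page whose lower-left corner is at (0, 0).
function pixelRectToPdfRect(px, dpi, pageHeightPts) {
  var scale = 72 / dpi; // one point is 1/72 inch
  return {
    left: px.left * scale,
    right: px.right * scale,
    // The y axis is flipped: a small pixel y (near the top) becomes a large PDF y.
    top: pageHeightPts - px.top * scale,
    bottom: pageHeightPts - px.bottom * scale
  };
}
```

With a rectangle converted this way, top is greater than bottom, which matters for any comparison that checks bottom <= y <= top.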

Best
cew
Ivan - Tracker Software
Site Admin
Posts: 3586
Joined: Thu Jul 08, 2004 10:36 pm

Re: Get text from region?

Post by Ivan - Tracker Software »

I'm afraid there is no simple way to get text from a region, but you can try using JavaScript to do the job:

Code:

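// Returns the words on page nPage that have a quad lying entirely inside
// the rectangle (left, top, right, bottom), joined by single spaces.
// The comparisons assume top >= bottom, i.e. the y axis points upward.
function TextFromRegion(nPage, left, top, right, bottom)
{
  var nWordsCount = this.getPageNumWords(nPage);
  var sText = "";
  for (var i = 0; i < nWordsCount; i++)
  {
    // A word can have several quads; each quad holds the x,y
    // coordinates of its four corners (8 numbers).
    var q = this.getPageNthWordQuads(nPage, i);
    var numQuads = q.length;
    var wordInRect = false;
    for (var j = 0; j < numQuads; j++)
    {
      var a = q[j];
      // Even indices are x coordinates, odd indices are y coordinates.
      if ((a[0] >= left) && (a[0] <= right) &&
          (a[2] >= left) && (a[2] <= right) &&
          (a[4] >= left) && (a[4] <= right) &&
          (a[6] >= left) && (a[6] <= right) &&
          (a[1] >= bottom) && (a[1] <= top) &&
          (a[3] >= bottom) && (a[3] <= top) &&
          (a[5] >= bottom) && (a[5] <= top) &&
          (a[7] >= bottom) && (a[7] <= top))
      {
        wordInRect = true;
        break;
      }
    }
    if (wordInRect)
    {
      if (sText != "") sText += " ";
      sText += this.getPageNthWord(nPage, i);
    }
  }
  return sText;
}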
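
If you want to try the filtering logic outside the viewer, here is a rough variant of the function above that takes the document as an explicit argument, so a stand-in object can be passed in for testing. This is only a sketch: quadInRect, textFromRegionOf and toFileName are hypothetical names, and the set of characters replaced in toFileName assumes Windows file-name rules.

```javascript
// Returns true when all four corners of one quad lie inside the rectangle.
// quad is [x1, y1, x2, y2, x3, y3, x4, y4]; assumes top >= bottom.
function quadInRect(quad, left, top, right, bottom) {
  for (var k = 0; k < 8; k += 2) {
    var x = quad[k], y = quad[k + 1];
    if (x < left || x > right || y < bottom || y > top) return false;
  }
  return true;
}

// Roughly the same filtering as TextFromRegion, but the document object
// is a parameter instead of "this".
function textFromRegionOf(doc, nPage, left, top, right, bottom) {
  var words = [];
  var n = doc.getPageNumWords(nPage);
  for (var i = 0; i < n; i++) {
    var quads = doc.getPageNthWordQuads(nPage, i);
    for (var j = 0; j < quads.length; j++) {
      if (quadInRect(quads[j], left, top, right, bottom)) {
        words.push(doc.getPageNthWord(nPage, i));
        break; // one matching quad is enough for this word
      }
    }
  }
  return words.join(" ");
}

// Hypothetical helper: makes the extracted header usable as a file name by
// replacing characters that Windows does not allow and collapsing whitespace.
function toFileName(text) {
  var cleaned = text.replace(/[\\\/:*?"<>|]/g, "_").replace(/\s+/g, " ").trim();
  return (cleaned || "untitled") + ".pdf";
}
```

Inside the viewer, something like toFileName(textFromRegionOf(this, 0, left, top, right, bottom)) could then supply the name for each split page, assuming "this" refers to the open document as in the function above.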
PDF-XChange Co Ltd. (Project Director)
cew
User
Posts: 213
Joined: Tue Feb 01, 2011 8:14 am

Re: Get text from region?

Post by cew »

Great, thank you!

Best
cew
Stefan - PDF-XChange
Site Admin
Posts: 19913
Joined: Mon Jan 12, 2009 8:07 am

Re: Get text from region?

Post by Stefan - PDF-XChange »

Glad we could help, cew!

Best,
Stefan