How can I extract text from attachments?
Best,
cew
Text extraction for attachments?
-
- Site Admin
- Posts: 19913
- Joined: Mon Jan 12, 2009 8:07 am
Re: Text extraction for attachments?
Hi cew,
Do you have in mind pdf attachments like e.g. in a portfolio?
It's not possible directly, so you will need to open the attachment first - and then perform the text extraction in the separately loaded file.
Best,
Stefan
-
- User
- Posts: 213
- Joined: Tue Feb 01, 2011 8:14 am
Re: Text extraction for attachments?
Tracker Supp-Stefan wrote: Do you have in mind pdf attachments like e.g. in a portfolio?
How did you know that?
Ok, you just read my post about portfolios.

Tracker Supp-Stefan wrote: It's not possible directly, so you will need to open the attachment first - and then perform the text extraction in the separately loaded file.
Do you have a code snippet on how to do that?
Best,
cew
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: Text extraction for attachments?
To extract text you will need to use the PDF Tools SDK or ActiveX Viewer.
See section "Text Extraction" in the PDF Tools SDK manual.
In the ActiveX viewer, use the methods "GetAllText" and "GetAllSelectedText". See Section 3.7 of the ActiveX manual titled "How to Extract Text from Document?"
e.g.
// Gets all text into a string variable (DataOut):
DoVerb("Documents[0]", "GetAllText", NULL, DataOut, 0);
// Writes all text to the specified file (in C-like languages, escape the backslash: "C:\\PdfText.txt"):
DoVerb("Documents[0]", "GetAllText", "C:\PdfText.txt", NULL, 0);
// Writes all text to the stream object contained in DataIn:
DoVerb("Documents[0]", "GetAllText", DataIn, NULL, 0);
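For the attachment case, the two-step approach described earlier in the thread (open the attachment as its own file, then extract text from that file) could be sketched roughly as follows. This is only an illustration in Python, not SDK code: `extract_attachment_text` and the `get_all_text` callback are hypothetical names, and the callback stands in for whatever opens the file in the viewer and calls "GetAllText". How the attachment bytes are obtained from the parent PDF is not covered in this thread, so the sketch assumes you already have them.

```python
import os
import tempfile
from typing import Callable


def extract_attachment_text(
    attachment_bytes: bytes,
    get_all_text: Callable[[str], str],
) -> str:
    """Save an attachment to a temporary PDF file and extract its text.

    get_all_text is a hypothetical callback: it receives the path of the
    saved file and returns the extracted text (for example by opening the
    file as a separate document and calling the SDK's text extraction).
    """
    # Step 1: write the attachment out as a separate file.
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(attachment_bytes)
        # Step 2: run the text extraction on the separately loaded file.
        return get_all_text(path)
    finally:
        # Remove the temporary copy whether or not extraction succeeded.
        os.remove(path)
```

The extractor is passed in as a parameter so the same helper works with either the PDF Tools SDK or the ActiveX viewer, and the temporary file is deleted once the text has been returned.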