Text in created Bookmark all greek to me in some files

p.kazazis
User
Posts: 10
Joined: Mon May 30, 2011 11:15 am

Text in created Bookmark all greek to me in some files

Post by p.kazazis »

Hello, and have a nice day.

I was wondering whether there is anything a user can do when, in some PDF files, selecting text and pressing CTRL+SHIFT+B creates a new bookmark whose text has nothing to do with what was selected.

It resembles a web page that loads with the wrong encoding, where you have to change the encoding to see the text correctly.

Thanks in advance.
Paul - PDF-XChange
Site Admin
Posts: 7443
Joined: Wed Mar 25, 2009 10:37 pm

Re: Text in created Bookmark all greek to me in some files

Post by Paul - PDF-XChange »

Hi p.kazazis,

That does sound like an issue with the font not being used correctly. We can't say much without seeing the file. Can you zip up and attach a sample? It must be zipped or the forum software will block it.

Thanks
Best regards

Paul O'Rorke
PDF-XChange Support
http://www.pdf-xchange.com
p.kazazis
User
Posts: 10
Joined: Mon May 30, 2011 11:15 am

Re: Text in created Bookmark all greek to me in some files

Post by p.kazazis »

The following attachment is a sample pdf file that does that.
2002 Ν.3028 ΦΕΚ Α 153 Προστασία Αρχαιοτήτων + εν γένει Πολιτιστικής Κληρονομιάς.zip
The language in the file is Greek.

The bookmark name created by the CTRL+SHIFT+B key combination after selecting the text is totally different from the selected text.
Lzcat - Tracker Supp
Site Admin
Posts: 677
Joined: Thu Jun 28, 2007 8:42 am

Re: Text in created Bookmark all greek to me in some files

Post by Lzcat - Tracker Supp »

Your file uses embedded font(s) with a built-in encoding and does not provide any information on how to translate character codes to Unicode. This is the reason why we cannot extract the correct (Greek) text. Adobe Acrobat also cannot extract the text correctly.
Possible solution: recreate the document with the required information (embed ToUnicode tables or use standard encodings).
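
To illustrate why a missing ToUnicode table produces unrelated text, here is a minimal sketch in Python. It is not how the Viewer is implemented: the table contents, the sample codes, and the Latin-1 fallback are all assumptions made for the example. The idea is only that the page stores character codes, the embedded font knows which glyph to draw for each code, and an extractor needs a separate code-to-Unicode map to turn those codes back into text.

```python
from typing import Dict, Optional

# Hypothetical ToUnicode table for an embedded font whose built-in encoding
# reuses the codes of Latin letters for Greek glyphs (assumed for illustration).
TO_UNICODE: Dict[int, str] = {
    0x41: "\u039d",  # GREEK CAPITAL LETTER NU
    0x42: "\u03cc",  # GREEK SMALL LETTER OMICRON WITH TONOS
    0x43: "\u03bc",  # GREEK SMALL LETTER MU
    0x44: "\u03bf",  # GREEK SMALL LETTER OMICRON
    0x45: "\u03c2",  # GREEK SMALL LETTER FINAL SIGMA
}


def extract_text(codes: bytes, to_unicode: Optional[Dict[int, str]]) -> str:
    """Translate character codes from a page into text."""
    if to_unicode is None:
        # No mapping in the file: the extractor can only guess. Reading each
        # code as Latin-1 is an assumed fallback, chosen for this sketch.
        return codes.decode("latin-1")
    # With a table, every code is looked up; unknown codes become U+FFFD.
    return "".join(to_unicode.get(code, "\ufffd") for code in codes)


# The same five codes, extracted with and without the table.
page_codes = bytes([0x41, 0x42, 0x43, 0x44, 0x45])
with_table = extract_text(page_codes, TO_UNICODE)
without_table = extract_text(page_codes, None)

print(with_table)
print(without_table)
```

On screen both cases look identical, because drawing uses the glyphs inside the font. Only the extracted text differs, and that extracted text is what ends up in the bookmark name.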
HTH.
Victor
Tracker Software
Project manager

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
Stefan - PDF-XChange
Site Admin
Posts: 19930
Joined: Mon Jan 12, 2009 8:07 am

Re: Text in created Bookmark all greek to me in some files

Post by Stefan - PDF-XChange »

Hello p.kazazis,

It appears that the problem is with the PDF document itself.
Try selecting and copying text from it to, say, Notepad: the result is the same as the "text" added when you try to create a bookmark. The same behaviour is observed in Adobe's Reader, so the problem is not in our Viewer.

There are files like this that display properly but lack the additional information needed for proper text extraction.

When creating such files, please make sure to enable "Embed extended font/character info" and you will then have no problems like the one you encountered.
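
For anyone who wants a rough check of whether a file carries that extra information, the sketch below counts font dictionaries and /ToUnicode entries in the raw bytes of a PDF. It is only a heuristic under stated assumptions: the function name and the sample bytes are made up for the example, files that keep their objects in compressed object streams will hide these keys from a plain byte scan, and a font that uses a standard encoding may extract correctly without any /ToUnicode entry.

```python
import re

# "/Type /Font" but not "/Type /FontDescriptor" and similar longer names.
FONT_DICT = re.compile(rb"/Type\s*/Font(?![A-Za-z])")
# The key that points a font at its code-to-Unicode mapping.
TO_UNICODE_KEY = re.compile(rb"/ToUnicode\b")


def font_mapping_report(pdf_bytes: bytes) -> dict:
    """Count font dictionaries and /ToUnicode keys visible in raw PDF bytes."""
    fonts = len(FONT_DICT.findall(pdf_bytes))
    maps = len(TO_UNICODE_KEY.findall(pdf_bytes))
    return {
        "fonts": fonts,
        "to_unicode": maps,
        # More fonts than mappings is a hint, not proof, of extraction trouble.
        "suspect": fonts > maps,
    }


# Made-up font dictionaries standing in for the contents of a real file.
without_map = b"<< /Type /Font /Subtype /TrueType /FontDescriptor 9 0 R >>"
with_map = b"<< /Type /Font /Subtype /TrueType /ToUnicode 12 0 R >>"

print(font_mapping_report(without_map))
print(font_mapping_report(with_map))

# For a real file (the path is hypothetical):
#   with open("sample.pdf", "rb") as f:
#       print(font_mapping_report(f.read()))
```

A "suspect" result only means the file is worth testing by copying some text into Notepad, as described above.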

Best,
Stefan
p.kazazis
User
Posts: 10
Joined: Mon May 30, 2011 11:15 am

Re: Text in created Bookmark all greek to me in some files

Post by p.kazazis »

The file actually wasn't created by me; it was downloaded from an external source.

Is there anything that someone who opens such a file can do, when they have only the PDF itself and not the source content it was made from?

Thank you for your answers.
Stefan - PDF-XChange
Site Admin
Posts: 19930
Joined: Mon Jan 12, 2009 8:07 am

Re: Text in created Bookmark all greek to me in some files

Post by Stefan - PDF-XChange »

Hello p.kazazis,

I am afraid that you will need OCR-capable software in order to extract the "text" from this document as proper machine-recognizable and selectable text. Otherwise, I cannot think of any other way to grab the text correctly.

Best,
Stefan