Getting OCR to recognize non-standard diacritics (Chinese pinyin)?
I have run OCR on some scanned Chinese books, and while it has no problem with the Chinese characters themselves, it can't encode pinyin properly. What I mean is that the document looks fine, but if you copy any pinyin and paste it elsewhere it becomes nonsense, e.g. shTyong. It seems like the tone marks are confusing the OCR, and I'm not sure how to fix this. I chose Simplified CN, Traditional CN, and English for the languages, but pinyin isn't an option, probably because it's not exactly a language. Yet you can type pinyin, so it must be possible to encode it properly.
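To illustrate the point that pinyin can be encoded properly: the tone-marked vowels are ordinary Unicode characters, either as single precomposed code points or as a base letter plus a combining mark. The sketch below is only an illustration in Python using the standard `unicodedata` module. The sample word and the `describe` helper are assumptions made for the example; they are not taken from the scanned book or from the OCR software.

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each character in text."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]

# Illustrative sample only: "pinyin" with first-tone macrons on both vowels,
# written with escapes so the precomposed form is unambiguous.
sample = "p\u012bny\u012bn"

# NFD splits each tone-marked vowel into base letter + combining mark;
# NFC joins them back into the single precomposed character.
decomposed = unicodedata.normalize("NFD", sample)
recomposed = unicodedata.normalize("NFC", decomposed)

for row in describe(decomposed):
    print(row)
```

Pasting text copied from the OCR layer into `describe` would show which code points were actually stored for the tone-marked letters, which may help when reporting the problem.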
- Dimitar - Tracker Supp
- Site Admin
- Posts: 2016
- Joined: Mon Jan 15, 2018 9:01 am
Re: Getting OCR to recognize non-standard diacritics (Chinese pinyin)?
Hello lyanna,
Welcome to our Forum.
May I ask you for a copy of one of the files you are having this issue with, as well as its converted copy?
Also, please send us a screenshot of the OCR settings you are using.
You may send us the files to our support email address: sales@pdf-xchange.com
Regards.