OCR with translation

Post by **sunshineweb** » Thu Jan 02, 2025 6:54 pm

Maybe this option already exists, but it would definitely be helpful:
Recognized text should be translatable into one's own language. This would be particularly useful for me when dealing with medical reports in foreign languages. It would probably require integrating an online service. I find the translations from ChatGPT to be useful (though not perfect).

Thu Jan 02, 2025 8:34 pm

Hello, sunshineweb

Sadly, this is highly complex, especially once you factor in that OCR results are not perfect. For example, suppose OCR recognized a phrase as:
"Danie1 W0rks in 5upport"
Translating that will often not give ideal results, even after allowing for any AI interpretation of the text (see the suggestion Google's AI made for me):

image.png
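To make the problem concrete, here is a minimal sketch of the kind of clean-up a translation step would need before it could trust OCR output. It is not how any product handles this; the confusion table and the function names `normalize_token` and `normalize_text` are hypothetical and chosen only for illustration.

```python
# Hypothetical table of digits that OCR commonly emits in place of letters.
DIGIT_TO_LETTER = {"0": "o", "1": "l", "5": "s"}


def normalize_token(token: str) -> str:
    """Swap look-alike digits for letters, but only in tokens that contain letters."""
    if not any(ch.isalpha() for ch in token):
        return token  # pure numbers such as "2025" are left untouched
    out = []
    for i, ch in enumerate(token):
        repl = DIGIT_TO_LETTER.get(ch)
        if repl is None:
            out.append(ch)
        else:
            # Assumption: a substituted first character should be capitalised.
            out.append(repl.upper() if i == 0 else repl)
    return "".join(out)


def normalize_text(text: str) -> str:
    """Apply the token-level clean-up to whitespace-separated words."""
    return " ".join(normalize_token(t) for t in text.split())


print(normalize_text("Danie1 W0rks in 5upport"))
```

This repairs the example phrase, but the same rule turns a legitimate token such as "B12" into "Bl2", which is exactly the sort of silent damage that would be unacceptable in something like a medical report.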

If you would like a more in-depth explanation of the complications, read on. The rest of this post simply expands on the above for anyone who is curious.
While it would be a handy feature to offer, such handling would be expensive, whether it runs locally or requires us to maintain an ongoing agreement with a third party for the translations, the AI-managed suggestions, or both. Either route would likely mean a product price increase for all users. Many people are also quite averse to AI integration these days, which complicates the implementation further: we would need to ensure that any such features are fully optional and can be disabled entirely by users who do not wish to grant AI access to their content.

We would also need to account for page positioning, as many languages are considerably more (or less) verbose in their sentence structure. Converting multiple paragraphs of text could leave one page with half its original content, while another expands so much that its paragraphs overlap one another. In PDF, there is no association between different paragraph/text "blocks" (in fact, there is not even a relation between two separate lines of text in a single paragraph, or sometimes between individual letters in a single line), so determining where and when to wrap text, and how that affects other nearby content, is a very tall order.
Similarly, such changes may sometimes need to overflow text onto the next page (or remove pages), which is nearly impossible to detect and handle gracefully. Then there are items like links, highlights, bookmarks and other "text associated" objects. Sadly, in PDF these are entirely disconnected from one another and from the "text" they overlap or point to, so they cannot reliably be updated to match when changes are made.
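As a rough illustration of the overflow problem, the sketch below estimates whether a replacement string still fits the rectangle the original text occupied. It assumes a paragraph box is already known, which PDF itself does not provide, and it uses an average character width rather than real font metrics. The function name `fits_in_box` and all of its parameters are hypothetical.

```python
import textwrap


def fits_in_box(text, box_width_pt, box_height_pt,
                font_size_pt=10.0, avg_char_width_ratio=0.5,
                line_height_ratio=1.2):
    """Return (fits, line_count) for text wrapped into a box measured in points."""
    # Crude estimate: every glyph is avg_char_width_ratio * font size wide.
    chars_per_line = max(1, int(box_width_pt / (font_size_pt * avg_char_width_ratio)))
    lines = textwrap.wrap(text, width=chars_per_line)
    needed_height = len(lines) * font_size_pt * line_height_ratio
    return needed_height <= box_height_pt, len(lines)


original = "The patient is well."
translated = "Der Patient befindet sich in gutem Zustand."

print(fits_in_box(original, box_width_pt=100, box_height_pt=15))
print(fits_in_box(translated, box_width_pt=100, box_height_pt=15))
```

Even this simplified check shows the translated sentence needing more lines than the box allows. Deciding what to do next (shrink the font, push neighbouring content down, or spill onto another page) is the part that has no reliable answer without knowing how the blocks relate to each other.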

It may be something we consider as an option in the future, once the respective technologies are capable enough to be worth implementing and we can ensure that all of the prerequisite features are in place for these conversions to be handled properly. At the moment, though, there are too many prohibitive impacts, potential issues, and costs involved for us to promise it anytime soon.

Kind regards,
