OCR Options


stefanya42
User
Posts: 2
Joined: Fri Jul 04, 2025 4:37 pm

OCR Options

Post by stefanya42 »

I've just started trying out PDF-XE's OCR capabilities, and I'm impressed with the accuracy of the text generation.

I was previously familiar with the ABBYY FineReader OCR application and appreciate its features, including:
  • allowing me to specify which area(s) of a page I want converted to text (e.g. to omit titles, page numbers, and text in embedded images)
  • clarifying unusual column layouts or callouts
  • special handling for tabular data
  • specifying the order in which regions are handled
Does PDF-XE have any sort of "OCR expert mode" that offers more control over the OCR process like that?
Daniel - PDF-XChange
Site Admin
Posts: 11345
Joined: Wed Jan 03, 2018 6:52 pm

Re: OCR Options

Post by Daniel - PDF-XChange »

Hello, stefanya42

For your first and fourth items, the "select region" tool (on the Home tab, under the "select" dropdown) would be the closest option. You can use it to draw a region on the page, then right-click and choose "OCR region" as needed.

For your third item, I assume by tabular you mean "fillable form fields"? In that case, you would be looking for the "Identify forms" tool, located on the Form tab.
If instead you mean table formatting, then the OCR pages tool offers an option to "draw lines for tables" (which should be enabled by default). You can use this in conjunction with an option in the Editor's "Preferences" menu (Ctrl+K), under the page text category, to "detect tables in text", which can make later editing of those items much more familiar. (This second option is disabled by default, as it can cause some issues in other areas of editing, so you may need to toggle it on and off for different documents.)

Finally, your second item. No OCR option we currently offer handles column layouts. Technically speaking, OCR has no bearing on column layout, as all it does is convert text-like content into actual text, where it exists on the page. However...
The aforementioned "table" detection settings should have an influence on this, but PDF is very different from flow-based formats like Word. Specifically, as PDF is coordinate based, there is no concept of a paragraph, nor any way to connect two blocks of text permanently. Each line of text, sometimes each word, or even individual letters, are entirely separate objects, and we have to emulate a "flowing" editing mode.
The only way to offer such functions is with automatic "assumptions" made based on page layout, which are very hard to develop. We are working on such a "visual layout" editing mode, which may later on offer some of these functions, but that is likely a very long-term implementation, and not something I can promise is coming any time soon.
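
To illustrate why this is hard, here is a rough Python sketch of the kind of layout "assumption" described above. It is not how the Editor works internally and uses no PDF-XChange API; the Fragment class, the group_into_lines and reading_order functions, and the column_split parameter are all hypothetical names made up for this example. It only shows that text stored as separate positioned objects has to be regrouped by coordinates, and that the result depends on a guess about where the columns are.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical positioned text run; (x, y) is the left end of its baseline."""
    text: str
    x: float
    y: float  # PDF user space: y grows upward from the bottom of the page


def group_into_lines(fragments, y_tolerance=2.0):
    """Cluster fragments whose baselines lie within y_tolerance of each other."""
    lines = []
    for frag in sorted(fragments, key=lambda f: -f.y):  # top of the page first
        if lines and abs(lines[-1][0].y - frag.y) <= y_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    # Within each line, order the fragments left to right.
    return [sorted(line, key=lambda f: f.x) for line in lines]


def reading_order(fragments, column_split=None, y_tolerance=2.0):
    """Return text lines; with column_split set, read the left column before the right."""
    if column_split is None:
        columns = [fragments]
    else:
        # Assumption: a single vertical boundary separates exactly two columns.
        columns = [
            [f for f in fragments if f.x < column_split],
            [f for f in fragments if f.x >= column_split],
        ]
    result = []
    for column in columns:
        for line in group_into_lines(column, y_tolerance):
            result.append(" ".join(f.text for f in line))
    return result


# Two columns of two lines each, stored as eight unrelated objects.
fragments = [
    Fragment("Left", 50, 700), Fragment("one", 80, 700),
    Fragment("Right", 300, 700), Fragment("one", 335, 700),
    Fragment("Left", 50, 686), Fragment("two", 80, 686),
    Fragment("Right", 300, 686), Fragment("two", 335, 686),
]

print(reading_order(fragments))                    # no column assumption
print(reading_order(fragments, column_split=200))  # assumes a boundary at x = 200
```

Without a column boundary, the two columns get merged line by line; with one, each column is read in full. The hard part in a real document is choosing that boundary (and the tolerance) automatically, which is the "assumption" problem mentioned above.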

Kind regards,
Dan McIntyre - Support Technician
PDF-XChange Co. LTD

+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
stefanya42
User
Posts: 2
Joined: Fri Jul 04, 2025 4:37 pm

Re: OCR Options

Post by stefanya42 »

Hello Dan:

Thank you for the detailed reply. It sounds like several of my issues are indeed covered by PDF-XE's OCR, which I'm delighted about. I'll try your suggestions!

Regards,

Stef
Daniel - PDF-XChange
Site Admin
Posts: 11345
Joined: Wed Jan 03, 2018 6:52 pm

OCR Options

Post by Daniel - PDF-XChange »

:)