
Very slow processing speed

Posted: Wed Jun 26, 2024 3:38 pm
by Loki@99
Hi,

For some reason, OCR processing is very slow with this file: 40 minutes 15 seconds on my device, with only PDFXCE running and nothing in the background.
File sample_Slow OCR.pdf

Device specification
- PDFXCE 10.3.1 build 387
- CPU : Intel Core i5-1130G7
  • 4 Cores/8 Threads
  • Base frequency : 1.80 GHz / Turbo frequency : 4.00 GHz
- RAM : 16 GB DDR4 at 3733 MHz
- SSD NVME Gen 3

I'm aware that OCR can be a heavy task and that my device isn't a high-end one, but it doesn't take nearly this long when processing other PDFs with a very similar page layout.

PDFXCE OCR settings
OCR engine : Enhanced (FineReader)
image.png

I wonder if there is room for improvement on your side.

Thanks for investigating,

Re: Very slow processing speed

Posted: Wed Jun 26, 2024 10:47 pm
by Daniel - PDF-XChange
Hello, Loki@99

This is a very heavy document, with some fuzziness and a fair number of blemishes, and it is entirely image-based (thankfully the images are not particularly high resolution). All of these factors require extra processing for the OCR to handle them properly, so in cases like this OCR can take considerably longer.
I will pass this along to the Dev team, but I expect this is one of those cases where the next release will not bring a big overnight improvement; instead, processing speed will improve gradually over time.

I will say though, that selecting "high" accuracy is not the correct choice in this case.
Auto should almost always be used, since it determines the correct mode on a region-by-region basis. Note that the accuracy setting describes the quality of the document, not of the OCR being performed. Since this is a fuzzy file, if you are controlling it manually you should actually be using medium or even low accuracy. This avoids over-processing and over-analyzing the page content, which results in slower speeds and more "extraneous" characters in improper locations.
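
To make that rule of thumb concrete, here is a minimal sketch in Python. It is purely illustrative: `SourceQuality` and `suggest_accuracy` are hypothetical names, not part of any PDF-XChange API, and mapping a clean scan to "High" is an assumption inferred from the point that the setting describes the document's quality.

```python
from enum import Enum


class SourceQuality(Enum):
    """Hypothetical labels for how clean the scanned pages look."""
    CLEAN = "clean"
    SLIGHTLY_FUZZY = "slightly fuzzy"
    VERY_FUZZY = "very fuzzy"


def suggest_accuracy(quality: SourceQuality, manual: bool = False) -> str:
    """Return the accuracy setting suggested by the rule of thumb above."""
    if not manual:
        # Auto chooses the mode region by region, so it is the default choice.
        return "Auto"
    # Manual control: match the setting to the quality of the source pages.
    # CLEAN -> "High" is an assumption; the fuzzy cases follow the advice above.
    return {
        SourceQuality.CLEAN: "High",
        SourceQuality.SLIGHTLY_FUZZY: "Medium",
        SourceQuality.VERY_FUZZY: "Low",
    }[quality]
```

For a file like this one, `suggest_accuracy(SourceQuality.VERY_FUZZY, manual=True)` suggests "Low", and leaving `manual` at its default always suggests "Auto".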

Kind regards,