Trying to fix a PDF OCR encoding issue
Posted: Mon Aug 05, 2024 5:38 pm
Hi,
I'm trying to fix the OCR text layer of this PDF file, which likely has an encoding issue (I don't know how the original OCR was performed).
As a fix, I tried to Rasterize the file (so that I can run OCR on it with PDFXCE afterwards), using the following settings to keep good quality:
- Compression: JPEG
- JPEG Quality: High
- Resolution: 300 DPI
The issue is that after the Rasterize action, the file became very large (537 MB).
I'm aware that this is expected, since the PDF now consists exclusively of high-resolution images.
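For a rough sense of scale, the arithmetic behind that growth can be sketched as follows. This is only a back-of-the-envelope estimate: the US Letter page size and 24-bit color depth are assumptions, not values taken from the actual file, and `raw_page_bytes` is a hypothetical helper name. It computes the uncompressed pixel data per page before JPEG compression is applied.

```python
def raw_page_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Uncompressed size in bytes of one rasterized page.

    Assumes 24-bit RGB (3 bytes per pixel) unless told otherwise.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed page size: US Letter (8.5 x 11 inches); the real document may differ.
at_300 = raw_page_bytes(8.5, 11.0, 300)
at_150 = raw_page_bytes(8.5, 11.0, 150)

print(f"300 DPI: {at_300 / 1e6:.1f} MB of raw pixels per page")
print(f"150 DPI: {at_150 / 1e6:.1f} MB of raw pixels per page")
print(f"ratio: {at_300 / at_150:.0f}x")
```

Pixel count scales with the square of the resolution, so halving the DPI cuts the raw data to a quarter. The final file size also depends on how well JPEG compresses each page, which this sketch does not model.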
I tried to Save as Optimized with the "Standard" profile, but unfortunately the file is still much larger (169 MB) than the original (58 MB), and the quality has slightly deteriorated.
Maybe you have a solution for my issue?
Thanks for investigating,