splitting pdf files

Post by **jusWest** » Fri Jan 28, 2022 2:25 pm

What is considered the best way to split large documents into smaller files in the PDF-XChange SDK?

Today we use op.document.extractPages for this, passing in the range of pages to write to the target PDF.

On very large files, like 1.5 GB, with a lot of images, this can take a long time.

Other SDKs I have tried are much faster at this, which makes me believe I'm doing it wrong.

Regards
Ronny

Post by **Vasyl - PDF-XChange** » Tue Feb 01, 2022 2:34 am

Hi Ronny.

This looks like it could be a document-specific issue. Could you send us a test document so we can reproduce it on our side?

Also, please post the options you used for op.document.extractPages.

Thanks.

Post by **jusWest** » Tue Feb 01, 2022 7:01 am

I can only reproduce with certain legal documents, and I have none that I can share because of legal issues.

I use the 64-bit version of the library. When I try this in a 32-bit project it runs much faster, but on a couple of my test files it then crashes in this function with an unknown error.

Memory consumption on the 64-bit version is sometimes up to 3.5 GB.

Here is the code:

        public void ExtractPages(IPXC_Document doc,
                                 string pageRange, 
                                 string outputPath, 
                                 string destDocName = "", 
                                 int commentsAction = 0,
                                 int bookmarksAction = 2,
                                 int extractPagesAction = 1,
                                 bool openFolder = false)
        {
            try
            {
                var nID = _Inst.Str2ID("op.document.extractPages", false);
                var Op = _Inst.CreateOp(nID);
                ICabNode input = Op.Params.Root["Input"];
                input.v = doc;

                // https://sdkhelp.pdf-xchange.com/view/PXV:op_document_extractPages_Options
                ICabNode options = Op.Params.Root["Options"];
                options["PagesRange.Type"].v = "Exact";
                options["PagesRange.Text"].v = pageRange;
                options["CommentsAction"].v = commentsAction;           // 0 (Copy), 1 (Flatten), 2 (DontCopy)
                options["BookmarksAction"].v = bookmarksAction;         // 0 (DontCopy), 1 (CopyAll), 2 (CopyRelated)
                options["DeletePages"].v = false;
                options["ExtractPagesAction"].v = extractPagesAction;   // 0 (AllToOneDoc), 1 (AllToOneFile), 2 (EachToFile), 3 (EachRangeToFile)
                options["OverwriteAll"].v = true;

                if (!string.IsNullOrEmpty(destDocName))
                    options["FileName"].v = destDocName;
                else
                    options["FileName"].v = "%[FileName]";

                options["LocalFolder"].v = outputPath;
                options["OpenFolder"].v = openFolder;

                Op.Do();
            }
            catch (Exception ex)
            {
                IAUX_Inst auxInst = (IAUX_Inst)_Inst.GetExtension("AUX");
                HasError = true;
                ErrorMessage = auxInst.FormatHRESULT(ex.HResult);
                var fileName = Path.GetFileName(destDocName);
                _AppLogger?.Error("PdfToolkit:ExtractPages(" + fileName + ") => " + ex.Message + ", (" + ErrorMessage + ")");
            }

        }

Wed Feb 02, 2022 12:57 am

Hi Ronny.

Try the following:
1. Set FieldsAction, BookmarksAction and CommentsAction to DontCopy. This is just an experiment to see whether it affects performance; please let us know if it has any effect.
2. If you run the extractPages operation multiple times on the same document with different page ranges, you can change your code to run it once per document, either by passing a compound page range such as 1-5, 10-20, ..., or by using the "PagesRange.Array" option.
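
The two suggestions above could be tried with something like the following minimal sketch. `ExtractPagesHelpers`, `BuildPageRange` and `Time` are hypothetical names, not part of the PDF-XChange SDK. The sketch assumes page numbers in "PagesRange.Text" are 1-based and inclusive, as in the 1-5, 10-20 example, and the numeric option values in the usage comment are taken from the comments in the code posted earlier in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helpers; nothing in this class calls the SDK.
public static class ExtractPagesHelpers
{
    // Joins (first, last) pairs into one compound range string,
    // e.g. "1-5, 10-20", so the operation can run once per document.
    // Assumption: page numbers are 1-based and inclusive.
    public static string BuildPageRange(IEnumerable<(int First, int Last)> ranges)
    {
        var parts = new List<string>();
        foreach (var (first, last) in ranges)
        {
            if (first < 1 || last < first)
                throw new ArgumentException($"Invalid page range {first}-{last}");
            parts.Add($"{first}-{last}");
        }
        return string.Join(", ", parts);
    }

    // Runs an action and returns the elapsed wall-clock time, so runs with
    // different option combinations can be compared.
    public static TimeSpan Time(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}

// Usage with the ExtractPages method from the earlier post (not defined here):
//
// var range = ExtractPagesHelpers.BuildPageRange(new[] { (1, 5), (10, 20) });
// var elapsed = ExtractPagesHelpers.Time(() =>
//     ExtractPages(doc, range, outputPath,
//                  commentsAction: 2,        // DontCopy
//                  bookmarksAction: 0,       // DontCopy
//                  extractPagesAction: 3));  // EachRangeToFile
// Console.WriteLine($"extractPages took {elapsed.TotalSeconds:F1} s");
```

The posted method does not expose FieldsAction, so that option would have to be set on the `options` node separately. Whether EachRangeToFile is the right ExtractPagesAction depends on how the output files should be split.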

Some additional questions:
1. How many pages are in your document?
2. Does it have bookmarks, comments, links, form fields or named destinations?
3. Do you extract all pages at once, or multiple page ranges over multiple runs?

Cheers.
