How to extract URLs in PDF?

Forum for the PDF-XChange Editor - Free and Licensed Versions

netsonicyxf
User
Posts: 98
Joined: Sun Jul 01, 2012 2:19 pm

How to extract URLs in PDF?

Post by netsonicyxf »

The PDF is attached.
2021-RADBUG-Virtual-Meeting.pdf
rakunavi
User
Posts: 1823
Joined: Sat Sep 11, 2021 5:04 am

Re: How to extract URLs in PDF?

Post by rakunavi »

Hi netsonicyxf,

If the document is not secured, all link information can be exported as an XFDF file using "Export Links..." in the Options menu of the Links pane. A feature request has already been filed, but for now the only way is to export to XFDF and then process that file with whatever tool you prefer.

Best regards,
rakunavi
TOP desires for PDFXCE
forum.pdf-xchange.com/viewtopic.php?t=39665 LassoTool
forum.pdf-xchange.com/viewtopic.php?t=38554 CmtGarbled
forum.pdf-xchange.com/viewtopic.php?t=37353 FulScrMultiMon
forum.pdf-xchange.com/viewtopic.php?t=41002 DisableTouchSelect

Re: How to extract URLs in PDF?

Post by netsonicyxf »

Hi rakunavi,

Thank you. But how do I get all these links as text?

Best regards
Stefan - PDF-XChange
Site Admin
Posts: 19913
Joined: Mon Jan 12, 2009 8:07 am

Re: How to extract URLs in PDF?

Post by Stefan - PDF-XChange »

Hello netsonicyxf,

You cannot currently do that. Once the feature request from the other topic has been implemented and the feature added to the Editor, you will be able to export to file formats other than .fdf/.xfdf.

Kind regards,
Stefan

Re: How to extract URLs in PDF?

Post by rakunavi »

Hi netsonicyxf,

With all due respect, I have no way of knowing which part you don't understand, so it is difficult for me to comment. Even if that were clear, it might be difficult to give hands-on advice in this forum on what to do after exporting to XFDF, since the forum rules prohibit comments that are not directly related to Tracker Software's products.

If you don't know how to export to XFDF, you will have to consult the PDF-XChange Editor manual.

If you don't know how to handle text files, you will have to learn the basics of Windows.

If you don't know how to use regular expressions, you can learn from this excellent site: https://regex101.com/. It displays the results in real time, which lets you build up your understanding of regular expressions efficiently.

Of course, you should also read the basic documentation, such as the references cited in the PDF-XChange Editor Help. There are many excellent resources on the web about regular expressions.

Based on the format of URLs included in XFDF, the following pattern should be sufficient.

https?:[^"]+
The PDF-XChange Editor itself has some settings that accept regular expressions, such as replacing bookmark titles, so they are worth knowing.

Normally, you would need to install a grep tool in addition to the above, but even if you don't go that far, the above site (https://regex101.com/) has a function for copying the text matched by a regular expression, so that may be all you need.
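
For anyone who prefers a script to a grep tool, the same pattern can be applied with a few lines of Python. This is only a minimal sketch under some assumptions: the function names and the file name are hypothetical, the exported XFDF is assumed to be UTF-8 text, and, because XFDF is XML, "&" inside a URL is assumed to be stored as "&amp;" and is decoded back.

```python
import html
import re
from pathlib import Path

# Pattern quoted from the post above: "http:" or "https:" followed by
# everything up to (but not including) the next double quote.
URL_PATTERN = re.compile(r'https?:[^"]+')


def extract_urls(xfdf_text):
    """Return the unique URLs found in the text, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(xfdf_text):
        # Assumption: XML escaping such as "&amp;" should be turned back into "&".
        url = html.unescape(match)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def extract_urls_from_file(path):
    """Read an exported XFDF file (assumed UTF-8) and return its URLs."""
    return extract_urls(Path(path).read_text(encoding="utf-8"))
```

Calling something like print("\n".join(extract_urls_from_file("links.xfdf"))) would then list one URL per line, where "links.xfdf" stands for whatever name the export was saved under. Note that the pattern matches any quoted http(s) string, so the XML namespace declared on the XFDF root element may show up in the output and need to be discarded.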

I hope the above information is of some help to you.

Best regards,
rakunavi

Re: How to extract URLs in PDF?

Post by Stefan - PDF-XChange »

Hello rakunavi,

Many thanks for the above!
I hope @netsonicyxf will put the advice to good use!

Kind regards,
Stefan