

PDF files are still incredibly common on the internet. There might be scenarios where you have to download a long list of PDF files from a website. If the number of files is large enough, you might be interested in automating the process.

Today, we will use a free web scraper to scrape a list of PDF files from a website and download them all to your drive.

Scraping a list of PDF Files

First, we'll need to set up our web scraping project. For this, we will use ParseHub, a free and powerful web scraper that can scrape any website.

For this example, we will use the Province of British Columbia's webpage for "Supreme Court Civil Rules Forms". This page has a large list of links to PDF files. We will use our scraper to extract the links to all these files and download them to our Dropbox account.

Setting up Dropbox integration

For this example, we will download all files to our Dropbox account. So let's set up our Dropbox integration in ParseHub.

NOTE: You can also download files via Amazon S3, or just download the list of links and use any download manager you like.

1. After downloading and installing ParseHub, boot it up and make sure you are logged in to your ParseHub account.
2. Click on the Integrations tab on the left sidebar.
3. You will be asked to log in to Dropbox.
4. Your integration will now be enabled in ParseHub.

Web scraping PDF files

1. Now, go back to ParseHub's main menu and click on "New Project".
2. Enter the URL for the page you want to scrape.
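
If you would rather not use Dropbox or Amazon S3, the note about downloading the list of links points to another route: export the links and fetch the files yourself. The sketch below is one possible way to do that in Python, not part of ParseHub. It assumes the links have been saved to a plain text file with one URL per line; the function names (`filename_from_url`, `read_links`, `download_all`) and the file names mentioned afterwards are hypothetical and chosen only for illustration.

```python
import os
import urllib.request
from urllib.parse import urlparse, unquote


def filename_from_url(url, fallback="download.pdf"):
    """Derive a local file name from the last path segment of a URL."""
    name = os.path.basename(unquote(urlparse(url).path))
    return name or fallback


def read_links(path):
    """Read one URL per line, skipping blank lines and duplicates."""
    seen, links = set(), []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            url = line.strip()
            if url and url not in seen:
                seen.add(url)
                links.append(url)
    return links


def download_all(links, dest_dir, fetch=urllib.request.urlretrieve):
    """Download every link into dest_dir and return the saved paths.

    `fetch` is any callable taking (url, target_path); it defaults to
    the standard library's urlretrieve and can be swapped out for testing.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in links:
        target = os.path.join(dest_dir, filename_from_url(url))
        fetch(url, target)
        saved.append(target)
    return saved
```

Assuming the exported links live in a file called `links.txt`, calling `download_all(read_links("links.txt"), "pdfs")` would save each file into a `pdfs` folder. Two links that end in the same file name would overwrite each other, so a longer list may need a renaming step.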
