Tutorial

This tutorial covers a common data extraction scenario: downloading a product catalog from an online shopping site. It explains how to submit web forms, work with search results and details pages, download images, and extract text fragments from raw HTML. You will also learn how to schedule a project for automatic daily execution, emulate mouse actions, and get data from a pop-up panel.

As an example, we are going to collect a multi-page product catalog from the Amazon web site. Besides the product description from the details page, we will collect high-resolution product images. As is often the case with online stores, each product on Amazon can have a variable number of images, and we will collect them all. The same approach can be used with eBay and many other online shopping sites or real estate listings.
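To make the "variable number of images" point concrete, here is a minimal Python sketch of the idea outside of Data Toolbar. It is not how the program works internally. It only shows that collecting every image from a details page means gathering a list of unknown length rather than a fixed set of fields. The sample HTML, the example.com URLs, and the names ImageCollector and collect_image_urls are hypothetical.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # Skip images without a src and images already seen.
        if src and src not in self.image_urls:
            self.image_urls.append(src)


def collect_image_urls(html):
    """Return the unique image URLs found in an HTML string."""
    parser = ImageCollector()
    parser.feed(html)
    parser.close()
    return parser.image_urls


# Hypothetical details-page fragment; real product pages will differ.
SAMPLE_PAGE = """
<div class="gallery">
  <img src="https://example.com/img/p1-front.jpg">
  <img src="https://example.com/img/p1-back.jpg">
  <img src="https://example.com/img/p1-front.jpg">
</div>
"""

print(collect_image_urls(SAMPLE_PAGE))
```

The function returns a list, so a product with one image and a product with ten are handled the same way.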

The steps of this tutorial are covered by the video. Before starting your first project, take a look at the web scraping project structure and its basic elements.

Find the Data Toolbar shortcut on your desktop or in the Start menu and run the program. After the program starts, you will see a browser window with the Data Tool button in the top left corner. By default, the program opens the Chrome browser, but you can switch to Firefox using the dropdown menu. If you do not have Chrome installed, the program will use Firefox. Chrome is more stable and faster in automation mode than Firefox, which is why we recommend Chrome for long-running projects. If you use Chrome 57 or later, you will see a message saying that Chrome is being controlled by automated test software.

The browser runs in the “private browsing” mode and uses a separate testing profile that does not interfere with the browser settings and history. The browsers do not load any installed extensions. The testing profile is recreated when the browser restarts. After the restart all saved cookies are deleted. You can optionally change project properties to keep cookies with the project.
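For readers curious what a throwaway testing profile looks like in practice, the following Python sketch illustrates the general idea. It is an assumption-laden illustration, not Data Toolbar's actual implementation. It creates a temporary profile directory, builds a list of Chrome command-line switches (--incognito, --user-data-dir, --disable-extensions), and deletes the directory afterwards, which also removes any cookies saved in it. The helper names fresh_profile and build_chrome_args are hypothetical, and the sketch does not launch a browser.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def fresh_profile():
    """Yield a new, empty profile directory and delete it on exit."""
    with tempfile.TemporaryDirectory(prefix="testing-profile-") as profile_dir:
        yield profile_dir


def build_chrome_args(profile_dir):
    """Build Chrome switches for an isolated, extension-free session.

    Assumption: these switches approximate the behavior described above;
    the switches that Data Toolbar really passes are not documented here.
    """
    return [
        "--incognito",                      # private browsing mode
        f"--user-data-dir={profile_dir}",   # separate profile, not the user's own
        "--disable-extensions",             # do not load installed extensions
    ]


with fresh_profile() as profile:
    print(build_chrome_args(profile))
    print(os.path.isdir(profile))   # the profile exists during the session

print(os.path.exists(profile))      # and is gone once the session ends
```

Because the directory is recreated on every run, nothing stored in one session (cookies, history, cache) carries over to the next unless it is deliberately saved elsewhere.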

Chrome browser automation requires chromedriver.exe, a console application that acts as the Chrome automation server. It may trigger a firewall warning on the first run. You can ignore the warning.

If you do not see the DataTool extension icon, click the Extensions icon in the top right corner of the Chrome browser and pin the Data Toolbar extension to the toolbar.

Get a Free Web Scraping Tool Now

Get a free version of Data Toolbar. The free version has the same functionality as the full version but its output is limited to 100 rows. There is no expiration date. No registration. No ads. See how easy it is for yourself today.


The latest production build was released on 2020-03-04. Version 3.4 supports background data scraping that does not disrupt other applications. Update your program for free if you own any of its previous versions. Check release history here.

Version 4.0, which is to be released in 2020, will significantly improve the performance and flexibility of the DataTool by using a new data extraction engine based on CSS selectors.