Scrapers

What is a scraper?

A website scraper, also called a web scraper, is a tool that visits web pages and retrieves relevant data from them. You can use this data to train your chatbot on the content you want it to know.
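To give a rough idea of what "retrieving data from a page" means, here is a minimal sketch in Python that pulls the readable text out of a piece of HTML. It is only an illustration and not how the scraper in the dashboard is implemented. The class `TextExtractor`, the function `extract_text`, and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the text of an HTML document, one text fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical sample page, used instead of a real download.
sample_html = (
    "<html><head><title>FAQ</title><style>p{color:red}</style></head>"
    "<body><h1>How does this work?</h1><p>You create a scraper.</p></body></html>"
)

page_text = extract_text(sample_html)
print(page_text)
```

A real scraper would also download the pages and follow links. This sketch leaves that out and shows only the text extraction step.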

Create

On the “Scrapers” page, click the “New Scraper” button at the top right (or in the middle of the page if no scraper exists yet). Enter a display name, which is only visible in the dashboard, and fill in the Scrape URL and the URL match.

The Scrape URL is the base URL of the pages you want to scrape, for example a link to your FAQ or your documentation. The URL match restricts the scraper to pages whose URL contains the URL match text. For example, if you use “example.nl/faq” as the Scrape URL and enter faq as the URL match, the scraper only looks at URLs that contain the text faq. This includes subpages such as “example.nl/faq/hoe-werkt-dit” and “example.nl/faq/wat-is-dit”. As long as those pages exist and are on your website, they will be retrieved. If your FAQ has no subpages, only the page “example.nl/faq” is retrieved.
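The URL match rule described above can be sketched as a simple "contains" check. The Python snippet below is an illustration only. The function name `matches_url`, the list of candidate URLs, and the page “example.nl/contact” are hypothetical, and the sketch assumes the comparison is case-sensitive, which may differ from how the scraper actually behaves.

```python
def matches_url(url: str, url_match: str) -> bool:
    """Return True if the URL contains the URL match text.

    Assumption: plain, case-sensitive substring matching.
    """
    return url_match in url


# Hypothetical pages found on the website.
candidate_urls = [
    "example.nl/faq",
    "example.nl/faq/hoe-werkt-dit",
    "example.nl/faq/wat-is-dit",
    "example.nl/contact",
]

# Keep only the pages whose URL contains the URL match "faq".
selected = [url for url in candidate_urls if matches_url(url, "faq")]
print(selected)
```

In this sketch, the three FAQ pages are kept and “example.nl/contact” is skipped because its URL does not contain faq.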

You can also have your entire website scraped, but we do not always recommend this, because some pages of your site may contain information that is not useful for the chatbot to know.

Scrape mainly informative pages such as FAQs, manuals, or documentation.

Scraping

A newly created scraper starts scraping automatically. You can also start a scrape manually: on the scrapers overview page, click the three dots on the right side of a scraper and click “Scrape”. Then click the “Start Scraping” button.

After scraping, a new file is generated. You can then use this file for all your chatbots.

Edit

To edit a scraper, click the three dots on the right side of the scraper on the scrapers overview page and click “Edit”.

Delete

To delete a scraper, click the three dots on the right side of the scraper on the scrapers overview page and click “Delete”.