Worker Type

This article explains how the worker types differ in behavior and how to choose the right one for your project.


What is the difference between Browser Worker and Code Worker?

  • Browser Workers:
    • Simulate a user's interaction with the website through a headless browser
    • Cost more to use, in terms of CPM (cost per thousand page loads)
    • Handle complex scraping tasks such as filling in forms and loading dynamic content
  • Code Workers:
    • Are roughly equivalent to running curl or Python's `requests.get(url)`
    • Work by sending HTTP requests to the target website
    • Are much cheaper than Browser Workers
    • Only work in situations that don't require interacting with the website UI
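
To make the Code Worker comparison concrete, here is a minimal sketch of a single plain HTTP fetch. It uses Python's standard `urllib.request` in place of `requests.get(url)` so it runs without extra packages. The function name `fetch_html` and the User-Agent value are assumptions made for illustration; this is not how the platform implements Code Workers.

```python
import urllib.request


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL with one plain HTTP request and return the body as text.

    No JavaScript runs and no page is rendered, so only content that is
    present in the raw response is available to the caller.
    """
    # The User-Agent value is an illustrative placeholder, not a platform default.
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-code-worker/0.1"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the response does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_html("https://example.com")` returns the HTML exactly as the server sent it. If the data you need only appears after a click, a scroll, or a script running in the page, a fetch like this will not see it.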


How to choose the optimal type for the scraper

Choose the worker type based on the technology used by the website you want to scrape and the navigation required to reach the data you need.

It's good to start with the cheaper Code Workers and only switch to Browser Workers if you find that you can't get the data you want.

For example, switch to a Browser Worker if you need to:

  • Click on an element to load more data
  • Scroll to load more elements
  • Use tag_script or tag_response (capture network traffic from inside the browser)
  • Type text into the website, for example to run a search
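
The checklist above can be sketched as a small decision helper. This is only an illustration of the reasoning: `ScrapeNeeds` and `choose_worker_type` are hypothetical names, not part of the scraping library, and the returned strings are placeholders rather than real configuration values.

```python
from dataclasses import dataclass


@dataclass
class ScrapeNeeds:
    """Hypothetical description of what a scraper must do on the page."""
    needs_click: bool = False            # click an element to load more data
    needs_scroll: bool = False           # scroll to load more elements
    needs_network_capture: bool = False  # tag_script / tag_response
    needs_typing: bool = False           # type text, e.g. to run a search


def choose_worker_type(needs: ScrapeNeeds) -> str:
    """Return "browser" if any UI interaction is required, else "code"."""
    requires_browser = any(
        (
            needs.needs_click,
            needs.needs_scroll,
            needs.needs_network_capture,
            needs.needs_typing,
        )
    )
    # Default to the cheaper option when nothing forces a browser.
    return "browser" if requires_browser else "code"
```

With no requirements set, the helper returns the cheaper option, which mirrors the advice to start with Code Workers and switch only when needed.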


Your code should be aligned with the worker type

Some functions in our library are only available in Browser Workers and will throw an error if you try to use them from Code Workers.

Below is a list of functions that you can only use from Browser Workers:

  • wait_* (any wait function)
  • scroll_* (any scroll function)
  • tag_* (any tag function)
  • type
  • browser_size
  • emulate_device
  • freeze_page
  • click
  • hover
  • right_click
  • mouse_to
  • press_key
  • solve_captcha
  • capture_graphql
  • close_popup
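
One way to catch a mismatch before running a scraper is to check the function names it uses against the list above. The sketch below assumes you already have those names as strings; `is_browser_only` and `find_browser_only_calls` are hypothetical helpers, not library functions, and the prefix check simply mirrors the `wait_*`, `scroll_*`, and `tag_*` entries.

```python
# Prefixes taken from the wildcard entries in the list above.
BROWSER_ONLY_PREFIXES = ("wait_", "scroll_", "tag_")

# Exact names taken from the list above.
BROWSER_ONLY_NAMES = frozenset(
    {
        "type",
        "browser_size",
        "emulate_device",
        "freeze_page",
        "click",
        "hover",
        "right_click",
        "mouse_to",
        "press_key",
        "solve_captcha",
        "capture_graphql",
        "close_popup",
    }
)


def is_browser_only(function_name: str) -> bool:
    """Return True if the name matches the browser-only list."""
    return (
        function_name in BROWSER_ONLY_NAMES
        or function_name.startswith(BROWSER_ONLY_PREFIXES)
    )


def find_browser_only_calls(function_names):
    """Return the names, in order, that would need a Browser Worker."""
    return [name for name in function_names if is_browser_only(name)]
```

If `find_browser_only_calls` returns anything for a scraper intended for a Code Worker, either remove those calls or switch the scraper to a Browser Worker.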
