se-scraper/TODO.txt

24.12.2018
    - fix interface to scrape() [DONE]
    - add to GitHub


24.1.2019

    - fix issue #3: add functionality to add keyword file
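
    A minimal sketch of how a keyword file could be read, assuming one
    keyword per line. The helper name readKeywordFile is hypothetical and
    not part of the current interface.

```javascript
const fs = require('fs');

// Hypothetical helper: reads one keyword per line from a text file,
// trims surrounding whitespace and skips blank lines.
function readKeywordFile(path) {
    return fs.readFileSync(path, 'utf8')
        .split(/\r?\n/)
        .map((line) => line.trim())
        .filter((line) => line.length > 0);
}
```

    The resulting array could then be passed wherever a keyword list is
    already accepted.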

27.1.2019

    - Add functionality to block images and CSS from loading as described here:

        https://www.scrapehero.com/how-to-increase-web-scraping-speed-using-puppeteer/
        https://www.scrapehero.com/how-to-build-a-web-scraper-using-puppeteer-and-node-js/
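
    A rough sketch of the blocking idea using Puppeteer request
    interception. The helper name blockImagesAndCss is an assumption;
    `page` is expected to be a Puppeteer Page object.

```javascript
// Resource types to skip. 'image' and 'stylesheet' are values returned by
// Puppeteer's request.resourceType().
const BLOCKED_TYPES = new Set(['image', 'stylesheet']);

// Hypothetical helper: aborts image and CSS requests on the given page and
// lets every other request through.
async function blockImagesAndCss(page) {
    await page.setRequestInterception(true);
    page.on('request', (request) => {
        if (BLOCKED_TYPES.has(request.resourceType())) {
            request.abort();
        } else {
            request.continue();
        }
    });
}
```

    This would have to be called once per page, before navigating to the
    search engine.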

29.1.2019

    - implement proxy support functionality
        - implement proxy check

    - implement scraping more than 1 page
        - do it for Google
        - and Bing

    - implement DuckDuckGo scraping
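
    For the proxy item above, one possible approach is to pass the proxy
    to Chromium at launch time. A sketch, assuming the options object is
    later handed to puppeteer.launch(); the helper name
    launchOptionsWithProxy is hypothetical.

```javascript
// Hypothetical helper: builds an options object for puppeteer.launch() so
// that the browser routes its traffic through `proxy`,
// e.g. 'socks5://127.0.0.1:9050'. Without a proxy the args are unchanged.
function launchOptionsWithProxy(proxy, baseArgs = []) {
    const args = [...baseArgs];
    if (proxy) {
        // --proxy-server is a Chromium command-line switch.
        args.push(`--proxy-server=${proxy}`);
    }
    return { headless: true, args };
}
```

    Because the switch applies to the whole browser, this gives one proxy
    per browser instance, not one per tab.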


30.1.2019

    - modify all scrapers to use the generic class where it makes sense
        - Bing, Baidu, Google, DuckDuckGo

TODO:
    - think about implementing ticker search for: https://quotes.wsj.com/MSFT?mod=searchresults_companyquotes
    - add proxy support
    - add captcha service solving support
    - check whether new instances run in the same browser and whether we can have one proxy per tab worker

TODO:
    - think about whether it makes sense to introduce a generic scraping class
    - is scraping abstractable or is every scraper too unique?
    - don't make the same mistakes as with GoogleScraper


TODO:
    okay, it's fucking time to make a generic scraping class like in GoogleScraper
    I feel like history repeats itself

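    class Scraper {

        constructor(options = {}) {
            this.options = options;
        }

        // open the start page of the search engine
        async load_search_engine() {}

        // run the search for a single keyword
        async search_keyword() {}

        // open a new browser page (tab)
        async new_page() {}

        // check whether the search engine detected the scraper
        async detected() {}

    }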


    then each search engine derives from this generic class

    some search engines do not need such an abstract class, because they are too complex
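
    A sketch of how an engine could derive from the generic class. The
    run() driver and the FakeEngineScraper class are hypothetical and only
    illustrate the idea; no real scraping happens here.

```javascript
// Base class holding the shared flow; method names follow the outline above.
class Scraper {
    constructor(options = {}) {
        this.options = options;
        this.results = {};
    }

    async load_search_engine() { throw new Error('not implemented'); }

    async search_keyword(keyword) { throw new Error('not implemented'); }

    async detected() { return false; }

    // Hypothetical driver: loads the engine once, then searches each
    // keyword until the scraper reports that it was detected.
    async run(keywords) {
        await this.load_search_engine();
        for (const keyword of keywords) {
            if (await this.detected()) break;
            this.results[keyword] = await this.search_keyword(keyword);
        }
        return this.results;
    }
}

// Hypothetical engine used only for illustration.
class FakeEngineScraper extends Scraper {
    async load_search_engine() { this.loaded = true; }

    async search_keyword(keyword) { return [`result for ${keyword}`]; }
}
```

    Each real engine would only override the engine-specific methods,
    while the keyword loop stays in one place.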