se-scraper/TODO.txt

24.12.2018
    - fix interface to scrape() [DONE]
    - add to GitHub


24.1.2019

    - fix issue #3: add functionality to add keyword file
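
        Rough sketch of the keyword file idea, assuming one keyword per
        line and blank lines ignored. parseKeywords() and readKeywordFile()
        are made-up names, not existing se-scraper functions.

```javascript
const fs = require('fs');

// Hypothetical helper: split file contents into trimmed, non-empty keywords.
function parseKeywords(text) {
    return text
        .split(/\r?\n/)
        .map((line) => line.trim())
        .filter((line) => line.length > 0);
}

// Hypothetical helper: read a keyword file (assumed UTF-8, one keyword per line).
function readKeywordFile(path) {
    return parseKeywords(fs.readFileSync(path, 'utf8'));
}
```

        The result could then be passed wherever the keyword array goes today.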

27.1.2019

    - Add functionality to block images and CSS from loading as described here:

        https://www.scrapehero.com/how-to-increase-web-scraping-speed-using-puppeteer/
        https://www.scrapehero.com/how-to-build-a-web-scraper-using-puppeteer-and-node-js/
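
        Minimal sketch of the blocking idea using Puppeteer request
        interception. The helper names and the exact set of blocked
        resource types are assumptions, not taken from the articles above.

```javascript
// Resource types to skip. The names match the values returned by
// Puppeteer's request.resourceType(); the chosen set is an assumption.
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// Hypothetical helper: abort blocked requests, let everything else through.
function handleRequest(request) {
    if (BLOCKED_RESOURCE_TYPES.has(request.resourceType())) {
        return request.abort();
    }
    return request.continue();
}

// Hypothetical helper: turn on interception for a Puppeteer page.
// Call it before page.goto() so the first navigation is filtered too.
async function blockHeavyResources(page) {
    await page.setRequestInterception(true);
    page.on('request', handleRequest);
}
```

        Usage would be something like: await blockHeavyResources(page);
        followed by the usual page.goto(url).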

TODO:
    - think about implementing ticker search for: https://quotes.wsj.com/MSFT?mod=searchresults_companyquotes
    - add proxy support
    - add captcha service solving support
    - check if news instances run the same browser and if we can have one proxy per tab worker
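
        Possible starting point for proxy support: pass Chromium's
        --proxy-server flag through the Puppeteer launch options.
        buildLaunchOptions() is a made-up helper. As far as I know the
        flag applies to the whole browser process, so one proxy per tab
        would probably mean one browser instance per proxy (to be checked).

```javascript
// Hypothetical helper: build options for puppeteer.launch().
// `proxy` is assumed to look like 'http://host:port'; omit it for no proxy.
function buildLaunchOptions(proxy) {
    const args = [];
    if (proxy) {
        // --proxy-server is a Chromium command line switch.
        args.push('--proxy-server=' + proxy);
    }
    return { headless: true, args };
}
```

        Usage would be something like:
        puppeteer.launch(buildLaunchOptions('http://127.0.0.1:8080'))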

TODO:
    - think whether it makes sense to introduce a generic scraping class?
    - is scraping abstractable or is every scraper too unique?
    - don't make the same mistakes as with GoogleScraper
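
        One way the generic class question could look, as a sketch only:
        the keyword loop sits in a base class and each engine overrides
        the URL building and the parsing. All class and method names are
        invented for illustration, and the WSJ subclass just reuses the
        URL pattern from the ticker note above.

```javascript
// Hypothetical base class: shared keyword loop, engine-specific steps
// are left to subclasses.
class Scraper {
    constructor(page) {
        this.page = page; // assumed to be a Puppeteer page
    }

    // Engine-specific: build the search URL for one keyword.
    buildUrl(keyword) {
        throw new Error('buildUrl() must be implemented by a subclass');
    }

    // Engine-specific: extract results from the currently loaded page.
    async parseResults() {
        throw new Error('parseResults() must be implemented by a subclass');
    }

    // Shared: visit each keyword's URL and collect the parsed results.
    async scrape(keywords) {
        const results = {};
        for (const keyword of keywords) {
            await this.page.goto(this.buildUrl(keyword));
            results[keyword] = await this.parseResults();
        }
        return results;
    }
}

// Hypothetical subclass for the WSJ ticker page.
class WsjTickerScraper extends Scraper {
    buildUrl(ticker) {
        return 'https://quotes.wsj.com/' + encodeURIComponent(ticker) +
            '?mod=searchresults_companyquotes';
    }

    async parseResults() {
        // Placeholder: a real parser would read quote data from the page.
        return this.page.title();
    }
}
```

        If most engines end up overriding scrape() as well, that would
        suggest every scraper is too unique for a shared base class.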