Nikolai Tschacher
|
593f3a95e5
|
Merge pull request #33 from TDenoncin/add-html-output-rework
Add html output option
|
2019-06-26 15:38:38 +02:00 |
|
HugoPoi
|
d9ac9f4162
|
Add test for html_output, refactor the results return
|
2019-06-26 12:03:42 +02:00 |
|
Thomas
|
a0e63aa4b0
|
Use bing_setting.bing_domain if defined for startUrl
|
2019-06-25 17:16:17 +02:00 |
|
Thomas
|
a3ebe357a4
|
Add html_output functionality
Pagination support for html output
Change return value to keep it compliant with the current version of se-scraper
|
2019-06-25 17:02:34 +02:00 |
|
Nikolai Tschacher
|
0d7f6dcd11
|
worked on issue #31
|
2019-06-18 22:23:52 +02:00 |
|
Nikolai Tschacher
|
80d23a9d57
|
users may pass their own user agents; different browsers now get different random user agents instead of the same one
|
2019-06-17 21:25:45 +02:00 |
|
Nikolai Tschacher
|
ebe9ba8ea9
|
added option to throw on detection
|
2019-06-17 15:02:44 +02:00 |
|
Nikolai Tschacher
|
caa93df3b0
|
random user agent fixed
|
2019-06-17 12:01:13 +02:00 |
|
Nikolai Tschacher
|
0c9f353cb2
|
remove hardcoded sleep() in Google Image
|
2019-06-17 00:03:13 +02:00 |
|
Nikolai Tschacher
|
43d5732de7
|
resolved issue #30, custom scrapers now possible. new npm version
|
2019-06-13 12:34:39 +02:00 |
|
Nikolai Tschacher
|
06d500f75c
|
.
|
2019-06-12 21:25:40 +02:00 |
|
Nikolai Tschacher
|
784e887787
|
fixed issue #22
|
2019-06-12 21:25:20 +02:00 |
|
Nikolai Tschacher
|
db5fbb23d2
|
removed unnecessary sleeping times
|
2019-06-12 18:14:49 +02:00 |
|
Nikolai Tschacher
|
5bf7c94b9a
|
new version
|
2019-06-11 22:01:27 +02:00 |
|
Nikolai Tschacher
|
d4d06f7d67
|
need to edit readme
|
2019-06-11 18:34:51 +02:00 |
|
Nikolai Tschacher
|
35943e7449
|
minor stuff
|
2019-06-11 18:33:11 +02:00 |
|
Nikolai Tschacher
|
7e06944fa1
|
updated README
|
2019-06-11 18:27:34 +02:00 |
|
Nikolai Tschacher
|
6825c97790
|
changed api big time
|
2019-06-11 18:16:59 +02:00 |
|
Nikolai Tschacher
|
3d69f4e249
|
added a proxy test script
|
2019-05-06 21:54:23 +02:00 |
|
Nikolai Tschacher
|
1593759556
|
passing chrome flags directly now possible
|
2019-04-01 15:33:26 +02:00 |
|
Nikolai Tschacher
|
775dcfa077
|
proxy mgmt better
|
2019-03-22 18:55:17 +01:00 |
|
Nikolai Tschacher
|
b82c769bb1
|
google_news_old supports google_news_old_settings now
|
2019-03-20 15:28:04 +01:00 |
|
Nikolai Tschacher
|
1bed9c5854
|
fixed issue 12
|
2019-03-20 11:50:43 +01:00 |
|
Nikolai Tschacher
|
7a8c6f13f0
|
fixed #11 by improving baidu a lot in speed and quality
|
2019-03-14 23:33:46 +01:00 |
|
Nikolai Tschacher
|
51d617442d
|
added support for amazon
|
2019-03-10 20:02:42 +01:00 |
|
Nikolai Tschacher
|
dd1f36076e
|
can now parse args from string to json
|
2019-03-07 15:50:36 +01:00 |
|
Nikolai Tschacher
|
62b3b688b4
|
minor fixes
|
2019-03-07 13:16:12 +01:00 |
|
Nikolai Tschacher
|
7b52b4e62f
|
added support for custom query string parameters
|
2019-03-06 00:08:25 +01:00 |
|
Nikolai Tschacher
|
7239e23cba
|
fixed pluggable
|
2019-03-03 16:46:10 +01:00 |
|
Nikolai Tschacher
|
8cbf37eaba
|
minor improvements
|
2019-03-02 22:32:26 +01:00 |
|
Nikolai Tschacher
|
abf4458e46
|
fixed quotes in user agent. this led to cloudflare detecting the scraper. very bad.
|
2019-03-01 16:02:30 +01:00 |
|
Nikolai Tschacher
|
79d32a315a
|
fixed some errors and way better README
|
2019-02-28 15:34:25 +01:00 |
|
Nikolai Tschacher
|
089e410ec6
|
support for multiple browsers and proxies
|
2019-02-27 20:58:13 +01:00 |
|
Nikolai Tschacher
|
393b9c0450
|
Merge pull request #8 from NikolaiT/add-license-1
Create LICENSE
|
2019-02-08 00:58:27 +01:00 |
|
Nikolai Tschacher
|
fb3f2836e4
|
Create LICENSE
|
2019-02-08 00:58:15 +01:00 |
|
Nikolai Tschacher
|
53c9ebf467
|
Merge pull request #7 from NikolaiT/add-code-of-conduct-1
Create CODE_OF_CONDUCT.md
|
2019-02-08 00:54:28 +01:00 |
|
Nikolai Tschacher
|
9521c54c77
|
Create CODE_OF_CONDUCT.md
|
2019-02-08 00:54:10 +01:00 |
|
Nikolai Tschacher
|
77c332d7c8
|
updated readme
|
2019-02-07 16:26:11 +01:00 |
|
Nikolai Tschacher
|
7b5048b8ee
|
num_keywords are counted now. added to pluggable
|
2019-02-07 16:21:56 +01:00 |
|
Nikolai Tschacher
|
7572ebd314
|
added chrome detection evasion techniques
|
2019-02-07 16:09:38 +01:00 |
|
Nikolai Tschacher
|
d5b147296e
|
ticker search OOP now and added tests
|
2019-01-31 22:13:22 +01:00 |
|
Nikolai Tschacher
|
d35a602994
|
added clean test cases for bing and duckduckgo
|
2019-01-31 15:36:27 +01:00 |
|
Nikolai Tschacher
|
7441c57a43
|
removed generic tests. too complicated
|
2019-01-31 14:58:07 +01:00 |
|
Nikolai Tschacher
|
c60d0f3528
|
clean test case for google is passing
|
2019-01-31 14:57:34 +01:00 |
|
Nikolai Tschacher
|
987e3d7342
|
tested and works
|
2019-01-30 23:53:09 +01:00 |
|
Nikolai Tschacher
|
581568ff18
|
cleaned up google scrapers. All scrapers are classes now. from 600 LOC to 400 LOC. HIGH IQ MOVE
|
2019-01-30 20:24:03 +01:00 |
|
Nikolai Tschacher
|
4306848657
|
implemented generic scraping class
|
2019-01-30 16:05:08 +01:00 |
|
Nikolai Tschacher
|
9e62f23451
|
resolved some issues. proxy possible now. scraping for more than one page possible now
|
2019-01-29 22:48:08 +01:00 |
|
Nikolai Tschacher
|
89441070cd
|
before_keyword_scraped() hook supported
|
2019-01-29 13:29:24 +01:00 |
|
Nikolai Tschacher
|
c5e3e84e1d
|
minor changes
|
2019-01-27 22:11:41 +01:00 |
|