Commit Graph

97 Commits

Author SHA1 Message Date
Nikolai Tschacher
07f3dceba1 fixed google SERP title, better docker support 2019-09-23 16:46:22 +02:00
Nikolai Tschacher
b25f7a4285 added test to my working tree 2019-09-13 18:28:19 +02:00
Nikolai Tschacher
4b581bd03f removed static tests because they are too large 2019-09-13 18:21:17 +02:00
Nikolai Tschacher
21378dab02 removed some search engines, added tests for existing, added yandex search engines 2019-09-13 16:15:33 +02:00
Nikolai Tschacher
77d6c4f04a removed some stuff 2019-09-12 10:43:57 +02:00
Nikolai Tschacher
b513bb0f5b Merge branch 'master' of github.com:NikolaiT/se-scraper; server in dockerfile was changed 2019-09-04 12:28:05 +02:00
Nikolai Tschacher
855a874f9e some minor changes 2019-09-04 12:27:53 +02:00
Nikolai Tschacher
dde1711d9d Merge pull request #45 from slotix/master; add process supervisor for starting server.js 2019-08-29 20:41:42 +02:00
slotix
7ba7ee9226 add process supervisor for starting server.js 2019-08-19 14:01:37 +02:00
Nikolai Tschacher
e661241f6f added some parsing to google 2019-08-16 20:10:40 +02:00
Nikolai Tschacher
98414259fe docker support added 2019-08-13 17:35:06 +02:00
Nikolai Tschacher
19a172c654 better tests 2019-08-13 15:28:30 +02:00
Nikolai Tschacher
0f7e89c272 added little bug in cleaning 2019-08-12 17:16:37 +02:00
Nikolai Tschacher
ca941cee45 added static bing test, added html cleaning when exporting html 2019-08-12 16:05:17 +02:00
Nikolai Tschacher
4c77aeba76 Merge pull request #42 from TDenoncin/error-management; Clean integration tests with mocha 2019-08-12 00:04:40 +02:00
Nikolai Tschacher
0427d9f915 Merge branch 'master' into error-management 2019-08-12 00:04:27 +02:00
Nikolai Tschacher
87fcdd35d5 readme in static tests 2019-08-12 00:01:02 +02:00
Nikolai Tschacher
4ca50ab2b9 added new static test case that runs much faster and tests a lot of behavior 2019-08-11 23:58:10 +02:00
Nikolai Tschacher
8e629f6266 Merge pull request #41 from victor9000/master; Fix broken Google News selectors, fixes #40 2019-08-08 21:57:14 +02:00
HugoPoi
a369bd07f9 Add "use strict" to ensure quality code control 2019-08-06 12:18:51 +02:00
HugoPoi
dde2b14fc0 Remove unneeded try catch block in Google Search module 2019-08-06 11:50:08 +02:00
HugoPoi
0db6e068da Remove unneeded try catch block; Add proper error for ip matching test 2019-08-06 11:46:53 +02:00
HugoPoi
50bda275a6 Clean integration tests for mocha 2019-08-05 17:01:48 +02:00
Victor
a61fade2c9 Fix broken Google News selectors, fixes #40 2019-08-04 14:43:02 -07:00
Nikolai Tschacher
78fe12390b better user agents now, added option to include screenshots as base64 in results 2019-07-18 20:19:15 +02:00
Nikolai Tschacher
fcbe66b56b using random user agents now from https://github.com/intoli/user-agents 2019-07-18 19:34:09 +02:00
Nikolai Tschacher
59154694f2 fixed issue https://github.com/NikolaiT/se-scraper/issues/37 2019-07-18 19:14:33 +02:00
Nikolai Tschacher
60a9d52924 add fucking google product information 2019-07-11 19:23:40 +02:00
Nikolai Tschacher
1fc7f0d1c8 fixed a badboy 2019-07-11 16:54:32 +02:00
Nikolai Tschacher
baaff5824e ... 2019-07-11 16:43:41 +02:00
Nikolai Tschacher
dab25f9068 added google shopping results 2019-07-11 16:42:01 +02:00
Nikolai Tschacher
a413cb54ef parsing ads works for duckduckgo, google, bing. tested. 2019-07-07 19:38:28 +02:00
Nikolai Tschacher
bbebe3ce60 parsing ads is supported now for google, bing and duckduckgo 2019-07-06 21:42:13 +02:00
Nikolai Tschacher
09c1255400 removed some superfluous stuff 2019-07-02 18:04:01 +02:00
Nikolai Tschacher
5e8ff1cb34 Merge branch 'master' of https://github.com/NikolaiT/se-scraper 2019-06-29 17:01:25 +02:00
Nikolai Tschacher
c1a036e8da removed some stuff 2019-06-29 17:00:50 +02:00
Nikolai Tschacher
d1e9b21269 added google maps scraper 2019-06-29 17:00:19 +02:00
Nikolai Tschacher
593f3a95e5 Merge pull request #33 from TDenoncin/add-html-output-rework; Add html output option 2019-06-26 15:38:38 +02:00
HugoPoi
d9ac9f4162 Add test for html_output, refactor the results return 2019-06-26 12:03:42 +02:00
Thomas
a0e63aa4b0 Use bing_setting.bing_domain if defined for startUrl 2019-06-25 17:16:17 +02:00
Thomas
a3ebe357a4 Add html_output functionality; Pagination support for html output; Change return value to keep it compliant with the current version of se-scraper 2019-06-25 17:02:34 +02:00
Nikolai Tschacher
0d7f6dcd11 worked on issue #31 2019-06-18 22:23:52 +02:00
Nikolai Tschacher
80d23a9d57 users may pass their own user agents, different browsers have random user agents and not the same now 2019-06-17 21:25:45 +02:00
Nikolai Tschacher
ebe9ba8ea9 added option to throw on detection 2019-06-17 15:02:44 +02:00
Nikolai Tschacher
caa93df3b0 random user agent fixed 2019-06-17 12:01:13 +02:00
Nikolai Tschacher
0c9f353cb2 remove hardcoded sleep() in Google Image 2019-06-17 00:03:13 +02:00
Nikolai Tschacher
43d5732de7 resolved issue #30, custom scrapers now possible. new npm version 2019-06-13 12:34:39 +02:00
Nikolai Tschacher
06d500f75c . 2019-06-12 21:25:40 +02:00
Nikolai Tschacher
784e887787 fixed issue #22 2019-06-12 21:25:20 +02:00
Nikolai Tschacher
db5fbb23d2 removed unnecessary sleeping times 2019-06-12 18:14:49 +02:00