Johannes Zillmann
|
07e7fbb505
|
[WIP] Add remove whitespace and detect links again
|
2017-03-18 08:56:08 +01:00 |
|
Johannes Zillmann
|
4600dc6ee7
|
[WIP] headline detection for non TOC pdfs
|
2017-03-16 07:40:57 +01:00 |
|
Johannes Zillmann
|
77576ebd7e
|
[WIP] Headlines for title pages
|
2017-03-16 07:08:46 +01:00 |
|
Johannes Zillmann
|
1eda51c0b4
|
[WIP] detect more headlines with already detected heights
|
2017-03-16 06:52:45 +01:00 |
|
Johannes Zillmann
|
a9b851ceb6
|
[WIP] robustify TOC headline finding
|
2017-03-16 06:01:07 +01:00 |
|
Johannes Zillmann
|
dbd9d8bf5f
|
[WIP] find not found TOC-Headers by size
|
2017-03-15 08:42:46 +01:00 |
|
Johannes Zillmann
|
93f15a38b5
|
[WIP] move different typed transformations to different folders
|
2017-03-15 06:09:18 +01:00 |
|
Johannes Zillmann
|
739d20d83b
|
[WIP] Simplify major headline detections
|
2017-03-15 05:27:47 +01:00 |
|
Johannes Zillmann
|
5caf8154db
|
[WIP] Simplify code/quote detection
|
2017-03-14 10:30:21 +01:00 |
|
Johannes Zillmann
|
c6f592d3fc
|
[WIP] Simplify list detection
|
2017-03-11 13:42:09 +01:00 |
|
Johannes Zillmann
|
f8fecc4c1d
|
[WIP] remove MarkdownElement in favor of ElementType enum
|
2017-03-10 12:39:42 +01:00 |
|
Johannes Zillmann
|
15c5946073
|
[WIP] remove explicit Footnotes transformation
|
2017-03-10 12:12:20 +01:00 |
|
Johannes Zillmann
|
68e3fd7a9f
|
[WIP] change gather blocks transformation to new system
|
2017-03-10 12:10:58 +01:00 |
|
Johannes Zillmann
|
bd4c207ae3
|
[WIP] detect TOC on text items, not on blocks
|
2017-03-10 09:52:29 +01:00 |
|
Johannes Zillmann
|
e2481bdd2a
|
[WIP] Compact Lines
* Almost every transformer first combines the lines, so we can make it an explicit one-time transformation at the beginning
|
2017-03-10 08:49:40 +01:00 |
|
Johannes Zillmann
|
e2ddf0312b
|
[WIP] move unused stuff in separate folder
|
2017-03-10 06:30:18 +01:00 |
|
Johannes Zillmann
|
111124fbf3
|
[WIP] Cleanup page / item handling
|
2017-03-07 21:59:15 +01:00 |
|
Johannes Zillmann
|
6f69566e98
|
[WIP] TOC headline parsing
|
2017-03-07 18:43:43 +01:00 |
|
Johannes Zillmann
|
c9352d8396
|
[WIP] improve TOC parsing
|
2017-03-07 18:43:31 +01:00 |
|
Johannes Zillmann
|
1fcd08f6d5
|
[WIP] small fixes
|
2017-02-27 21:19:29 +01:00 |
|
Johannes Zillmann
|
5827379d1b
|
WIP footer detection
|
2017-02-22 23:18:49 +01:00 |
|
Johannes Zillmann
|
b7db48af4b
|
WIP globalize display of globals and summary/messages
|
2017-02-21 08:05:00 +01:00 |
|
Johannes Zillmann
|
62fd0155ed
|
WIP Proper footnote link detection
|
2017-02-20 21:58:37 +01:00 |
|
Johannes Zillmann
|
a3b6a26437
|
WIP add detect Lists function
|
2017-02-19 14:23:35 +01:00 |
|
Johannes Zillmann
|
edfa76b033
|
WIP fix bugs
|
2017-02-19 11:05:41 +01:00 |
|
Johannes Zillmann
|
2783d724e5
|
WIP initial TOC detection
|
2017-02-19 10:20:14 +01:00 |
|
Johannes Zillmann
|
bed3fd357b
|
WIP merge successive code blocks
|
2017-02-18 12:33:21 +01:00 |
|
Johannes Zillmann
|
e7ff939351
|
WIP markdown formatting for code/quote
|
2017-02-18 11:46:13 +01:00 |
|
Johannes Zillmann
|
f93d1e4aa1
|
WIP initial quote/code detector with new TextItemCombiner
|
2017-02-18 10:50:54 +01:00 |
|
Johannes Zillmann
|
d78d9be8a3
|
WIP rename splitIntoBlocks to DetectPdfBlocks
|
2017-02-17 20:19:57 +01:00 |
|
Johannes Zillmann
|
767462bc9b
|
WIP Introduce PdfBlockView
* Add vertical to horizontal transformation
* Improve header/footer removal
|
2017-02-17 20:17:04 +01:00 |
|
Johannes Zillmann
|
a92e384249
|
Calculate most used distance
* round coordinates on construction
|
2017-02-17 09:01:12 +01:00 |
|
Johannes Zillmann
|
b7393fc806
|
Detect bold and emphasis
|
2017-02-17 08:16:27 +01:00 |
|
Johannes Zillmann
|
6441580889
|
Add global statistics
|
2017-02-15 07:33:07 +01:00 |
|
Johannes Zillmann
|
a76dac6428
|
Summary for detect footnotes
|
2017-02-15 07:11:26 +01:00 |
|
Johannes Zillmann
|
55506576f5
|
Pimp up transformation pipeline with ParseResult object
|
2017-02-15 07:03:44 +01:00 |
|
Johannes Zillmann
|
c08105ecaf
|
Show pdf item text in pre only for the whitespace transformation
|
2017-02-14 22:03:58 +01:00 |
|
Johannes Zillmann
|
41bc2f6c34
|
Move pageView construction into Transformer
|
2017-02-14 21:47:54 +01:00 |
|
Johannes Zillmann
|
92a4337387
|
Add font info to pdf page view
|
2017-02-14 20:28:01 +01:00 |
|
Johannes Zillmann
|
ab5705cd27
|
combine on Y with variation of 1 (instead of being strict)
|
2017-02-14 20:24:01 +01:00 |
|
Johannes Zillmann
|
a1222544bb
|
Remove unused class
|
2017-02-14 20:23:22 +01:00 |
|
Johannes Zillmann
|
3a1241896b
|
Replace text with block system
|
2017-02-12 19:37:21 +01:00 |
|
Johannes Zillmann
|
1ca9fa4362
|
Outsource annotation definitions
|
2017-02-11 15:42:30 +01:00 |
|
Johannes Zillmann
|
996e5fae62
|
Detect Links & Remove Whitespaces
|
2017-02-11 15:23:01 +01:00 |
|
Johannes Zillmann
|
fc0aafebdd
|
Render pdf items as pre elements to see duplicate whitespaces
|
2017-02-11 15:13:45 +01:00 |
|
Johannes Zillmann
|
f0491af073
|
nice fonts
|
2017-02-11 15:02:13 +01:00 |
|
Johannes Zillmann
|
b31ad64fb7
|
Introduce Result View
|
2017-02-06 19:13:43 +01:00 |
|
Johannes Zillmann
|
b7634423cc
|
Add Markdown view
|
2017-02-06 17:13:41 +01:00 |
|
Johannes Zillmann
|
0a6242b944
|
update dependencies
|
2017-02-05 23:21:36 +01:00 |
|
Johannes Zillmann
|
1b326a9f36
|
Headline to upper case transformation
* Add testing capability (mocha, chai)
* Add MarkdownElement to text item
|
2017-02-05 21:22:42 +01:00 |
|