145 Commits

Author SHA1 Message Date
Johannes Zillmann
ef0bd7ebbe Add example files 2017-03-29 08:17:14 +02:00
Johannes Zillmann
5ef6c362b0 Affix for debug panel 2017-03-29 08:15:39 +02:00
Johannes Zillmann
96d4f72889 Fix loading when PDF has non-resolvable fonts
* Sometimes pdf.js returned a fontId such as Helvetica or Times instead of g_d0_f1 etc., but those were never resolved through the callback. Now we simply ignore them.
* Also fixed the case where no fonts could be parsed
2017-03-29 08:11:14 +02:00
Johannes Zillmann
c4c23ac6ee Add Footer & some messaging 2017-03-29 08:08:55 +02:00
Johannes Zillmann
a0c5bb29d6 rename element type to block type 2017-03-28 09:11:00 +02:00
Johannes Zillmann
c4679238cd Improve list detection
* Add ' ' on compact lines when a line starts with a list character
* Add – as list character
* rename functions.jsx to stringFunctions.jsx
2017-03-28 09:00:21 +02:00
Johannes Zillmann
106e2bfa8e separate type and format for a word 2017-03-28 08:15:27 +02:00
Johannes Zillmann
9dbc57b4fe [DONE] format words properly 2017-03-28 06:11:42 +02:00
Johannes Zillmann
09facb09b4 WIP Introduce word/wordType/lineItem
* Provide a way to do the markdown transformation of inline formats (bold, italic, link, footnote, etc.) at the end rather than in the middle
* Introduce StashingStream as a helper
2017-03-27 07:34:58 +02:00
Johannes Zillmann
fde670e83f [DONE] fix formatting - all functionality restored*
* the formats in code/quote blocks are still disturbing…
2017-03-24 21:06:35 +01:00
Johannes Zillmann
e144d6a6d5 [WIP] stabilized formatting 2017-03-24 13:31:56 +01:00
Johannes Zillmann
10cc7cf0ab [WIP] first draft complete formats transformation 2017-03-24 12:30:35 +01:00
Johannes Zillmann
81518a857b [WIP] don’t turn bold paragraphs into headlines 2017-03-24 08:06:54 +01:00
Johannes Zillmann
e19294f35f [WIP] remove old stuff 2017-03-24 08:05:59 +01:00
Johannes Zillmann
bd7d9bc0e9 [WIP] Switch order of Debug & Result view 2017-03-24 07:08:54 +01:00
Johannes Zillmann
d927b45087 [WIP] use fontMap to map fonts to formats 2017-03-22 20:08:34 +01:00
Johannes Zillmann
b5bb56b647 [WIP] parse metadata & display title 2017-03-22 07:19:21 +01:00
Johannes Zillmann
94c2561717 [WIP] store font-map in appState 2017-03-21 23:12:45 +01:00
Johannes Zillmann
a35ecd28b6 [WIP] add headers for all Uppercase lines 2017-03-20 07:10:43 +01:00
Johannes Zillmann
07e7fbb505 [WIP] Add remove whitespace and detect links again 2017-03-18 08:56:08 +01:00
Johannes Zillmann
4600dc6ee7 [WIP] headline detection for non TOC pdfs 2017-03-16 07:40:57 +01:00
Johannes Zillmann
77576ebd7e [WIP] Headlines for title pages 2017-03-16 07:08:46 +01:00
Johannes Zillmann
1eda51c0b4 [WIP] detect more headlines with already detected heights 2017-03-16 06:52:45 +01:00
Johannes Zillmann
a9b851ceb6 [WIP] robustify TOC headline finding 2017-03-16 06:01:07 +01:00
Johannes Zillmann
dbd9d8bf5f [WIP] find not found TOC-Headers by size 2017-03-15 08:42:46 +01:00
Johannes Zillmann
93f15a38b5 [WIP] move different typed transformations to different folders 2017-03-15 06:09:18 +01:00
Johannes Zillmann
739d20d83b [WIP] Simplify major headline detections 2017-03-15 05:27:47 +01:00
Johannes Zillmann
5caf8154db [WIP] Simplify code/quote detection 2017-03-14 10:30:21 +01:00
Johannes Zillmann
c6f592d3fc [WIP] Simplify list detection 2017-03-11 13:42:09 +01:00
Johannes Zillmann
f8fecc4c1d [WIP] remove MarkdownElement in favor of ElementType enum 2017-03-10 12:39:42 +01:00
Johannes Zillmann
15c5946073 [WIP] remove explicit Footnotes transformation 2017-03-10 12:12:20 +01:00
Johannes Zillmann
68e3fd7a9f [WIP] change gather blocks transformation to new system 2017-03-10 12:10:58 +01:00
Johannes Zillmann
bd4c207ae3 [WIP] detect TOC on text items, not on blocks 2017-03-10 09:52:29 +01:00
Johannes Zillmann
e2481bdd2a [WIP] Compact Lines
* Almost every transformer first combines the lines, so we can make it an explicit one time transformation in the beginning
2017-03-10 08:49:40 +01:00
Johannes Zillmann
e2ddf0312b [WIP] move unused stuff in separate folder 2017-03-10 06:30:18 +01:00
Johannes Zillmann
111124fbf3 [WIP] Cleanup page / item handling 2017-03-07 21:59:15 +01:00
Johannes Zillmann
6f69566e98 [WIP] TOC headline parsing 2017-03-07 18:43:43 +01:00
Johannes Zillmann
c9352d8396 [WIP] improve TOC parsing 2017-03-07 18:43:31 +01:00
Johannes Zillmann
1fcd08f6d5 [WIP] small fixes 2017-02-27 21:19:29 +01:00
Johannes Zillmann
5827379d1b WIP footer detection 2017-02-22 23:18:49 +01:00
Johannes Zillmann
b7db48af4b WIP globalize display of globals and summary/messages 2017-02-21 08:05:00 +01:00
Johannes Zillmann
62fd0155ed WIP Proper footnote link detection 2017-02-20 21:58:37 +01:00
Johannes Zillmann
a3b6a26437 WIP add detect Lists function 2017-02-19 14:23:35 +01:00
Johannes Zillmann
edfa76b033 WIP fix bugs 2017-02-19 11:05:41 +01:00
Johannes Zillmann
2783d724e5 WIP initial TOC detection 2017-02-19 10:20:14 +01:00
Johannes Zillmann
bed3fd357b WIP merge successive code blocks 2017-02-18 12:33:21 +01:00
Johannes Zillmann
e7ff939351 WIP markdown formatting for code/quote 2017-02-18 11:46:13 +01:00
Johannes Zillmann
f93d1e4aa1 WIP initial quote/code detector with new TextItemCombiner 2017-02-18 10:50:54 +01:00
Johannes Zillmann
d78d9be8a3 WIP rename splitIntoBlocks to DetectPdfBlocks 2017-02-17 20:19:57 +01:00
Johannes Zillmann
767462bc9b WIP Introduce PdfBlockView
* Add vertical to horizontal transformation
* Improve header/footer removal
2017-02-17 20:17:04 +01:00