Johannes Zillmann
7abafc61e7
Improve word boundary detection
- sometimes a word is split across multiple items, e.g. "T his is a sen tence"
- use x-axis distance to avoid putting whitespace in the middle of a word
- also tweak the line detection a bit (for Alice)
2024-05-20 00:22:24 -06:00
Johannes Zillmann
5daa8aa45a
Detect Footnotes
- not yet converted to MD
- detection should be the same as in the old version
2024-04-09 08:25:27 -06:00
Johannes Zillmann
3c31c12768
Update known issues
2024-03-28 12:03:49 -06:00
Johannes Zillmann
e261583c65
Improve TOC headline detection
2021-04-27 08:29:00 +02:00
Johannes Zillmann
94a7405671
Look up and verify TOC links
2021-04-25 14:41:50 +02:00
Johannes Zillmann
5365667314
Reviewing new PDFs
2021-04-18 11:56:42 +02:00
Johannes Zillmann
baa5b4fadc
Add 6 more test PDFs
2021-04-18 11:34:11 +02:00
Johannes Zillmann
ce6c9fe977
Initial TOC detection
2021-04-12 08:09:30 +02:00
Johannes Zillmann
932a79a3e9
Add known issues
2021-04-11 09:08:45 +02:00