Commit Graph

10 Commits

Author SHA1 Message Date
Johannes Zillmann
7abafc61e7 Improve word boundary detection
- Sometimes a single word is delivered as multiple items, e.g. "T his is a sen tence"
- Use the x-axis distance between items to avoid putting whitespace in the middle of a word
- Also tweak the line detection a bit (for Alice)
2024-05-20 00:22:24 -06:00
Johannes Zillmann
616909481a Don't print globals twice 2021-07-18 14:13:38 -06:00
Johannes Zillmann
46234417ad Fine tune line detection
* Previously, lines were merged that are really separate lines
2021-07-18 13:07:06 -06:00
Johannes Zillmann
e261583c65 Improve TOC headline detection 2021-04-27 08:29:00 +02:00
Johannes Zillmann
94a7405671 Lookup and verify toc links 2021-04-25 14:41:50 +02:00
Johannes Zillmann
a427806f68 Move width & height after x & y 2021-04-11 18:28:53 +02:00
Johannes Zillmann
388e8cc6b1 Find page mapping during statistics calculation 2021-03-28 23:45:26 +02:00
Johannes Zillmann
89d4bbd2f9 Cover globals in tests 2021-03-28 10:58:24 +02:00
Johannes Zillmann
f42358d63b Remove empty items 2021-03-16 05:50:57 +01:00
Johannes Zillmann
60596e7416 #24 Add first external PDFs for testing 2021-03-13 22:53:54 +01:00
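
The word boundary change in commit 7abafc61e7 describes using the x-axis distance between items to decide whether a space belongs between them. The following is a minimal sketch of that idea, not the project's actual code. The `Item` class, the `join_items` function, and the `gap_threshold` value are all hypothetical names and numbers chosen for illustration. It assumes each item carries an `x` position, a `width`, and its `text`, and that all items passed in already belong to the same line.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical text item: left edge, width, and the text it carries."""
    x: float
    width: float
    text: str


def join_items(items, gap_threshold=1.0):
    """Join the items of one line, inserting a space only across large gaps.

    gap_threshold is an assumed tuning value in the same units as x and width.
    """
    ordered = sorted(items, key=lambda item: item.x)
    parts = []
    prev_right = None  # right edge of the previously emitted item
    for item in ordered:
        # A gap wider than the threshold is treated as a word boundary.
        if prev_right is not None and item.x - prev_right > gap_threshold:
            parts.append(" ")
        parts.append(item.text)
        prev_right = item.x + item.width
    return "".join(parts)
```

With items such as `Item(0, 5, "T")`, `Item(5.2, 15, "his")`, and `Item(25, 8, "is")`, the first two are close enough to be joined into one word, while the wider gap before the third produces a space. A fixed threshold is the simplest choice; a real implementation might scale it with the font size or the average character width.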