Johannes Zillmann
7abafc61e7
Improve word boundary detection
...
- sometimes a single word is provided as multiple items, e.g. "T his is a sen tence"
- use the x-axis distance between items to avoid putting whitespace in the middle of a word
- also tweak the line detection a bit (for Alice)
2024-05-20 00:22:24 -06:00
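A minimal sketch of the x-axis distance idea from the word boundary commit above, not the repository's actual code. The `Item` shape, the `join_items` helper, and the `min_gap` threshold are all hypothetical names and values chosen for illustration: a space is emitted only when the horizontal gap between the right edge of one item and the left edge of the next exceeds the threshold.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A text fragment with its horizontal position (hypothetical shape)."""
    text: str
    x: float      # left edge of the fragment
    width: float  # rendered width of the fragment


def join_items(items, min_gap=1.5):
    """Join fragments of one line, adding a space only across wide gaps.

    min_gap is an assumed threshold in the same units as x and width.
    """
    ordered = sorted(items, key=lambda it: it.x)
    parts = []
    prev_right = None
    for item in ordered:
        # Gap between the previous fragment's right edge and this one's left edge.
        if prev_right is not None and item.x - prev_right > min_gap:
            parts.append(" ")
        parts.append(item.text)
        prev_right = item.x + item.width
    return "".join(parts)


# Fragments as they might arrive for the example in the commit message.
fragments = [
    Item("T", 0.0, 5.0),
    Item("his", 5.2, 14.0),
    Item("is", 23.0, 8.0),
    Item("a", 35.0, 5.0),
    Item("sen", 44.0, 15.0),
    Item("tence", 59.3, 24.0),
]
print(join_items(fragments))
```

In practice a fixed threshold is fragile across font sizes, so a real implementation would more likely scale it by the font height or average character width.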
Johannes Zillmann
616909481a
Don't print globals twice
2021-07-18 14:13:38 -06:00
Johannes Zillmann
46234417ad
Fine tune line detection
...
* Before, lines were assembled that are really separate lines
2021-07-18 13:07:06 -06:00
Johannes Zillmann
e261583c65
Improve TOC headline detection
2021-04-27 08:29:00 +02:00
Johannes Zillmann
94a7405671
Lookup and verify toc links
2021-04-25 14:41:50 +02:00
Johannes Zillmann
a427806f68
Move width & height after x & y
2021-04-11 18:28:53 +02:00
Johannes Zillmann
388e8cc6b1
Find page mapping during statistics calculation
2021-03-28 23:45:26 +02:00
Johannes Zillmann
89d4bbd2f9
Cover globals in tests
2021-03-28 10:58:24 +02:00
Johannes Zillmann
f42358d63b
Remove empty items
2021-03-16 05:50:57 +01:00
Johannes Zillmann
60596e7416
#24 Add first external PDFs for testing
2021-03-13 22:53:54 +01:00