Johannes Zillmann
|
55ae236928
|
Improve header detection
- fix tests
- still run header detection based on heights even if TOC headlines have been identified
|
2024-03-28 11:39:34 -06:00 |
|
Johannes Zillmann
|
46234417ad
|
Fine tune line detection
* Before, lines were assembled that are really separate lines
|
2021-07-18 13:07:06 -06:00 |
|
Johannes Zillmann
|
e261583c65
|
Improve TOC headline detection
|
2021-04-27 08:29:00 +02:00 |
|
Johannes Zillmann
|
94a7405671
|
Lookup and verify toc links
|
2021-04-25 14:41:50 +02:00 |
|
Johannes Zillmann
|
a427806f68
|
Move width & height after x & y
|
2021-04-11 18:28:53 +02:00 |
|
Johannes Zillmann
|
6283ab7a96
|
Track evaluation score (optionally)
Makes it easier to see how a value got classified
|
2021-04-01 18:16:42 +02:00 |
|
Johannes Zillmann
|
898af7bbc8
|
Fix previous commit and re-use page mapping
|
2021-03-29 07:24:20 +02:00 |
|
Johannes Zillmann
|
388e8cc6b1
|
Find page mapping during statistics calculation
|
2021-03-28 23:45:26 +02:00 |
|
Johannes Zillmann
|
89d4bbd2f9
|
Cover globals in tests
|
2021-03-28 10:58:24 +02:00 |
|
Johannes Zillmann
|
4d1821f584
|
Qualify lines for removal based on multiple scores
|
2021-03-23 08:08:13 +01:00 |
|
Johannes Zillmann
|
c98145a63c
|
Test for remote PDFs
|
2021-03-22 09:03:26 +01:00 |
|