69 Commits

Author SHA1 Message Date
Johannes Zillmann
a427806f68 Move width & height after x & y 2021-04-11 18:28:53 +02:00
Johannes Zillmann
c8cfbebb92 Disable no used locals (for now) 2021-04-10 09:18:46 +02:00
Johannes Zillmann
ddac96299d Fix not used locals 2021-04-10 09:18:46 +02:00
Johannes Zillmann
642509a454 Refine repetitive character removal 2021-04-02 22:33:12 +02:00
Johannes Zillmann
6283ab7a96 Track evaluation score (optionally)
Makes it easier to see how a value got classified
2021-04-01 18:16:42 +02:00
Johannes Zillmann
d8fb3e0b24 Rename CalculateCoordinate to Unwrap... because that's what it really is 2021-03-31 10:08:05 +02:00
Johannes Zillmann
487b304c15 Move EvaluationIndex to debug package
(Not a 100% correct but somewhat more pleasing)
2021-03-31 10:00:21 +02:00
Johannes Zillmann
ce6d9f8984 Package refactoring: Move globals to root 2021-03-29 08:57:05 +02:00
Johannes Zillmann
71ef84153c Show page labels + default mapping to 1 2021-03-29 08:47:04 +02:00
Johannes Zillmann
898af7bbc8 Fix previous commit and re-use page mapping 2021-03-29 07:24:20 +02:00
Johannes Zillmann
388e8cc6b1 Find page mapping during statistics calculation 2021-03-28 23:45:26 +02:00
Johannes Zillmann
89d4bbd2f9 Cover globals in tests 2021-03-28 10:58:24 +02:00
Johannes Zillmann
d7d3502a25 Fix processing pdfs with no page numbers 2021-03-28 10:21:26 +02:00
Johannes Zillmann
202da9b005 Globals propagation infrastructure 2021-03-27 09:35:18 +01:00
Johannes Zillmann
21106d7e5e Lower min score since accuracy has increased 2021-03-26 09:02:31 +01:00
Johannes Zillmann
0b096faa0c More accurate page number detection 2021-03-26 08:42:31 +01:00
Johannes Zillmann
4340acb758 Simplify code 2021-03-24 23:08:36 +01:00
Johannes Zillmann
ab40466ca8 Filter out some impossible page numbers 2021-03-24 22:27:59 +01:00
Johannes Zillmann
a6a21c9ed2 simplify code (and keep information) through flattening page lines 2021-03-24 07:45:19 +01:00
Johannes Zillmann
4c77274d16 Fix tests 2021-03-23 08:46:14 +01:00
Johannes Zillmann
4d1821f584 Qualify lines for removal based on multiple scores 2021-03-23 08:08:13 +01:00
Johannes Zillmann
0be95e4bbc Track evaluations 2021-03-23 07:25:17 +01:00
Johannes Zillmann
c98145a63c Test for remote PDFS 2021-03-22 09:03:26 +01:00
Johannes Zillmann
f5a180113d No unused locals 2021-03-21 08:39:42 +01:00
Johannes Zillmann
17290cf746 Improve removal
* Always compare in one direction
2021-03-21 08:09:41 +01:00
Johannes Zillmann
d7202e7542 Add forgotten line 2021-03-20 19:04:15 +01:00
Johannes Zillmann
65e17f2c4a Fix compilation caused by jest-file-snapshot problem 2021-03-20 19:01:30 +01:00
Johannes Zillmann
68c4d9a4a3 Consolidate repetitive element eviction
* Solely rely on neighbour similarity
* Cut out `y` in the middle
2021-03-16 07:02:31 +01:00
Johannes Zillmann
f42358d63b Remove empty items 2021-03-16 05:50:57 +01:00
Johannes Zillmann
5af033c0f1 Round and limit y 2021-03-15 20:37:41 +01:00
Johannes Zillmann
a90e6207dc Add similarity checks to repetitive element removal 2021-03-15 09:16:50 +01:00
Johannes Zillmann
9bd5043f2e Very basic removal of repetitive elements 2021-03-14 12:15:37 +01:00
Johannes Zillmann
77b7d837eb Improve change detection to handle removal case properly 2021-03-14 11:59:46 +01:00
Johannes Zillmann
d5523fb1d4 Split result files
* Due to the 100 MB limit of GitHub
2021-03-13 22:46:10 +01:00
Johannes Zillmann
713a82b41d Stabilize font display in tests
* If multiple PDFs are tested one after another, their font ids change (e.g. `g_d0_f1` becomes `g_d1_f1`)
2021-03-13 19:38:47 +01:00
Johannes Zillmann
417cc2ab94 Add Test infrastructure for example PDFs 2021-03-13 08:46:22 +01:00
Johannes Zillmann
45355a9315 PageControls 2021-03-09 08:44:06 +01:00
Johannes Zillmann
c60bd3f737 Un-Grouping switch 2021-03-01 23:42:02 +01:00
Johannes Zillmann
163d34261a Display Font Tooltip 2021-02-28 12:56:23 +01:00
Johannes Zillmann
37c50be4ca stage description tooltip 2021-02-28 10:23:59 +01:00
Johannes Zillmann
a99b031bc6 ShowAll Marker for transformer stages 2021-02-28 02:18:47 +01:00
Johannes Zillmann
e7574513c5 Change detection on group and item level 2021-02-28 02:07:45 +01:00
Johannes Zillmann
229cb53eb0 Make LineItemMerger standalone and re-usable 2021-02-27 18:45:14 +01:00
Johannes Zillmann
cd8cdf4df6 Highlight changes 2021-02-27 09:51:04 +01:00
Johannes Zillmann
915827be0c Sort line items on X axis 2021-02-26 21:42:26 +01:00
Johannes Zillmann
08509953dc Fix line compaction for multi-columnar PDFs 2021-02-26 19:28:44 +01:00
Johannes Zillmann
6e5e5c9d53 Improve line compaction 2021-02-26 18:04:50 +01:00
Johannes Zillmann
0910f7b148 Grouping of line items 2021-02-21 13:23:31 +01:00
Johannes Zillmann
d8bc6d100b Cleanup & simple line detection 2021-02-21 08:23:51 +01:00
Johannes Zillmann
71fb6a23ff Cleanup 2021-02-20 19:36:43 +01:00