pdf-to-markdown/examples/Made-with-cc
Johannes Zillmann 7abafc61e7 Improve word boundary detection
- sometimes a single word is split across multiple items, e.g. "T his is a sen tence"
- use the x-axis distance between items to avoid inserting whitespace in the middle of a word
- also tweak the line detection a bit (for Alice)
2024-05-20 00:22:24 -06:00
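
The word-boundary idea described in the commit message can be illustrated with a minimal sketch. This is not the project's actual code: `TextItem`, `join_items`, and the `gap_ratio` threshold are hypothetical names and values chosen for illustration, and the sketch assumes each item exposes its left x-coordinate and rendered width.

```python
from dataclasses import dataclass


@dataclass
class TextItem:
    """Hypothetical text fragment on a single line of a PDF page."""
    text: str
    x: float      # left edge of the fragment
    width: float  # rendered width of the fragment


def join_items(items, gap_ratio=0.25):
    """Join fragments of one line, adding a space only across a wide x-gap.

    gap_ratio is an assumed tuning value: a space is inserted when the gap
    exceeds this fraction of the previous fragment's average character width.
    """
    out = ""
    prev = None
    for item in sorted(items, key=lambda it: it.x):
        if prev is not None:
            gap = item.x - (prev.x + prev.width)
            avg_char_width = prev.width / max(len(prev.text), 1)
            wide_gap = gap > gap_ratio * avg_char_width
            already_spaced = out.endswith(" ") or item.text.startswith(" ")
            if wide_gap and not already_spaced:
                out += " "
        out += item.text
        prev = item
    return out


line = [
    TextItem("T", 0, 6),
    TextItem("his", 6, 15),    # touches "T": same word
    TextItem("is", 24, 9),     # gap of 3 units: new word
    TextItem("a", 36, 5),
    TextItem("sen", 44, 15),
    TextItem("tence", 59, 25), # touches "sen": same word
]
print(join_items(line))
```

Scaling the threshold by the previous fragment's character width, rather than using a fixed distance, is one way to keep the heuristic independent of font size; a real implementation may well use a different criterion.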
adjustHeights.json Add 6 more test PDFs 2021-04-18 11:34:11 +02:00
calculateStatistics.json Improve header detection 2024-03-28 11:39:34 -06:00
compactLines.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectBlocks.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectCodeBlocks.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectFontStyles.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectFootnotes.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectHeaders.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectLinks.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectListItems.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectListLevels.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
detectTOC.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
removeEmptyItems.json Add 6 more test PDFs 2021-04-18 11:34:11 +02:00
removeRepetitiveItems.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
sortbyX.json Improve word boundary detection 2024-05-20 00:22:24 -06:00
unwrapCoordinates.json Add 6 more test PDFs 2021-04-18 11:34:11 +02:00