diff --git a/KNOWN_ISSUES.md b/KNOWN_ISSUES.md
index 8b0a909..80ff5f2 100644
--- a/KNOWN_ISSUES.md
+++ b/KNOWN_ISSUES.md
@@ -12,15 +12,33 @@ The interesting thing is that rendering with pdfjs (online) looks good. So maybe
 - CC-NC_Leitfaden.pdf: un-verified toc entries (and/und/&... etc...)
 - Closed-Syllables.pdf: unverified toc entries
 - Safe-Communication.pdf: One toc element is one page off (8=>9)
+- no page numbers [The-Art-of-Public-Speaking](examples/The-Art-of-Public-Speaking.pdf).
+- multiline headlines: [WoodUp](examples/WoodUp.pdf)
+- Detecting list of figures (and creating headlines) [Achieving-The-Paris-Climate-Agreement](Achieving-The-Paris-Climate-Agreement.pdf)
 
 ## Not yet reviewed test PDFS
 
-- Achieving-The-Paris-Climate-Agreement.pdf
-  - wrong page page mapping ?
-  - no page numbers removed
-  - no toc
-- Made-with-cc.pdf
-  - no toc
-- Watered-Soul-Blog-Book.pdf
-  - TOC: character minumum cuts out year
-  - TOC: stops to early
+# Achieving-The-Paris-Climate-Agreement.pdf
+
+- wrong page page mapping ?
+- no page numbers removed
+- no toc
+- romisch numbers are wrong
+- subheading under the toc headings should be detected as well (clearly not in the code)
+
+# Sherlock
+
+- words not together
+
+# Made-with-cc.pdf
+
+- no toc
+
+# Watered-Soul-Blog-Book.pdf
+
+- TOC: character minumum cuts out year
+- TOC: stops to early
+
+# Life of God in Soul of man
+
+- Headlines confusion (after the headline the first words of a sentence are big... shouldn't be a headline in this case... looks at all heights in the line)
diff --git a/examples/KNOWN_ISSUES.md b/examples/KNOWN_ISSUES.md
deleted file mode 100644
index d5b66be..0000000
--- a/examples/KNOWN_ISSUES.md
+++ /dev/null
@@ -1,14 +0,0 @@
-# Known Issues
-
-## Missing or wrong characters
-
-The text which comes of pdfjs looks very erronous sometimes. E.g [Life-Of-God-In-Soul-Of-Man](examples/Life-Of-God-In-Soul-Of-Man.pdf).
-The interesting thing is that rendering with pdfjs (online) looks good (but copying the text shows the same distortion). So maybe this is just a setup problem !?
-
-## Uncovered TOC variants
-
-- out of order items [Safe-Communication](examples/Safe-Communication.pdf)
-- items in wrong lines + numbers are not numbers [Life-Of-God-In-Soul-Of-Man](examples/Life-Of-God-In-Soul-Of-Man.pdf)
-- no page numbers [The-Art-of-Public-Speaking](examples/The-Art-of-Public-Speaking.pdf).
-- multiline headlines: [WoodUp](examples/WoodUp.pdf)
-- Detecting list of figures (and creating headlines) [Achieving-The-Paris-Climate-Agreement](Achieving-The-Paris-Climate-Agreement.pdf)