PDF-To-Markdown Converter

A JavaScript library that converts PDF files into Markdown text. Available online at http://pdf2md.morethan.io.

Major Changes

  • Apr 2017 - 0.1: Initial Release

Use

//TBD

Contribute

Use the issue tracker and/or open pull requests!

Useful Build Commands

  • npm install: Download all necessary npm packages
  • npm test: Run the tests
  • npm test -- --verbose=false './test/Files\.test\.ts' -t "Alice-In-Wonderland.pdf": Run a specific test
  • npm run test-write: Run the tests and write any changed results back to the example result files
  • npm run lint: Lint the JavaScript files
  • npm run format: Run the Prettier formatter
  • npm run build: Compile the TypeScript files to the lib folder

Release

//TBD

Test a Release Locally and Use It in Other Projects

  • Run npm link in the core project
  • Run npm link pdf-to-markdown-core in the target project
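
The two linking steps can be wrapped in a small shell function. This is only a sketch: the function name link_core, its two directory arguments, and the NPM override variable are illustrative assumptions and not part of this repository. Only the two npm link commands themselves come from the steps above.

```shell
#!/bin/sh
# Sketch: link a local checkout of the core package into another project.
# link_core, its arguments, and the NPM variable are hypothetical helpers.

link_core() {
  core_dir=$1     # path to the core project (assumed layout)
  target_dir=$2   # path to the project that should consume it
  npm_cmd=${NPM:-npm}   # set NPM=echo to preview the commands instead of running them

  # Step 1: register the core package as a global symlink.
  (cd "$core_dir" && "$npm_cmd" link) || return 1

  # Step 2: point the target project's dependency at that symlink.
  (cd "$target_dir" && "$npm_cmd" link pdf-to-markdown-core) || return 1
}
```

Calling link_core ./core ../my-app (both paths are placeholders) runs the two steps in order and stops if the first one fails. It is probably necessary to run npm run build in the core project first so that the lib folder exists, but that is an assumption rather than something this README states.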

Credits

pdf.js - Mozilla's PDF parsing and rendering platform, which is used as the raw parser