mirror of
https://github.com/jzillmann/pdf-to-markdown.git
synced 2025-02-08 05:49:29 +01:00
A PDF to Markdown converter
- Convert the `example PDFs` with the old `pdf-to-markdown` and write them to text files
- Compare the text files with the conversion of the current code
- Next:
  - Improve the current code to match good conversions of the old code
  - Adapt the text files in case the current conversion is better than the old
- Current tests are breaking
PDF-To-Markdown Converter
JavaScript library to convert PDF files into Markdown text. Online at http://pdf2md.morethan.io.
Major Changes
- Apr 2017 - 0.1: Initial Release
Use
//TBD
Contribute
Use the issue tracker and/or open pull requests!
Useful Build Commands
- `npm install` Download all necessary npm packages
- `npm test` Run the tests
- `npm test -- --verbose=false './test/Files\.test\.ts' -t "Alice-In-Wonderland.pdf"` Run a specific test
- `npm run test-write` Run the tests and persist possibly new changes on the example file results
- `npm run lint` Lint the JavaScript files
- `npm run format` Run the prettier formatter
- `npm run build` Compile the TypeScript files to the `lib` folder
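The idea behind `npm run test-write` (compare each conversion with a stored example result, and persist the new result when asked to) can be illustrated with a small sketch. This is not the repository's test code: `checkAgainstStored`, its parameters, and the temporary file name are hypothetical and only show the general pattern.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper, not part of this repository: compares a conversion
// result with a stored text file and optionally rewrites that file.
function checkAgainstStored(actual: string, storedFile: string, write: boolean): boolean {
  const stored = fs.existsSync(storedFile) ? fs.readFileSync(storedFile, 'utf8') : undefined;
  if (stored === actual) {
    return true; // conversion is unchanged
  }
  if (write) {
    fs.writeFileSync(storedFile, actual, 'utf8'); // persist the new result
    return true;
  }
  return false; // mismatch, and nothing was written
}

// Usage with a temporary file standing in for an example result
const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'pdf2md-')), 'Example.md');
console.log(checkAgainstStored('# Title', file, false)); // false: nothing stored yet
console.log(checkAgainstStored('# Title', file, true)); // true: result written
console.log(checkAgainstStored('# Title', file, false)); // true: matches the stored text
```

In a write mode like this, stored files should be reviewed before committing, since a worse conversion would be persisted just as readily as a better one.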
Release
//TBD
Test Release locally and use in other projects
- `npm link` in the core project
- `npm link pdf-to-markdown-core` in the target project
Credits
pdf.js - Mozilla's PDF parsing & rendering platform which is used as a raw parser