Mirror of https://github.com/jzillmann/pdf-to-markdown.git (synced 2024-12-26 16:28:53 +01:00)
A PDF to Markdown converter
PDF-To-Markdown Converter
JavaScript library to convert PDF files into Markdown text. Online at http://pdf2md.morethan.io.
Major Changes
- Apr 2017 - 0.1: Initial Release
Use
//TBD
Contribute
Use the issue tracker and/or open pull requests!
Useful Build Commands
- npm install: Download all necessary npm packages
- npm test: Run the tests
- npm test -- --verbose=false './test/Files\.test\.ts' -t "Alice-In-Wonderland.pdf": Run a specific test
- npm run test-write: Run the tests and persist possibly new changes on the example file results
- npm run lint: Lint the JavaScript files
- npm run format: Run the Prettier formatter
- npm run build: Compile the TypeScript files to the lib folder
Release
//TBD
Test Release locally and use in other projects
- npm link (in the core project)
- npm link pdf-to-markdown-core (in the target project)
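The two linking steps above can be wrapped in a small shell helper. This is only a sketch: the function name link_local_core and both directory arguments are hypothetical placeholders, not part of the project. It also assumes the package name is pdf-to-markdown-core, as used in the step above. If the package's entry point lives in the compiled lib folder (an assumption), you will probably want to run npm run build in the core project first.

```shell
# Hypothetical helper (not part of the project): link a local checkout of the
# core project into another project for local testing.
#   $1 = path to the core project checkout (placeholder)
#   $2 = path to the target project (placeholder)
link_local_core() {
  core_dir=$1
  target_dir=$2

  # Step 1: register the core project as a globally linked package.
  (cd "$core_dir" && npm link) || return 1

  # Step 2: symlink that package into the target project's node_modules.
  (cd "$target_dir" && npm link pdf-to-markdown-core) || return 1
}
```

For example, link_local_core ~/code/pdf-to-markdown ~/code/my-app would run both steps in order, where both paths stand in for your own directories. Each step runs in a subshell, so your current working directory is left unchanged.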
Credits
pdf.js - Mozilla's PDF parsing & rendering platform which is used as a raw parser