Are scanned PDFs too painful? pdf-craft converts them to Markdown/EPUB in seconds and automatically generates the table of contents, notes, and citation alignment

Written by
Silas Grey
Updated on: June 28, 2025
Recommendation

PDF-Craft converts PDF files to Markdown or EPUB with one click, automatically generating the table of contents, annotations, and citation alignment.

Core content:
1. Introduction and installation of PDF-Craft
2. Convert PDF files to Markdown format
3. Convert PDF files to EPUB format

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)



PDF-Craft in Action

  • Convert PDF to Markdown
from pdf_craft import PDFPageExtractor, MarkDownWriter

extractor = PDFPageExtractor(
  device="cpu",  # To use CUDA, change this to the device="cuda:0" form.
  model_dir_path="/path/to/model/dir/path",  # Folder where the AI models are downloaded and installed
)

markdown_path = "/path/to/output/file.md"  # Placeholder: where the Markdown file will be written

with MarkDownWriter(markdown_path, "images", "utf-8") as md:
  for block in extractor.extract(pdf="/path/to/pdf/file"):
    md.write(block)

If the original PDF contains figures (or tables or formulas), an image directory (named assets in the project's description) is created at the same level as the *.md file, and the images are saved there.

The Markdown file references the images in that directory by relative path.
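As an illustration of what "relative addresses" means here, the following is a minimal sketch using only the Python standard library. The helper name image_reference and the example paths are hypothetical and are not part of pdf-craft; the sketch only shows how a link can be expressed relative to the Markdown file so that the *.md file and its image folder can be moved together.

```python
import os


def image_reference(markdown_path: str, image_path: str) -> str:
    """Return a Markdown image tag whose target is relative to the .md file.

    Hypothetical helper for illustration; not part of pdf-craft.
    """
    md_dir = os.path.dirname(os.path.abspath(markdown_path))
    rel = os.path.relpath(os.path.abspath(image_path), start=md_dir)
    # Markdown links use forward slashes regardless of platform.
    rel = rel.replace(os.sep, "/")
    return f"![]({rel})"


print(image_reference("/books/out/book.md", "/books/out/assets/fig1.png"))
```

Because the link does not contain the absolute location of the output folder, copying the folder elsewhere keeps the images resolvable.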

  • Convert PDF to EPUB

    First create a PDF extraction object

extractor = PDFPageExtractor(
  device="cpu",  # To use CUDA, change this to the device="cuda:0" form.
  model_dir_path="/path/to/model/dir/path",  # Folder where the AI models are downloaded and installed
)

Next, send the extracted content to an LLM for analysis, which prepares the data for the EPUB file:

from pdf_craft import analyse
from pdf_craft import LLM

llm = LLM(
  key="sk-XXXXX",  # Key provided by the LLM vendor
  url="https://api.deepseek.com",  # URL provided by the LLM vendor
  model="deepseek-chat",  # Model provided by the LLM vendor
  token_encoding="o200k_base",  # Local encoding used to estimate token counts (unrelated to the LLM; if you don't care, keep "o200k_base")
)

analyse(
  llm=llm,  # LLM configuration prepared in the previous step
  pdf_page_extractor=extractor,  # The PDFPageExtractor object created earlier
  pdf_path="/path/to/pdf/file",  # PDF file path
  analysing_dir_path="/path/to/analysing/dir",  # Folder for intermediate analysis state
  output_dir_path="/path/to/output/files",  # The analysis results are written to this folder
)
  • output_dir_path: the folder where the results of the scan and analysis (multiple files) are saved.

  • analysing_dir_path: the folder that stores intermediate state during the analysis.

  • After the analysis completes, the output_dir_path folder is passed as a parameter to the step that generates the EPUB file.
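Since the two folders serve different purposes, it can help to prepare them before starting a long run. The sketch below is a precaution under stated assumptions, not documented pdf-craft behaviour: the source does not say whether the library creates these folders itself or whether they may coincide, and the helper name prepare_dirs is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def prepare_dirs(analysing_dir_path: str, output_dir_path: str):
    """Create both folders and refuse to reuse one folder for both roles.

    Hypothetical helper; keeping the folders separate is an assumption made
    here so intermediate state and final results never mix.
    """
    analysing = Path(analysing_dir_path).resolve()
    output = Path(output_dir_path).resolve()
    if analysing == output:
        raise ValueError("analysing_dir_path and output_dir_path must differ")
    analysing.mkdir(parents=True, exist_ok=True)
    output.mkdir(parents=True, exist_ok=True)
    return analysing, output


# Demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    analysing, output = prepare_dirs(
        os.path.join(tmp, "analysing"), os.path.join(tmp, "output")
    )
    print(analysing.is_dir(), output.is_dir())
```

The returned paths can then be converted with str() and passed as analysing_dir_path and output_dir_path.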

Main features of PDF-Craft:

  • Converts PDF to Markdown using local AI models, with no internet connection required
  • Converts PDF to the structured EPUB e-book format
  • Identifies and filters out interfering elements such as headers, footers, footnotes, and page numbers
  • Processes charts and formulas automatically, keeping them in the converted file as images
  • Uses an LLM to build the book structure and generate an EPUB with a table of contents and chapters
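To make the filtering idea concrete, here is a minimal sketch of dropping interfering elements once each block has been labelled. The Block type, the label strings, and the drop_noise function are all hypothetical and do not reflect pdf-craft's internal data model; they only illustrate filtering by block kind.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str  # hypothetical label, e.g. "paragraph", "header", "page_number"
    text: str


# Kinds treated as interference, following the feature list above.
NOISE_KINDS = {"header", "footer", "footnote", "page_number"}


def drop_noise(blocks):
    """Keep only the blocks that belong to the main text flow."""
    return [b for b in blocks if b.kind not in NOISE_KINDS]


page = [
    Block("header", "Chapter 3"),
    Block("title", "Layout Analysis"),
    Block("paragraph", "Pages are first rendered as images."),
    Block("page_number", "47"),
]
print([b.kind for b in drop_noise(page)])
```

Filtering by label keeps the rule simple: the quality of the result then depends entirely on how well the layout model labels each block.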

PDF-Craft conversion logic

First, render each PDF page as an image.

Second, use DocLayout-YOLO to identify the block elements in each image, including headers, footers, paragraphs, titles, pictures, tables, charts, and page numbers.

Then, use layoutreader to sort the blocks
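To show what "sorting the blocks" has to achieve, the sketch below orders bounding boxes top to bottom, then left to right. This is a naive geometric stand-in and not what layoutreader does: layoutreader is a learned model, whereas this rule is only reasonable for simple single-column pages. The Box type and its field names are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x0: float  # left edge
    y0: float  # top edge (y grows downward, as in image coordinates)
    x1: float  # right edge
    y1: float  # bottom edge
    label: str


def naive_reading_order(boxes):
    """Sort boxes by top edge, breaking ties by left edge."""
    return sorted(boxes, key=lambda b: (b.y0, b.x0))


boxes = [
    Box(50, 300, 550, 400, "para2"),
    Box(50, 40, 550, 90, "title"),
    Box(50, 120, 550, 280, "para1"),
]
print([b.label for b in naive_reading_order(boxes)])
```

On a two-column page this rule would interleave the columns, which is the kind of case a dedicated reading-order model is meant to handle.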

Next, use OnnxOCR to recognize the text in the block

Finally, the text recognized by OCR is sent to DeepSeek, which uses specific information (such as the table of contents) to construct the structure of the book, and an EPUB file with a table of contents and chapters is generated.

During this parsing and building process, the notes and reference information on each page are read by the LLM and then presented in a new format in the EPUB file.
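The conversion logic described in this section can be pictured as a chain of stages. The sketch below wires such a chain together with placeholder callables; the function convert and its parameters are hypothetical, and in the real project the stages correspond to DocLayout-YOLO (detection), layoutreader (ordering), OnnxOCR (text recognition), and an LLM such as DeepSeek (structure building). It does not call any of those libraries.

```python
from typing import Callable, Iterable


def convert(
    pages: Iterable[bytes],
    detect: Callable,     # page image -> list of blocks
    order: Callable,      # list of blocks -> blocks in reading order
    ocr: Callable,        # (page image, block) -> recognized text
    structure: Callable,  # list of texts -> final book structure
):
    """Run the stages in the order described in the text."""
    texts = []
    for image in pages:
        blocks = detect(image)
        for block in order(blocks):
            texts.append(ocr(image, block))
    return structure(texts)


# Stand-in stages so the sketch runs without any models.
result = convert(
    pages=[b"p1", b"p2"],
    detect=lambda image: [2, 1],
    order=sorted,
    ocr=lambda image, block: f"{image.decode()}-{block}",
    structure=lambda texts: " ".join(texts),
)
print(result)
```

Passing the stages in as callables keeps each step replaceable, which mirrors how the article describes separate tools handling detection, ordering, OCR, and structuring.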