cszo/Epub2Chapters

Epub2Chapters

Convert EPUB books into clean, chapter-based, AI-friendly Markdown — right inside VS Code and Cursor.

Epub2Chapters turns any EPUB into a structured folder of per-chapter Markdown files, extracted images, and machine-readable indexes. The output is purpose-built for reading alongside an AI assistant: paste a paragraph, a whole chapter, or a cross-chapter range into chat and the AI will have the context it needs.

Features

  • Chapter-level splitting — one Markdown file per top-level TOC entry, even when the source EPUB packs everything into a single xhtml (anchor-aware slicing).
  • AI-ready frontmatter — each chapter starts with a YAML block (book, author, chapter, prev, next) so any snippet you copy carries its context.
  • In-book navigation — cross-references and footnotes are rewritten to the generated .md files; every chapter ends with a ← prev · 📖 TOC · next → footer.
  • Clean tables & figures — block-wrapped <p> cells are flattened to single-line GFM rows, colspan is preserved, and figure captions are merged into image alt text + italic captions.
  • Images extracted — all referenced images land under assets/images/ with relative links that preview correctly in the editor.
  • TOC pages: toc.md for humans, toc.json + book.json for tooling.
  • Pure-local, offline — no network calls, no telemetry. Your books stay on disk.

Requirements

  • VS Code (or Cursor) 1.85+
  • An EPUB 2 or EPUB 3 file (DRM-protected EPUBs are not supported)

Usage

  1. Open the Command Palette (⌘⇧P / Ctrl+Shift+P) and run “Epub2Chapters: Convert EPUB…”.
  2. Pick any .epub file, or right-click one in the Explorer and choose the same command.
  3. Wait for the progress notification — a typical 300-page technical book finishes in a second or two.
  4. Click Open TOC to jump straight to toc.md, or Open Folder to open the generated workspace in a new window.

Extension Settings

| Setting | Default | Description |
| --- | --- | --- |
| epub2chapters.outputLocation | alongside | Where to place the generated folder. alongside uses the EPUB's directory; workspace uses the current workspace root; ask prompts each time. |
| epub2chapters.includeImages | true | Extract and rewrite image references. Disable for text-only output. |
| epub2chapters.fileNamePattern | {index}-{slug} | Filename template for chapter files. Supports {index} and {slug}. |
| epub2chapters.splitLevel | 1 | TOC depth used when splitting. 1 = top-level chapters only (recommended). 2 = also split by second-level headings. |
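
To make the fileNamePattern setting concrete, here is a minimal sketch of how a template such as {index}-{slug} could be expanded. The helper names slugify and chapterFileName are hypothetical, and the three-digit zero padding and slug rules are assumptions inferred from the sample filenames in this README; the extension's real rules may differ.

```typescript
// Hypothetical helper: lower-case the title and collapse every run of
// non-alphanumeric characters into a single hyphen (assumed slug rule).
function slugify(title: string): string {
  return title
    .toLowerCase()
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining accents left by NFKD
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, ""); // trim leading/trailing hyphens
}

// Hypothetical helper: expand {index} and {slug} in the pattern and append ".md".
// The 3-digit zero padding is an assumption based on names like "001-...".
function chapterFileName(pattern: string, index: number, title: string): string {
  const values: Record<string, string> = {
    index: String(index).padStart(3, "0"),
    slug: slugify(title),
  };
  const base = pattern.replace(/\{(index|slug)\}/g, (_match, key: string) => values[key]);
  return `${base}.md`;
}
```

Under these assumptions, chapterFileName("{index}-{slug}", 10, "1 Introduction to agents and their world") yields 010-1-introduction-to-agents-and-their-world.md.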

Output Structure

Running Epub2Chapters on AI Agents in Action.epub produces:

AI Agents in Action/
├── book.json                    # Machine-readable book metadata + stats
├── toc.md                       # Human-readable table of contents
├── toc.json                     # Programmatic TOC tree
├── chapters/
│   ├── 001-ai-agents-in-action.md
│   ├── 002-copyright.md
│   ├── …
│   └── 023-index.md
└── assets/
    └── images/
        ├── 1-1.png
        └── …

Each chapter file looks like:

---
book: "AI Agents in Action"
author: "Micheal Lanham"
chapter: 10
total: 23
title: "1 Introduction to agents and their world"
prev: "009-about-the-cover-illustration.md"
next: "011-2-harnessing-the-power-of-large-language-models.md"
---
# 1 Introduction to agents and their world

### This chapter covers

- Defining the concept of agents
- …

(chapter body)

---

[About the cover illustration](009-about-the-cover-illustration.md) · [📖 TOC](../toc.md) · [2 Harnessing the power of large language models](011-2-harnessing-the-power-of-large-language-models.md)
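
For tooling that wants to read the frontmatter shown above, a minimal sketch might look like the following. It assumes the block only ever contains flat key: value lines as in the sample, so it is not a general YAML parser; parseFrontmatter is a hypothetical name, not part of the extension.

```typescript
// Hypothetical helper: split a chapter file into its frontmatter fields and body.
// Only flat `key: value` lines are handled; nested YAML is out of scope.
function parseFrontmatter(markdown: string): { meta: Record<string, string>; body: string } {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  if (!match) {
    return { meta: {}, body: markdown }; // no frontmatter block found
  }
  const meta: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue; // skip lines that are not key: value
    const key = line.slice(0, sep).trim();
    const raw = line.slice(sep + 1).trim();
    meta[key] = raw.replace(/^"(.*)"$/, "$1"); // strip surrounding double quotes
  }
  return { meta, body: markdown.slice(match[0].length) };
}
```

All values come back as strings, so a numeric field such as chapter would need an explicit Number(...) conversion.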

Reading With AI

Because every chapter ships with frontmatter plus relative links to its neighbours, you can drop any of these into a chat with an AI assistant and it will understand the context:

  • A single paragraph — the frontmatter above it tells the model which book and chapter you're in.
  • A whole chapter file — great for summaries, Q&A, and follow-up questions.
  • A range of chapters — ask for cross-chapter comparisons or a study plan.
  • toc.md — for book-level questions ("give me a 10-minute overview").

Cursor / Copilot / Windsurf agents with filesystem access can also walk the folder and reason over the whole book.
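
As an illustration of how a script or agent could gather "a range of chapters", the sketch below follows the next field in each chapter's frontmatter from a starting file to an ending file. It works on an in-memory map of filename to file contents so it stays self-contained; nextChapter and collectRange are hypothetical names, and the approach assumes the next values are filenames relative to the chapters/ folder, as in the sample above.

```typescript
// Hypothetical helper: read the `next` field from a chapter's frontmatter block.
function nextChapter(markdown: string): string | undefined {
  const front = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!front) return undefined;
  const m = /^next:\s*"?([^"\r\n]+?)"?\s*$/m.exec(front[1]);
  return m ? m[1] : undefined;
}

// Hypothetical helper: list chapter filenames from `first` to `last` (inclusive)
// by following `next` links. Stops early on a missing file or a link cycle.
function collectRange(files: Map<string, string>, first: string, last: string): string[] {
  const ordered: string[] = [];
  const seen = new Set<string>();
  let current: string | undefined = first;
  while (current !== undefined && files.has(current) && !seen.has(current)) {
    seen.add(current);
    ordered.push(current);
    if (current === last) break;
    current = nextChapter(files.get(current)!);
  }
  return ordered;
}
```

The returned filenames can then be read in order and concatenated into a single prompt, with each chapter's frontmatter left in place so the model can tell the chapters apart.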

Known Limitations

  • DRM-encrypted EPUBs are not supported.
  • Cross-chapter anchors (other.md#p8) open the right file but cannot auto-scroll to the paragraph — VS Code's Markdown preview only recognises heading slugs, not EPUB id attributes.
  • Complex tables with rowspan are kept as GFM single-line rows; row-merged cells are preserved only as column-count filler.
  • Footnote markers are rendered as ^1^ rather than proper Markdown footnote syntax.
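
If the ^1^ markers get in the way, a post-processing pass over the generated files is one possible workaround. The sketch below is not part of the extension: toMarkdownFootnotes is a hypothetical helper that assumes markers are purely numeric and only rewrites the inline references; it does not create the matching [^1]: definitions, and it would also rewrite any other ^digits^ sequence in the text.

```typescript
// Hypothetical post-processing step: turn ^1^ style markers into [^1] references.
// Footnote definitions are left untouched and must be handled separately.
function toMarkdownFootnotes(text: string): string {
  return text.replace(/\^(\d+)\^/g, "[^$1]");
}
```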

Development

pnpm install            # install dependencies
pnpm run build          # bundle extension with esbuild
pnpm test               # unit tests (mocha + ts-node)
pnpm run test:e2e       # VS Code-host end-to-end tests (downloads VS Code on first run)

Open the repository in VS Code and press F5 to launch the Extension Development Host. The scripts/smoke.ts helper runs the core library against any EPUB from the command line:

pnpm dlx ts-node scripts/smoke.ts path/to/book.epub

License

MIT
