Skip to content

Transform large markdown files into hierarchical folder structures for better navigation and AI-assisted editing.

Notifications You must be signed in to change notification settings

ergut/md-hierarchy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

md-hierarchy logo

md-hierarchy

A CLI tool that splits markdown files into hierarchical folder structures based on heading levels, and can reconstruct the original markdown from the split pieces.

Features

  • Split markdown files into navigable folder hierarchies
  • Merge folder structures back into single markdown files
  • Preserves all markdown elements (code blocks, lists, tables, links, etc.)
  • Handles edge cases (duplicate headings, empty headings, skipped levels)
  • Round-trip compatible (split → merge produces equivalent content)
  • Dry-run mode to preview operations

Installation

# From PyPI
pip install md-hierarchy

# From source
pip install -e .

# With development dependencies
pip install -e ".[dev]"

Usage

Split Command

Split a markdown file into a hierarchical folder structure:

md-hierarchy split input.md output_dir --level 3

Options:

  • --level, -l: Heading level to extract as files (1-4, default: 3)
  • --overwrite: Overwrite output directory if it exists
  • --verbose, -v: Print detailed operation log
  • --dry-run: Show what would be done without writing files

Example:

# Split at level 3 (H3 headings become files)
md-hierarchy split proposal.md ./output --level 3

# Split with overwrite
md-hierarchy split proposal.md ./output --level 2 --overwrite

# Preview without creating files
md-hierarchy split proposal.md ./output --dry-run

Merge Command

Merge a folder structure back into a single markdown file:

md-hierarchy merge input_dir output.md

Options:

  • --verbose, -v: Print detailed operation log

Example:

# Merge folder structure
md-hierarchy merge ./output merged.md

# Merge with verbose output
md-hierarchy merge ./split-docs final.md --verbose

Output Structure

When splitting at level 3, the tool creates this structure:

output-dir/
├── 00-__frontmatter__.md            # Content before first heading (if exists)
├── 01-Introduction/
│   ├── 00-__intro__.md              # H1 heading + intro content (always created)
│   ├── 01-Background/
│   │   ├── 00-__intro__.md          # H2 heading + intro content (always created)
│   │   ├── 01-Problem-Statement.md  # H3 section
│   │   └── 02-Research-Gap.md       # H3 section
│   └── 02-Objectives/
│       ├── 00-__intro__.md          # H2 heading (even if no intro content)
│       └── 01-Primary-Goals.md
└── 02-Methodology/
    └── 00-__intro__.md              # H1 heading + content

File Naming Convention

  • Folders: NN-Sanitized-Title/ (e.g., 01-Introduction/)
  • Intro files: 00-__intro__.md (always created for every heading folder)
  • Frontmatter: 00-__frontmatter__.md (at root, only if content exists before first heading)
  • Section files: NN-Sanitized-Title.md (e.g., 01-Problem-Statement.md)
  • Numbers are zero-padded (01, 02, ..., 99)
  • Special characters (/ \ : * ? " < > |) are removed
  • Spaces are replaced with hyphens
  • Maximum length: 50 characters

Key Design Decisions

  • 00-__intro__.md is always created for every heading folder, even if empty
    • This provides a consistent structure and an easy place to add intro text later
    • Contains the heading declaration and any content before child sections
  • The 00- prefix ensures intro files sort first in directory listings
  • The __intro__ naming (double underscore) clearly marks these as special/meta files
  • Frontmatter files are created at the root only when pre-heading content exists

Edge Cases Handled

  1. Empty headingsUntitled-Section-N
  2. Duplicate titles → Append -2, -3, etc.
  3. Skipped levels (H1 → H3) → Insert 00-Content/ folder
  4. Content before first heading00-__frontmatter__.md at root
  5. Heading attributes (e.g., {#id .class}) → Preserved in content
  6. Headings with no intro content00-__intro__.md still created (with just the heading)

Round-Trip Compatibility

The tool is designed for round-trip operations:

# Split
md-hierarchy split original.md ./split --level 3

# Merge
md-hierarchy merge ./split reconstructed.md

# Content should be equivalent
diff original.md reconstructed.md

Development

Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install in development mode
pip install -e ".[dev]"

Run Tests

pytest

Run Tests with Coverage

pytest --cov=md_hierarchy --cov-report=html

Requirements

  • Python 3.8+
  • Dependencies:
    • markdown-it-py - Markdown parsing
    • click - CLI framework

License

MIT

About

Transform large markdown files into hierarchical folder structures for better navigation and AI-assisted editing.

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages