An MCP server that gives AI agents tools to read, search, and extract figures from PDF files. Built in Rust on MuPDF for speed.
```sh
brew tap maxhodak/pdf-mcp https://github.com/maxhodak/pdf-mcp
brew install pdf-mcp
```
Building from source requires a C compiler (Xcode CLI tools on macOS, build-essential on Linux):

```sh
cargo install --path .
```
Prebuilt binaries for macOS (x86_64, aarch64) and Linux (x86_64) are available from Releases.
```json
{
  "mcpServers": {
    "pdf": {
      "command": "pdf-mcp"
    }
  }
}
```

Every tool takes a `path` argument pointing to any PDF on disk. The server caches the last opened document, so repeated calls to the same file skip re-parsing.
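The caching behaviour described above amounts to a one-entry cache keyed by path. The Python sketch below illustrates the idea only: the real server is written in Rust, and `SingleDocumentCache` and its `loader` callable are hypothetical names, not part of pdf-mcp.

```python
from typing import Callable, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class SingleDocumentCache(Generic[T]):
    """Remember only the most recently opened document (hypothetical sketch)."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader  # parses a file; this is the expensive step
        self._entry: Optional[Tuple[str, T]] = None

    def get(self, path: str) -> T:
        # Reuse the parsed document when the same path is requested again.
        if self._entry is not None and self._entry[0] == path:
            return self._entry[1]
        # A different path replaces the previously cached document.
        doc = self._loader(path)
        self._entry = (path, doc)
        return doc
```

If the server behaves like this sketch, alternating calls between two files would re-parse on every switch, so grouping calls by document is likely cheaper.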
All coordinates are in PDF points (72pt = 1 inch, origin at top-left).
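Because one inch is 72 points, a length in points maps to pixels as `points * dpi / 72`. The helpers below are a small sketch of that arithmetic; the function names are illustrative, and how the server rounds fractional pixel sizes is not specified here.

```python
POINTS_PER_INCH = 72.0


def points_to_pixels(points: float, dpi: float) -> float:
    """Convert a length in PDF points to pixels at the given resolution."""
    return points * dpi / POINTS_PER_INCH


def pixels_to_points(pixels: float, dpi: float) -> float:
    """Convert a pixel length back to PDF points."""
    return pixels * POINTS_PER_INCH / dpi


# US Letter is 612 x 792 pt (8.5 x 11 in).
width_px = points_to_pixels(612, 150)   # 1275.0
height_px = points_to_pixels(792, 150)  # 1650.0
```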
Returns page count, title, author, subject, keywords, creator, producer, and per-page dimensions.
Extracts all text from a page.
| Param | Required | Description |
|---|---|---|
| `page` | yes | Page number (0-indexed) |
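MCP clients normally build tool calls for you, but it can help to see what goes over the wire. The sketch below serializes a `tools/call` request in JSON-RPC 2.0 form. It assumes this tool is the `get_page_text` named in the workflow list in this README; `tool_call_request` is a hypothetical helper and `/tmp/paper.pdf` is a placeholder path.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Pages are 0-indexed, so page 0 is the first page of the document.
request = tool_call_request(1, "get_page_text", {"path": "/tmp/paper.pdf", "page": 0})
```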
Finds all occurrences of a string and returns bounding boxes.
| Param | Required | Description |
|---|---|---|
| `query` | yes | Text to search for |
| `page` | no | Restrict to a single page |
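A search hit is often the starting point for a crop, for example to render a caption together with its surroundings. The sketch below grows a bounding box by a margin and clamps it to the page. It assumes a hit can be read as `(x0, y0, x1, y1)` in PDF points; the exact response fields are not documented here, and `pad_box` is a hypothetical helper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points


def pad_box(box: Box, margin: float, page_width: float, page_height: float) -> Box:
    """Grow a box by `margin` points on every side, clamped to the page."""
    x0, y0, x1, y1 = box
    return (
        max(0.0, x0 - margin),
        max(0.0, y0 - margin),
        min(page_width, x1 + margin),
        min(page_height, y1 + margin),
    )


# A made-up hit near the top-left corner of a 612 x 792 pt page.
region = pad_box((10.0, 5.0, 120.0, 18.0), 12.0, 612.0, 792.0)
# region == (0.0, 0.0, 132.0, 30.0)
```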
Renders a page or region as PNG/JPEG. Returns base64 inline or writes to disk.
| Param | Required | Description |
|---|---|---|
| `page` | yes | Page number (0-indexed) |
| `dpi` | no | Render resolution (default 150) |
| `x0`, `y0`, `x1`, `y1` | no | ROI crop in PDF points. Omit for full page. |
| `width` | no | Resize to this width (preserves aspect ratio) |
| `height` | no | Resize to this height (preserves aspect ratio) |
| `format` | no | `"png"` (default) or `"jpeg"` |
| `quality` | no | JPEG quality 1-100 (default 80) |
| `output_path` | no | Write to this file instead of returning inline |
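With this many optional parameters, it can be convenient to assemble the arguments in one place and catch obvious mistakes before calling the tool. The sketch below is one way to do that on the client side. `render_arguments` is a hypothetical helper, and the checks (including `x0 < x1` and `y0 < y1`) are assumptions about sensible input, not documented server behaviour.

```python
from typing import Any, Dict, Optional, Tuple


def render_arguments(
    path: str,
    page: int,
    region: Optional[Tuple[float, float, float, float]] = None,
    dpi: int = 150,
    image_format: str = "png",
    quality: int = 80,
    output_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble an arguments dict for `render`, rejecting out-of-range values."""
    if page < 0:
        raise ValueError("page is 0-indexed and cannot be negative")
    if image_format not in ("png", "jpeg"):
        raise ValueError('format must be "png" or "jpeg"')
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")

    args: Dict[str, Any] = {"path": path, "page": page, "dpi": dpi, "format": image_format}
    if image_format == "jpeg":
        args["quality"] = quality  # only meaningful for JPEG output
    if region is not None:
        x0, y0, x1, y1 = region
        if x1 <= x0 or y1 <= y0:
            raise ValueError("region must satisfy x0 < x1 and y0 < y1")
        args.update({"x0": x0, "y0": y0, "x1": x1, "y1": y1})
    if output_path is not None:
        args["output_path"] = output_path
    return args


# Crop a region of page 3 and save it as a JPEG; both paths are placeholders.
args = render_arguments(
    "/tmp/paper.pdf", 3, region=(72, 100, 540, 400),
    image_format="jpeg", output_path="/tmp/fig.jpg",
)
```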
When output_path is set, the response is a JSON summary:
```json
{"path": "/tmp/fig.jpg", "format": "jpeg", "width": 1200, "height": 800, "bytes": 94521}
```

Returns text and image blocks with bounding boxes. Useful for understanding page layout before extracting specific regions.
| Param | Required | Description |
|---|---|---|
| `page` | yes | Page number (0-indexed) |
Extracts text from a rectangular region of a page.
| Param | Required | Description |
|---|---|---|
| `page` | yes | Page number (0-indexed) |
| `x0`, `y0`, `x1`, `y1` | yes | Region bounds in PDF points |
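Region bounds here use the top-left origin stated earlier, whereas the default PDF user space puts the origin at the bottom-left with y increasing upward. If a box comes from a tool that reports bottom-left coordinates, it needs a vertical flip first. The sketch below shows that conversion; `flip_box_to_top_left` is a hypothetical helper, and it assumes the input box has `y0 < y1`.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points


def flip_box_to_top_left(box: Box, page_height: float) -> Box:
    """Convert a box from bottom-left-origin to top-left-origin coordinates."""
    x0, y0, x1, y1 = box
    # In bottom-left space y1 is the upper edge, so it becomes the smaller
    # y value once distances are measured from the top of the page.
    return (x0, page_height - y1, x1, page_height - y0)


# A box one inch below the top edge of a 792 pt tall page.
region = flip_box_to_top_left((72.0, 692.0, 300.0, 720.0), 792.0)
# region == (72.0, 72.0, 300.0, 100.0)
```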
Returns the table of contents / bookmarks as a tree with page numbers.
Returns all hyperlinks on a page with bounding boxes and URIs.
| Param | Required | Description |
|---|---|---|
| `page` | yes | Page number (0-indexed) |
Returns a 1D luminance profile along the vertical or horizontal axis. Each value (0-255) maps to one PDF point. Use it to find content bounds and whitespace margins for smart cropping.
| Param | Required | Description |
|---|---|---|
| `page` | yes | Page number (0-indexed) |
| `axis` | yes | `"vertical"` (top-to-bottom) or `"horizontal"` (left-to-right) |
| `offset` | no | Sample at this perpendicular offset in PDF points |
| `band_width` | no | Width of band to average around offset (default 10 if offset given) |
| `threshold` | no | Luminance below this = content (default 250) |
Returns the profile array plus auto-detected content_start and content_end in PDF points.
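To make the threshold rule concrete, the sketch below recomputes content bounds from a profile array: a sample counts as content when its luminance is below the threshold, and the bounds are the first and last such samples. This is an illustration of the rule as described, not the server's code. `content_bounds` is a hypothetical helper, and whether the server's `content_end` is inclusive is an assumption.

```python
from typing import List, Optional, Tuple


def content_bounds(profile: List[int], threshold: int = 250) -> Optional[Tuple[int, int]]:
    """Return (start, end) indices of the span containing content, or None if blank.

    A sample counts as content when its luminance is below `threshold`.
    Both indices are inclusive.
    """
    content = [i for i, value in enumerate(profile) if value < threshold]
    if not content:
        return None
    return content[0], content[-1]


# Each entry stands for one PDF point; 255 is pure white.
profile = [255, 255, 254, 120, 30, 255, 200, 255, 255]
bounds = content_bounds(profile)  # (3, 6)
```

Since each value maps to one PDF point, the indices can be used directly as crop coordinates along the sampled axis.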
1. `get_info`: learn page count and dimensions
2. `get_outline`: understand document structure
3. `get_page_text`: read content
4. `search`: find specific text, get bounding boxes
5. `get_text_blocks`: understand layout, find figures/tables
6. `render`: inspect a region visually or extract a figure
7. `render` with `output_path`: save the final image to disk
AGPL-3.0 (inherited from MuPDF)