From 165a12cfac017fa61ccd24215fcab58a49273eac Mon Sep 17 00:00:00 2001
From: calixteman
Date: Mon, 22 Dec 2025 23:56:57 +0100
Subject: [PATCH] Add CLAUDE.md generated with claude code

---
 AGENTS.md | 204 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 CLAUDE.md |   1 +
 2 files changed, 205 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100644 CLAUDE.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000000..acd7e480fc916
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,204 @@
+## Overview
+
+PDF.js is a Portable Document Format (PDF) viewer built with JavaScript, HTML5 Canvas, and CSS. It's a Mozilla project that provides a general-purpose, web standards-based platform for parsing and rendering PDFs without requiring native code or plugins.
+
+## Common Commands
+
+### Development Server
+```bash
+npx gulp server
+```
+Then open http://localhost:8888/web/viewer.html to view the PDF viewer. Test PDFs are available at http://localhost:8888/test/pdfs/?frame
+
+### Building
+
+Build for modern browsers:
+```bash
+npx gulp generic
+```
+
+This generates `pdf.mjs` and `pdf.worker.mjs` in `build/generic/build/`.
+
+Build for distribution (creates pdfjs-dist package):
+```bash
+npx gulp dist
+npx gulp dist-install # Build and install locally
+```
+
+### Testing
+
+Run all tests:
+```bash
+npx gulp test
+```
+
+Run unit tests only:
+```bash
+npx gulp unittest
+```
+
+Run integration tests (browser-based tests using Puppeteer):
+```bash
+npx gulp integrationtest
+```
+
+Run font tests:
+```bash
+npx gulp fonttest
+```
+
+Run a single test file by modifying test/test_manifest.json or using test runner options.
+
+### Linting and Formatting
+
+Lint JavaScript:
+```bash
+npx gulp lint
+```
+
+Format code (uses Prettier and ESLint):
+```bash
+npx eslint --fix
+```
+
+### Type Checking
+
+Run TypeScript type checking:
+```bash
+npx gulp typestest
+```
+
+## Architecture
+
+### High-Level Structure
+
+PDF.js has a multi-layer architecture that separates concerns between PDF parsing, rendering, and UI:
+
+#### 1. Core Layer (`src/core/`)
+The core layer handles PDF parsing and interpretation. Key responsibilities:
+- **PDF parsing**: Parsing PDF structure, cross-reference tables, streams
+- **Font handling**: CFF, TrueType, Type1 font parsing and conversion (`font.js`, `fonts.js`, `cff_*.js`, `type1_*.js`)
+- **Image decoding**: JPEG, JBIG2, JPX/JPEG2000 decoders
+- **Operators**: Processing PDF drawing operators (`operator_list.js`, `evaluator.js`)
+- **XFA Forms**: XML Forms Architecture support (`src/core/xfa/`)
+- **Color spaces**: ICC profiles, device color spaces (`colorspace.js`, `icc_colorspace.js`)
+- Runs in a Web Worker for performance isolation
+
+Entry point: `src/pdf.worker.js`
+
+#### 2. Display Layer (`src/display/`)
+The display layer provides the API for rendering PDFs to canvas and managing documents. Key components:
+- **API**: Main public API (`api.js`) - `PDFDocumentProxy`, `PDFPageProxy`, `getDocument()`
+- **Canvas rendering**: Renders PDF operations to HTML5 canvas (`canvas.js`)
+- **Text layer**: Extracts and positions text for selection/search (`text_layer.js`)
+- **Annotation layer**: Renders and handles PDF annotations (`annotation_layer.js`)
+- **Editor layer**: Supports PDF editing (annotations, highlights, stamps) (`editor/`)
+- **Metadata**: Parses XMP metadata (`metadata.js`)
+- **Streams**: Handles PDF data fetching (fetch, network, node) (`fetch_stream.js`, `network.js`, `node_stream.js`)
+
+Entry point: `src/pdf.js`
+
+#### 3. Scripting Layer (`src/scripting_api/`)
+Implements JavaScript execution for interactive PDFs (form calculations, validations, button actions).
+- Sandboxed execution environment
+- Implements Acrobat JavaScript API objects (App, Doc, Field, etc.)
+
+Entry points: `src/pdf.scripting.js`, `src/pdf.sandbox.js`
+
+#### 4. Web Viewer (`web/`)
+The complete PDF viewer application with UI. Key components:
+- **Main app**: Application orchestration (`app.js`)
+- **Viewer**: Page rendering and layout (`pdf_viewer.js`, `pdf_page_view.js`)
+- **Toolbar**: Zoom, page navigation, print, download controls
+- **Sidebar**: Thumbnails, outlines, attachments (`pdf_sidebar.js`, `pdf_thumbnail_view.js`, `pdf_outline_viewer.js`)
+- **Find controller**: Text search functionality (`pdf_find_controller.js`)
+- **Annotation editors**: UI for creating/editing annotations (`annotation_editor_layer_builder.js`)
+- **Presentation mode**: Full-screen presentation (`pdf_presentation_mode.js`)
+
+Entry point: `web/viewer.html` + `web/viewer.mjs`
+
+#### 5. Shared Utilities (`src/shared/`)
+Common utilities used across layers:
+- **Message handling**: Worker communication (`message_handler.js`)
+- **Utilities**: Common functions and constants (`util.js`)
+- **Image utilities**: Image processing helpers (`image_utils.js`)
+
+### Worker Communication
+
+PDF.js uses a Web Worker architecture:
+- Main thread (`display` layer) communicates with worker thread (`core` layer) via `MessageHandler`
+- Keeps PDF parsing off the main thread for better performance
+- Messages include: page rendering requests, text content extraction, metadata queries
+
+### Build System
+
+- Uses **Gulp** for build orchestration (`gulpfile.mjs`)
+- **Webpack** bundles modules into browser-compatible formats
+- **Babel** transpiles for browser compatibility (configurable targets in gulpfile)
+- Preprocessor replaces build-time constants (e.g., `typeof PDFJSDev !== "undefined"` checks)
+- Multiple build targets: generic, components, minified, legacy (older browser support)
+
+### External Dependencies
+
+Located in `external/`:
+- **bcmaps**: Binary CMaps for CJK fonts
+- **standard_fonts**: Core 14 PDF fonts metrics
+- **cmapscompress**: Tools for compressing CMaps
+- **openjpeg**: JPEG2000 decoder (WASM)
+- **quickjs**: JavaScript engine for sandboxed execution
+
+### Translations
+
+Translations in `l10n/` are imported from Mozilla Firefox Nightly. Only the file l10n/en-US/viewer.ftl can be updated.
+
+## Development Notes
+
+### Adding New Features
+
+When adding features that span multiple layers:
+1. Start with the `core` layer if parsing/interpretation changes are needed
+2. Update the `display` layer API if new capabilities need exposure
+3. Modify the `web` viewer if UI changes are required
+4. Ensure worker communication handles new message types
+
+### Preprocessor Directives
+
+Code uses preprocessor checks for build-time conditionals:
+```javascript
+if (typeof PDFJSDev !== "undefined" && PDFJSDev.test("GENERIC")) {
+  // Generic build-specific code
+}
+```
+
+Common flags: `GENERIC`, `MOZCENTRAL`, `CHROME`, `MINIFIED`, `TESTING`, `LIB`, `SKIP_BABEL`, `IMAGE_DECODERS`
+
+### Testing
+
+- Unit tests use Jasmine framework (`test/unit/`)
+- Integration tests use Puppeteer for browser automation (`test/integration/`)
+- Test PDFs downloaded from manifest (`test/test_manifest.json`)
+- Reference images for visual regression testing (`test/ref/`)
+
+### Code Style
+
+- Uses ESLint with custom configuration (`eslint.config.mjs`)
+- Prettier for formatting
+- Stylelint for CSS
+- Semicolons required
+- Double quotes for strings
+
+### Pull Request Process
+
+- Keep PRs focused on a single issue
+- Provide a test PDF if the issue is PDF-specific
+- Ensure tests pass (`npx gulp test`)
+- Run linting (`npx gulp lint`)
+- Follow existing code patterns
+- Don't modify translations directly (they come from Firefox)
+
+### Performance Considerations
+
+- Core parsing runs in a Web Worker - keep main thread work minimal
+- Canvas rendering can be expensive - use appropriate scale factors
+- Text layer generation is separate from rendering - can be deferred
+- Annotation layer is optional - only enable when needed
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000000000..43c994c2d3617
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+@AGENTS.md