Skip to content

Review of PR #1: CommonMark ipynb export + image attachment embedding #10

@mmcky

Description

@mmcky

Review of PR #1

I have reviewed the PR CommonMark ipynb export + image attachment embedding in full.

Overall Assessment

This is an excellent, well-architected PR. The implementation of the commonmark AST pre-transform is very clean, and the two-phase approach for image attachments (collecting data in the CLI, rewriting in the serializer) is a smart way to keep the core myst-to-ipynb package pure and decoupled from the filesystem. The test coverage is also fantastic.

Areas for Improvement & Edge Cases

1. The Regex Escaping Bug (Fixed)

In packages/myst-to-ipynb/src/attachments.ts, the regex used to find and replace images in the generated markdown would fail on escaped characters. If an image has brackets in its alt text or parentheses in its url, the markdown serializer (mdast-util-to-markdown) escapes them (e.g., ![alt \[text\]](url\(1\))). The original regex /[^\]]*/ stopped at the very first ], even if it was escaped.

Fix applied: Updated the regex to handle escaped brackets and parentheses, and unescaped the URL before looking it up in the imageData dictionary.

2. Fragile String Replacement in embedImagesAsAttachments (Fixed)

The original replacement logic used a loop with .replace(string, string), which only replaces the first occurrence. If a user included the exact same image twice in the same cell, it worked but was slightly fragile.

Fix applied: Refactored the replacement logic to use a single pass with md.replace(imgRegex, callback).

3. Synchronous File Reading in collectImageData

In packages/myst-cli/src/build/ipynb/index.ts, fs.readFileSync(filePath) is used. While this is fine for small documents, if a user is exporting a massive notebook with hundreds of high-res images, synchronous reads will block the Node.js event loop.

Recommendation: Since runIpynbExport is already an async function, consider making collectImageData async and using await fs.promises.readFile(filePath). You could even use Promise.all to read all images concurrently, which would significantly speed up the export for image-heavy notebooks.

4. AST Mutation in transformToCommonMark

In packages/myst-to-ipynb/src/commonmark.ts, the function modifies the AST in place (tree.children = newChildren;). Because a deep clone (JSON.parse(JSON.stringify(blockTree))) is passed in index.ts, this doesn't cause side effects in the current implementation. However, mutating ASTs in place can be a footgun if this function is ever reused elsewhere.

Recommendation: It's perfectly fine to leave as-is given the deep clone, but standard unified/remark ecosystem practice is to use a utility like unist-util-map or unist-util-visit to return a new tree without mutating the original.

Summary

The PR is in great shape and the logic for lifting code cells out of gated directives (liftCodeCellsFromGatedNodes) is particularly elegant. With the regex edge-case fixed, this is ready to merge! 🚀

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions