feat: diffing extension for comparing documents (SD-1324 and SD-89) by luccas-harbour · Pull Request #2306 · superdoc-dev/superdoc

luccas-harbour · 2026-03-05T14:46:53Z

Summary

This PR delivers an end-to-end document compare + replay workflow, including comment
diff/replay and tracked-changes integration.

What’s Included

Adds a full diffing extension in @superdoc/super-editor with compareDocuments and
replayDifferences commands.
Implements document diff computation across block, paragraph, inline text, inline
nodes, attrs, marks, comments, styles and numbering properties.
Implements replay engine for paragraph/non-paragraph/inline/comment/styles/numbering diffs.
Adds tracked-changes-aware replay behavior:
- applyTrackedChanges support in replay.
- Single-transaction replay path (avoids lost replay steps).
Improves diff/replay correctness:
- Preserves duplicate same-type marks (e.g. overlapping comment marks).
- Applies inline run-attribute diffs for modified text ranges.
- Matches inline node types by name across different editor/schema instances.
- Fixes insertion anchor computation for depth transitions in tree diffs.
- Preserves multi-block comment body edits in comment diffing.
Improves multi-document comment safety in superdoc:
- Scopes replay update/delete handling to active document context.
- Uses imported-id-aware identity matching where needed.
- Deletes full reply subtree for thread removals.
- Avoids replay-driven active-thread flicker by syncing active state only when
  explicitly requested.
Improves tracked-change comment resync/pruning:
- Rebuilds after replay completion.
- Prunes stale tracked-change threads only for active document.
- Uses both commentId and importedId to avoid false prune/duplication.
Improves DOCX comment fidelity on replay/export:
- Carries document identity (documentId/fileId) in replay comment payloads.
- Preserves structured comment bodies (elements ↔ docxCommentJSON) on add/update.
- Ensures updated docxCommentJSON is reflected by getValues() for export.
- Applies isDone fallback to resolved fields when replay payload omits explicit
  resolved metadata.
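
The "deletes full reply subtree for thread removals" item can be pictured as a breadth-first walk over parent links. This is only a sketch: the `CommentLike` shape, the `parentCommentId` field name, and the `collectThreadSubtree` helper are assumptions for illustration, not the store's actual API.

```typescript
// Hypothetical comment shape; field names are assumptions, not the real store model.
interface CommentLike {
  commentId: string;
  parentCommentId?: string | null;
}

// Returns the seed ids plus the ids of every reply nested beneath them.
function collectThreadSubtree(comments: CommentLike[], seedIds: string[]): Set<string> {
  // Index direct replies by their parent id.
  const childrenByParent = new Map<string, string[]>();
  for (const comment of comments) {
    if (!comment.parentCommentId) continue;
    const siblings = childrenByParent.get(comment.parentCommentId) ?? [];
    siblings.push(comment.commentId);
    childrenByParent.set(comment.parentCommentId, siblings);
  }

  // Breadth-first expansion from the seeds; the set also guards against cycles.
  const removed = new Set<string>();
  const queue = [...seedIds];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    if (removed.has(id)) continue;
    removed.add(id);
    queue.push(...(childrenByParent.get(id) ?? []));
  }
  return removed;
}
```

Seeding the walk with several ids at once matters when a thread can be addressed by more than one id, which is the situation the importedId-aware matching items describe.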

Tests

Adds/updates extensive test coverage for:
- diff algorithms (attributes, inline, paragraph, generic, comment, sequence,
  computeDiff).
- replay modules (replay-inline, replay-paragraph, replay-non-paragraph,
  replay-comments, replay-attrs, marks-from-diff).
- integration fixture tests in replayDiffs.test.js.
- superdoc comment/store behavior (SuperDoc.test.js, comments-store.test.js,
  use-comment.test.js).
Adds fixture corpus for replay scenarios (diff_before*.docx / diff_after*.docx,
including additional cases up to 11).
Validated locally: pnpm --filter super-editor exec vitest run
src/extensions/diffing/replayDiffs.test.js (passing).

Notes

One known replay limitation is intentionally accepted for now: non-mark run attrs
are not preserved on some inline text additions. This can be addressed once we stop
using marks for formatting and use run properties directly.

linear · 2026-03-05T14:46:57Z

SD-1324 Diffing method to convert the differences into tracked changes

SD-89 #1 Feature: Document diffing

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b926bfd1d

packages/superdoc/src/SuperDoc.vue

caio-pizzol

@luccas-harbour — solid approach overall, and the test coverage for the new diffing extension is impressive (co-located tests for every algorithm and replay module, plus end-to-end coverage).

Correctness: two things to look at — heading text changes can be missed by the diff, and tracked-change comment IDs can collide across documents. one question about the run properties skip when marks change.

DX: small cleanup opportunity with duplicated helpers.

Tests: coverage is strong. only minor gaps around the belongsToDocument legacy fallback in the comments store.

left a few inline comments with details.

packages/super-editor/src/extensions/diffing/algorithm/generic-diffing.ts

packages/superdoc/src/stores/comments-store.js

packages/super-editor/src/extensions/diffing/replay/replay-inline.ts

packages/super-editor/src/extensions/diffing/replay/marks-from-diff.ts

luccas-harbour · 2026-03-11T19:33:59Z

Note: no additional tweaks came up in feedback session.

…tity fallback
- resolve delete targets from both importedId and commentId (importedId prioritized)
- seed subtree removal with all resolved ids to match re-keyed replay comments
- keep active comment clearing aligned with the expanded removal set
- add regression test for stale runtime commentId + stable importedId delete payloads to prevent orphaned sidebar threads

…nsactions
- set preventDispatch meta in compareDocuments so CommandService skips dispatch
- keep compare behavior read-only with no transaction/update side effects
- add regression assertions that compare calls do not emit transaction events (with omitted and explicit comments inputs)

- refactor getInsertionPos to use only oldNodes + oldIdx and derive the previous node internally
- remove redundant previousOldNodeInfo plumbing from generic and paragraph add-diff builders
- keep depth-aware insertion behavior for deeper/same/shallower transitions
- update diff utility and paragraph diff tests to match the new API and validate anchor resolution

- reuse getCommentDocumentId / belongsToDocument from comments store in replay update reconciliation
- expose those helpers from comments-store to avoid duplicating document-scoping logic in SuperDoc.vue
- constrain replay UPDATE comment matching to the active editor document when IDs collide
- add regression coverage for same importedId across multiple documents so only the active document comment is updated

- restrict replay DELETED handling to comments that belong to the active editor document
- apply document scoping consistently across seed matching, subtree expansion, and final filtering
- prevent cross-document comment/thread removal when IDs overlap across open docs
- keep active-comment clearing document-aware using pre-delete comment snapshot
- add regression test for overlapping IDs across documents during replay deletion

- replace raw replay Object.assign with a safe field-level updater for existing comment models
- update only an explicit allowlist of mutable replay fields and normalize commentText from payload text/commentText
- preserve useComment identity invariants (commentId/importedId and constructor-captured metadata)
- strengthen replay update test to assert model identity remains stable via getValues() after updates
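
A field-level updater of the kind described in this commit could look roughly like the sketch below. The allowlist contents and the `applyReplayUpdate` name are hypothetical; the commit message does not list the actual fields.

```typescript
// Hypothetical allowlist; the real set of mutable replay fields is not given in the PR.
const MUTABLE_REPLAY_FIELDS = ["commentText", "isDone", "resolvedTime"] as const;

type LooseRecord = Record<string, unknown>;

// Copies only allowlisted fields from the payload onto an existing model,
// so identity fields such as commentId/importedId can never be overwritten.
function applyReplayUpdate(model: LooseRecord, payload: LooseRecord): void {
  // Normalize: accept either `text` or `commentText` as the comment body.
  const normalized: LooseRecord = { ...payload };
  if (normalized.commentText === undefined && typeof normalized.text === "string") {
    normalized.commentText = normalized.text;
  }

  for (const field of MUTABLE_REPLAY_FIELDS) {
    if (normalized[field] !== undefined) {
      model[field] = normalized[field];
    }
  }
}
```

Compared with a raw `Object.assign(model, payload)`, a stray `commentId` in the payload is simply ignored rather than re-keying the model.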

- add replay comment payload normalization to always emit commentId and preserve/add ownership metadata (documentId, fileId)
- use editor.options.documentId as fallback when replay comment payload lacks document scope
- keep existing payload documentId/fileId unchanged when already present
- update replay editor typing in replayComments/replayDiffs to include optional options.documentId
- add tests covering both fallback ownership injection and preservation of existing ownership fields

- update tracked-change stale-prune liveness check to consider both commentId and importedId
- prevent pruning live tracked-change threads when runtime commentId diverges but mark IDs still match importedId
- keep existing descendant removal behavior unchanged
- add regression test for replay/import flows where importedId remains stable while commentId changes
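
The dual-id liveness check can be sketched as follows. `TrackedThread`, `findStaleThreads`, and the idea of passing the live mark ids as a set are assumptions made for the example, not the store's real interface.

```typescript
// Hypothetical thread shape for illustration only.
interface TrackedThread {
  commentId: string;
  importedId?: string;
}

// A thread is stale only if NEITHER of its ids matches a tracked-change mark
// that still exists in the document.
function findStaleThreads(threads: TrackedThread[], liveMarkIds: Set<string>): TrackedThread[] {
  return threads.filter((thread) => {
    const liveByCommentId = liveMarkIds.has(thread.commentId);
    const liveByImportedId =
      thread.importedId !== undefined && liveMarkIds.has(thread.importedId);
    return !(liveByCommentId || liveByImportedId);
  });
}
```

With a commentId-only check, the first thread in a case like `{ commentId: "runtime-1", importedId: "tc-1" }` would be pruned even though the mark `tc-1` is still present.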

…rtedId
- seed tracked-change existingIds with both runtime commentId and stable importedId
- prevent duplicate tracked-change thread creation when grouped mark id matches an existing imported id
- add created sync IDs (id, params.changeId, params.importedId) back into dedupe set during sync pass
- add regression test for mixed-ID replay/sync scenario where commentId diverges but importedId remains live

- make useComment store docxCommentJSON as reactive state instead of a construction-time constant
- update getValues() to return the current docxCommentJSON value
- ensure replay-updated imported comment structure is reflected in translateCommentsForExport output
- add unit test verifying getValues() returns updated docxCommentJSON after mutation

- add replay payload normalization for comment model creation (text -> commentText, elements -> docxCommentJSON)
- apply normalization in replay ADD path before useComment(...)
- reuse the same normalization in replay UPDATE fallback when creating missing comments
- ensure replay-added imported comments keep DOCX-native body structure for export/round-trip
- add regression test verifying replay ADD maps elements into docxCommentJSON

- map replay isDone updates to resolvedTime/resolvedBy* when payload resolved fields are null/missing
- apply the same fallback during replay payload normalization and model updates
- refactor shared isDone resolution fallback logic to avoid duplicated code
- add regression test covering replay update payloads with isDone: true and null resolved fields, ensuring resolved state is persisted and can be exported
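
One way to read the isDone fallback is as a small pure function shared by normalization and model updates. The field names beyond `isDone` and `resolvedTime`, the choice of fallback values, and the `applyIsDoneFallback` name are all assumptions here; the commit only says that resolved fields are filled in when they are null or missing.

```typescript
// Hypothetical payload shape; resolvedBy* field names are guesses for illustration.
interface ResolvedFields {
  isDone?: boolean;
  resolvedTime?: number | null;
  resolvedByEmail?: string | null;
  resolvedByName?: string | null;
}

interface Resolver {
  email: string;
  name: string;
}

// When isDone is true but explicit resolved metadata is absent, fill it in
// from the supplied fallback values. Explicit values always win.
function applyIsDoneFallback(
  payload: ResolvedFields,
  fallbackUser: Resolver,
  now: number,
): ResolvedFields {
  if (!payload.isDone) return { ...payload };
  return {
    ...payload,
    resolvedTime: payload.resolvedTime ?? now,
    resolvedByEmail: payload.resolvedByEmail ?? fallbackUser.email,
    resolvedByName: payload.resolvedByName ?? fallbackUser.name,
  };
}
```

Keeping the function pure (the clock and user are passed in) makes the "isDone: true with null resolved fields" regression case easy to assert.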

…ent thread
- return the matched comment’s concrete id (prefer commentId) from replay update matching
- avoid cross-document active-thread misselection when importedId overlaps across open documents
- update replay regression coverage to assert setActiveComment receives the active document’s thread id

… reselection
- only sync active comment when activeCommentId is explicitly present in the event payload
- avoid inferring active selection from replay add/update events to prevent repeated focus/unfocus churn
- preserve explicit active clear behavior on replay deletions
- update replay update test expectation to reflect non-selecting replay events

superdoc-bot bot added the risk: sensitive label Mar 5, 2026

luccas-harbour marked this pull request as ready for review March 5, 2026 16:39

luccas-harbour changed the title ~~feat: diffing method to convert the differences into tracked (SD-1324)~~ feat: diffing extension for comparing documents (SD-1324) Mar 5, 2026

chatgpt-codex-connector bot reviewed Mar 5, 2026

luccas-harbour self-assigned this Mar 5, 2026

luccas-harbour changed the title ~~feat: diffing extension for comparing documents (SD-1324)~~ feat: diffing extension for comparing documents (SD-1324 and SD-89) Mar 5, 2026

luccas-harbour requested review from VladaHarbour, caio-pizzol, harbournick and tupizz March 5, 2026 18:02

caio-pizzol reviewed Mar 5, 2026

luccas-harbour requested a review from caio-pizzol March 6, 2026 19:38

luccas-harbour force-pushed the luccas/sd-1324-diffing-method-to-convert-the-differences-into-tracked branch from 7557b8b to 527891f Compare March 12, 2026 13:46

Luccas Correa added 15 commits March 12, 2026 10:49

feat: add function for mapping doc paragraphs by id

c6d9b26

feat: add function for flattening paragraph text but keep track of po…

57beb16

…sitions
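
As a rough illustration of flattening paragraph text while keeping track of positions: the sketch below assumes each text node carries the document position of its first character. `TextNodeLike` and `flattenParagraph` are made-up names and do not reflect the extension's real types.

```typescript
// Hypothetical stand-in for a paragraph's inline text nodes.
interface TextNodeLike {
  text: string;
  pos: number; // document position of the node's first character (assumed)
}

interface FlattenedParagraph {
  text: string;        // concatenated paragraph text
  positions: number[]; // positions[i] = document position of text[i]
}

function flattenParagraph(nodes: TextNodeLike[]): FlattenedParagraph {
  let text = "";
  const positions: number[] = [];
  for (const node of nodes) {
    for (let offset = 0; offset < node.text.length; offset++) {
      positions.push(node.pos + offset);
    }
    text += node.text;
  }
  return { text, positions };
}
```

A text diff can then work on the plain string and translate each character index back to a document position through `positions`.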

feat: add function for calculating text diffs using LCS

e6968be
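
For readers unfamiliar with LCS-based text diffing, here is a minimal character-level sketch using the textbook dynamic-programming table. It is not the code from this commit (which a later commit replaces with Myers); `DiffOp` and `lcsDiff` are names chosen for the example.

```typescript
type DiffOp = { type: "equal" | "delete" | "insert"; text: string };

function lcsDiff(oldText: string, newText: string): DiffOp[] {
  const n = oldText.length;
  const m = newText.length;

  // lengths[i][j] = length of the LCS of oldText[i..] and newText[j..]
  const lengths: number[][] = [];
  for (let i = 0; i <= n; i++) {
    const row: number[] = [];
    for (let j = 0; j <= m; j++) row.push(0);
    lengths.push(row);
  }
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lengths[i][j] =
        oldText[i] === newText[j]
          ? lengths[i + 1][j + 1] + 1
          : Math.max(lengths[i + 1][j], lengths[i][j + 1]);
    }
  }

  // Walk the table from the start, appending to the previous op when the type repeats.
  const ops: DiffOp[] = [];
  const push = (type: DiffOp["type"], ch: string): void => {
    const last = ops[ops.length - 1];
    if (last && last.type === type) last.text += ch;
    else ops.push({ type, text: ch });
  };
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (oldText[i] === newText[j]) {
      push("equal", oldText[i]);
      i++;
      j++;
    } else if (lengths[i + 1][j] >= lengths[i][j + 1]) {
      push("delete", oldText[i]);
      i++;
    } else {
      push("insert", newText[j]);
      j++;
    }
  }
  while (i < n) push("delete", oldText[i++]);
  while (j < m) push("insert", newText[j++]);
  return ops;
}
```

The table costs O(n·m) time and memory, which is the usual motivation for moving to a Myers-style algorithm on longer inputs.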

feat: add function for calculating paragraph-level diffing

467ce85

feat: add diffing extension

44bedd4

test: add diffing tests

96d2e3b

refactor: switch LCS algorithm to Myers for performance

9f60c30

refactor: code structure

710f380

feat: compute text similarity using Levenshtein distance

65c98b4
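
A text-similarity score based on Levenshtein distance is commonly computed by normalizing the distance by the longer string's length. The sketch below shows that common formulation; whether this commit normalizes the same way is an assumption, and `levenshtein` / `similarity` are illustrative names.

```typescript
// Classic two-row Levenshtein distance (insert, delete, substitute all cost 1).
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]: 1 means identical, 0 means nothing in common.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}
```

A score like this lets a diff decide whether a deleted and an inserted paragraph are similar enough to report as one modification rather than two unrelated changes.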

feat: identify contiguous text changes as single operation

9583614
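
Treating contiguous text changes as a single operation can be sketched as a merge pass over per-character changes. The shapes below are hypothetical; in particular, `pos` is assumed to index into whichever text the change draws its characters from (old text for deletes, new text for inserts).

```typescript
// Hypothetical per-character change and merged range shapes.
interface CharChange {
  type: "insert" | "delete";
  pos: number;
  text: string;
}

interface RangeChange {
  type: "insert" | "delete";
  from: number;
  to: number; // exclusive end
  text: string;
}

// Merges neighbouring changes of the same type whose positions touch.
function mergeContiguous(changes: CharChange[]): RangeChange[] {
  const merged: RangeChange[] = [];
  for (const change of changes) {
    const last = merged[merged.length - 1];
    if (last && last.type === change.type && last.to === change.pos) {
      last.to = change.pos + change.text.length;
      last.text += change.text;
    } else {
      merged.push({
        type: change.type,
        from: change.pos,
        to: change.pos + change.text.length,
        text: change.text,
      });
    }
  }
  return merged;
}
```

Replay then needs one step per range instead of one per character, which keeps the resulting tracked changes readable.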

feat: implement logic for diffing paragraph attributes

8cd8e58

refactor: extract generic sequence diffing helper

738bdc9

refactor: modify paragraph diffing to reuse generic helper

5b1efa2

refactor: extract operation reordering function

8b11027

This function can then be reused when diffing paragraphs and runs. It helps identify modifications instead of delete/insert pairs.

fix: standardize positions for text diffing

6760d6e

Always maps starting/ending positions to the old document instead of the new one.

luccas-harbour added 27 commits March 12, 2026 10:54

test: add missing test documents

6fd5e45

test: adjust diff replay test

3c8a4a7

fix(superdoc): update replay comment parent linkage fields on thread …

9afedd8

…remap

fix(superdoc): scope replay add deduplication to active document context

aad7a87

fix(track-changes): preserve property attrs for ReplaceAroundStep upd…

c4da408

…ates

feat: implement diffing for styles

1bd0486

feat: implement replay for style differences

f8f4941

feat: implement diffing for numbering

0c40190

feat: implement replay for numbering differences

af03e4e

fix(comments): scope tracked-change dedupe to active document

194bfda

refactor(diffing): share deepEquals helper across replay modules

d1c6b62

chore: add diffing example

42aa5f2

fix(numbering): missing imports

68755e5

luccas-harbour force-pushed the luccas/sd-1324-diffing-method-to-convert-the-differences-into-tracked branch from 527891f to 68755e5 Compare March 12, 2026 13:55

luccas-harbour added 2 commits March 12, 2026 10:57

chore: update lock file

25a2ae4

test: add missing test documents

d478c73

feat: diffing extension for comparing documents (SD-1324 and SD-89) #2306
luccas-harbour wants to merge 125 commits into main from
luccas/sd-1324-diffing-method-to-convert-the-differences-into-tracked


2 participants
