feat: diffing extension for comparing documents (SD-1324 and SD-89)#2306
feat: diffing extension for comparing documents (SD-1324 and SD-89)#2306luccas-harbour wants to merge 125 commits intomainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9b926bfd1d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
caio-pizzol
left a comment
There was a problem hiding this comment.
@luccas-harbour — solid approach overall, and the test coverage for the new diffing extension is impressive (co-located tests for every algorithm and replay module, plus end-to-end coverage).
Correctness: two things to look at — heading text changes can be missed by the diff, and tracked-change comment IDs can collide across documents. one question about the run properties skip when marks change.
DX: small cleanup opportunity with duplicated helpers.
Tests: coverage is strong. only minor gaps around the belongsToDocument legacy fallback in the comments store.
left a few inline comments with details.
packages/super-editor/src/extensions/diffing/algorithm/generic-diffing.ts
Show resolved
Hide resolved
packages/super-editor/src/extensions/diffing/replay/marks-from-diff.ts
Outdated
Show resolved
Hide resolved
|
Note: no additional tweaks came up in feedback session. |
7557b8b to
527891f
Compare
This function can then be reused when diffing paragraphs and runs. It helps identifying modifications instead of delete/insert pairs
Always maps starting/ending positions to the old document instead of the new one.
…tity fallback - resolve delete targets from both importedId and commentId (importedId prioritized) - seed subtree removal with all resolved ids to match re-keyed replay comments - keep active comment clearing aligned with the expanded removal set - add regression test for stale runtime commentId + stable importedId delete payloads to prevent orphaned sidebar threads
…nsactions - set preventDispatch meta in compareDocuments so CommandService skips dispatch - keep compare behavior read-only with no transaction/update side effects - add regression assertions that compare calls do not emit transaction events (with omitted and explicit comments inputs)
- refactor getInsertionPos to use only oldNodes + oldIdx and derive the previous node internally - remove redundant previousOldNodeInfo plumbing from generic and paragraph add-diff builders - keep depth-aware insertion behavior for deeper/same/shallower transitions - update diff utility and paragraph diff tests to match the new API and validate anchor resolution
- reuse getCommentDocumentId / belongsToDocument from comments store in replay update reconciliation - expose those helpers from comments-store to avoid duplicating document-scoping logic in SuperDoc.vue - constrain replay UPDATE comment matching to the active editor document when IDs collide - add regression coverage for same importedId across multiple documents so only the active document comment is updated
- restrict replay DELETED handling to comments that belong to the active editor document - apply document scoping consistently across seed matching, subtree expansion, and final filtering - prevent cross-document comment/thread removal when IDs overlap across open docs - keep active-comment clearing document-aware using pre-delete comment snapshot - add regression test for overlapping IDs across documents during replay deletion
- replace raw replay Object.assign with a safe field-level updater for existing comment models - update only an explicit allowlist of mutable replay fields and normalize commentText from payload text/commentText - preserve useComment identity invariants (commentId/importedId and constructor-captured metadata) - strengthen replay update test to assert model identity remains stable via getValues() after updates
- add replay comment payload normalization to always emit commentId and preserve/add ownership metadata (documentId, fileId) - use editor.options.documentId as fallback when replay comment payload lacks document scope - keep existing payload documentId/fileId unchanged when already present - update replay editor typing in replayComments/replayDiffs to include optional options.documentId - add tests covering both fallback ownership injection and preservation of existing ownership fields
- update tracked-change stale-prune liveness check to consider both commentId and importedId - prevent pruning live tracked-change threads when runtime commentId diverges but mark IDs still match importedId - keep existing descendant removal behavior unchanged - add regression test for replay/import flows where importedId remains stable while commentId changes
…rtedId - seed tracked-change existingIds with both runtime commentId and stable importedId - prevent duplicate tracked-change thread creation when grouped mark id matches an existing imported id - add created sync IDs (id, params.changeId, params.importedId) back into dedupe set during sync pass - add regression test for mixed-ID replay/sync scenario where commentId diverges but importedId remains live
- make useComment store docxCommentJSON as reactive state instead of a construction-time constant - update getValues() to return the current docxCommentJSON value - ensure replay-updated imported comment structure is reflected in translateCommentsForExport output - add unit test verifying getValues() returns updated docxCommentJSON after mutation
- add replay payload normalization for comment model creation (text -> commentText,
elements -> docxCommentJSON)
- apply normalization in replay ADD path before useComment(...)
- reuse the same normalization in replay UPDATE fallback when creating missing comments
- ensure replay-added imported comments keep DOCX-native body structure for export/
round-trip
- add regression test verifying replay ADD maps elements into docxCommentJSON
- map replay isDone updates to resolvedTime/resolvedBy* when payload resolved fields are null/missing - apply the same fallback during replay payload normalization and model updates - refactor shared isDone resolution fallback logic to avoid duplicated code - add regression test covering replay update payloads with isDone: true and null resolved fields, ensuring resolved state is persisted and can be exported
…ent thread - return the matched comment’s concrete id (prefer commentId) from replay update matching - avoid cross-document active-thread misselection when importedId overlaps across open documents - update replay regression coverage to assert setActiveComment receives the active document’s thread id
… reselection
- only sync active comment when activeCommentId is explicitly present in the event
payload
- avoid inferring active selection from replay add/update events to prevent repeated
focus/unfocus churn
- preserve explicit active clear behavior on replay deletions
- update replay update test expectation to reflect non-selecting replay events
527891f to
68755e5
Compare
Summary
This PR delivers an end-to-end document compare + replay workflow, including comment
diff/replay and tracked-changes integration.
What’s Included
@superdoc/super-editorwithcompareDocumentsandreplayDifferencescommands.nodes, attrs, marks, comments, styles and numbering properties.
applyTrackedChangessupport in replay.explicitly requested.
documentId/fileId) in replay comment payloads.docxCommentJSON) on add/update.docxCommentJSONis reflected bygetValues()for export.isDonefallback to resolved fields when replay payload omits explicitresolved metadata.
Tests
computeDiff).
comments, replay-attrs, marks-from-diff).
comment.test.js).
including additional cases up to 11).
replayDiffs.test.js (passing).
Notes
mark run attrs on some inline text additions. This can be addressed once we stop using marks for formatting and use run properties directly.