Skip to content

chore: reorganize grammar #38959

Open
ggazzo wants to merge 6 commits intodevelopfrom
chore/grammar
Open

chore: reorganize grammar #38959
ggazzo wants to merge 6 commits intodevelopfrom
chore/grammar

Conversation

@ggazzo
Copy link
Member

@ggazzo ggazzo commented Feb 24, 2026

This pull request refactors the packages/message-parser/src/grammar.pegjs file to simplify and optimize the parsing rules for inline formatting, links, and code blocks. The changes remove unnecessary state variables, streamline rule definitions, and improve performance by reducing per-character lookaheads and simplifying fallback logic. The parser is now more maintainable and efficient, with clearer separation between plain text and formatting elements.

Inline formatting and emphasis simplification:

  • Removed skipBoldEmoji, skipItalicEmoji, and skipInlineEmoji state variables, and refactored emphasis (Bold, Italic, Strikethrough) and inline item rules to eliminate reliance on these flags, simplifying the parsing logic for emojis and emoticons within formatting. [1] [2] [3]
  • Added specialized "PlainRun" rules for bold, italic, and strikethrough to improve handling of nested formatting and plain text, and updated fallback patterns to use these.

Link and URL parsing improvements:

  • Refactored URL, MarkdownLinkURL, and AutoLinkURL rules to concatenate head and tail components, reducing per-character lookahead and improving performance. [1] [2] [3]
  • Enhanced LinkTitle rule with complex negative lookahead to prevent accidental matching of link end patterns, making link parsing more robust.

Code block and inline code parsing:

  • Improved CodeChunk and InlineCode rules to use character classes and direct matches, avoiding inefficient per-character lookahead and ensuring correct handling of backticks and newlines. [1] [2]

General parser optimizations and cleanup:

  • Simplified fallback rules for inline items, spoilers, and list items by removing unnecessary state logic and resetting flags, making the grammar easier to maintain. [1] [2]
  • Updated domain and heading parsing rules to clarify intent and reduce unnecessary complexity. [1] [2]

User mention parsing enhancement:

  • Modified UserMention rule to exclude usernames ending with double underscores, preventing invalid mentions.

Proposed changes (including videos or screenshots)

Issue(s)

Steps to test or reproduce

Further comments

before

message-parser git:(develop) yarn bench
========================================================================
  @rocket.chat/message-parser — Performance Benchmark Suite
========================================================================

── Plain Text ──────────────────────────────────────────────
┌─────────┬──────────┬──────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────┐
│ (index) │ Task     │ ops/sec  │ Avg (ms)    │ Min (ms)    │ Max (ms)    │ P99 (ms)    │ Samples │
├─────────┼──────────┼──────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────┤
│ 0       │ 'short'  │ '15.111' │ '68.4304'   │ '62.3750'   │ '568.7080'  │ '176.5622'  │ 14614   │
│ 1       │ 'medium' │ '3.804'  │ '265.9862'  │ '245.9170'  │ '612.2500'  │ '500.4495'  │ 3760    │
│ 2       │ 'long'   │ '348'    │ '2901.2264' │ '2709.5000' │ '8395.7920' │ '3601.0385' │ 345     │
└─────────┴──────────┴──────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────┘

── Emphasis / Formatting ───────────────────────────────────
┌─────────┬────────────────┬──────────┬────────────┬────────────┬─────────────┬────────────┬─────────┐
│ (index) │ Task           │ ops/sec  │ Avg (ms)   │ Min (ms)   │ Max (ms)    │ P99 (ms)   │ Samples │
├─────────┼────────────────┼──────────┼────────────┼────────────┼─────────────┼────────────┼─────────┤
│ 0       │ 'bold'         │ '18.585' │ '57.6378'  │ '45.9160'  │ '1728.3340' │ '178.1672' │ 17350   │
│ 1       │ 'italic'       │ '18.931' │ '54.7093'  │ '50.0000'  │ '461.1670'  │ '105.3050' │ 18279   │
│ 2       │ 'strike'       │ '19.477' │ '52.6037'  │ '48.6670'  │ '588.7920'  │ '93.9045'  │ 19011   │
│ 3       │ 'nested'       │ '16.526' │ '61.9808'  │ '57.7080'  │ '532.1250'  │ '113.6866' │ 16135   │
│ 4       │ 'deep nesting' │ '12.889' │ '83.1256'  │ '72.6670'  │ '5613.4590' │ '256.6172' │ 12030   │
│ 5       │ 'multiple'     │ '8.870'  │ '114.3897' │ '107.1660' │ '425.1660'  │ '252.9589' │ 8743    │
└─────────┴────────────────┴──────────┴────────────┴────────────┴─────────────┴────────────┴─────────┘

── URLs & Links ────────────────────────────────────────────
┌─────────┬─────────────────────┬──────────┬───────────┬───────────┬────────────┬────────────┬─────────┐
│ (index) │ Task                │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼─────────────────────┼──────────┼───────────┼───────────┼────────────┼────────────┼─────────┤
│ 0       │ 'single'            │ '10.979' │ '92.5930' │ '86.9580' │ '451.5420' │ '220.7155' │ 10800   │
│ 1       │ 'multiple'          │ '12.911' │ '78.3832' │ '74.2910' │ '371.3330' │ '167.7325' │ 12758   │
│ 2       │ 'markdown link'     │ '20.657' │ '49.3569' │ '46.2500' │ '311.9170' │ '66.9246'  │ 20261   │
│ 3       │ 'autolinked domain' │ '11.223' │ '89.9522' │ '86.0000' │ '285.8330' │ '183.3756' │ 11119   │
│ 4       │ 'with path'         │ '11.984' │ '84.8961' │ '79.8340' │ '357.5000' │ '233.0368' │ 11780   │
└─────────┴─────────────────────┴──────────┴───────────┴───────────┴────────────┴────────────┴─────────┘

── Emoji ───────────────────────────────────────────────────
┌─────────┬───────────────────────────────┬──────────┬───────────┬───────────┬────────────┬────────────┬─────────┐
│ (index) │ Task                          │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼───────────────────────────────┼──────────┼───────────┼───────────┼────────────┼────────────┼─────────┤
│ 0       │ 'single shortcode'            │ '28.983' │ '35.5232' │ '32.7920' │ '681.5000' │ '55.4790'  │ 28151   │
│ 1       │ 'triple shortcode (BigEmoji)' │ '29.240' │ '35.0089' │ '32.2920' │ '324.3750' │ '48.6398'  │ 28565   │
│ 2       │ 'single unicode'              │ '29.082' │ '35.2530' │ '32.5830' │ '291.6250' │ '47.9849'  │ 28369   │
│ 3       │ 'triple unicode (BigEmoji)'   │ '28.667' │ '35.6136' │ '33.0840' │ '258.2500' │ '47.3603'  │ 28080   │
│ 4       │ 'in text'                     │ '11.346' │ '89.0623' │ '85.0000' │ '307.9590' │ '184.5805' │ 11229   │
│ 5       │ 'mixed'                       │ '11.443' │ '88.4342' │ '84.0000' │ '344.0410' │ '188.8266' │ 11308   │
└─────────┴───────────────────────────────┴──────────┴───────────┴───────────┴────────────┴────────────┴─────────┘

── Mentions ────────────────────────────────────────────────
┌─────────┬──────────────────┬──────────┬───────────┬───────────┬─────────────┬───────────┬─────────┐
│ (index) │ Task             │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)    │ P99 (ms)  │ Samples │
├─────────┼──────────────────┼──────────┼───────────┼───────────┼─────────────┼───────────┼─────────┤
│ 0       │ 'single user'    │ '24.472' │ '41.5981' │ '38.9590' │ '307.0830'  │ '55.8840' │ 24040   │
│ 1       │ 'multiple users' │ '22.803' │ '44.8320' │ '41.8330' │ '518.0420'  │ '65.4580' │ 22306   │
│ 2       │ 'channel'        │ '24.329' │ '42.6889' │ '39.0420' │ '5187.2500' │ '60.9477' │ 23426   │
│ 3       │ 'mixed'          │ '14.142' │ '71.8212' │ '67.5830' │ '534.3750'  │ '98.6864' │ 13924   │
└─────────┴──────────────────┴──────────┴───────────┴───────────┴─────────────┴───────────┴─────────┘

── Code ────────────────────────────────────────────────────
┌─────────┬────────────────┬──────────┬───────────┬───────────┬────────────┬────────────┬─────────┐
│ (index) │ Task           │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼────────────────┼──────────┼───────────┼───────────┼────────────┼────────────┼─────────┤
│ 0       │ 'inline'       │ '11.758' │ '86.0120' │ '81.6660' │ '392.6670' │ '185.3100' │ 11627   │
│ 1       │ 'block'        │ '25.397' │ '40.1935' │ '37.4170' │ '274.0830' │ '54.8836'  │ 24880   │
│ 2       │ 'multi inline' │ '15.207' │ '66.7020' │ '63.2500' │ '378.3330' │ '162.3373' │ 14993   │
└─────────┴────────────────┴──────────┴───────────┴───────────┴────────────┴────────────┴─────────┘

── Structured Blocks ───────────────────────────────────────
┌─────────┬───────────────────────────┬──────────┬────────────┬────────────┬─────────────┬────────────┬─────────┐
│ (index) │ Task                      │ ops/sec  │ Avg (ms)   │ Min (ms)   │ Max (ms)    │ P99 (ms)   │ Samples │
├─────────┼───────────────────────────┼──────────┼────────────┼────────────┼─────────────┼────────────┼─────────┤
│ 0       │ 'ordered list'            │ '8.777'  │ '115.2011' │ '109.5410' │ '402.7080'  │ '225.4168' │ 8681    │
│ 1       │ 'unordered list'          │ '8.774'  │ '115.1760' │ '108.5000' │ '336.6250'  │ '228.9276' │ 8683    │
│ 2       │ 'task list'               │ '8.224'  │ '122.8483' │ '116.1250' │ '420.7090'  │ '235.5084' │ 8141    │
│ 3       │ 'blockquote'              │ '7.094'  │ '142.7896' │ '132.9590' │ '428.7920'  │ '314.9443' │ 7004    │
│ 4       │ 'heading'                 │ '26.246' │ '39.1366'  │ '36.2080'  │ '1160.9170' │ '55.4563'  │ 25552   │
│ 5       │ 'heading multi-level'     │ '25.056' │ '40.9797'  │ '38.0000'  │ '1739.0830' │ '67.9967'  │ 24403   │
│ 6       │ 'spoiler'                 │ '12.339' │ '82.5214'  │ '77.1670'  │ '475.4590'  │ '218.6225' │ 12119   │
│ 7       │ 'spoiler with formatting' │ '11.796' │ '86.3155'  │ '80.7500'  │ '375.9170'  │ '234.6875' │ 11586   │
└─────────┴───────────────────────────┴──────────┴────────────┴────────────┴─────────────┴────────────┴─────────┘

── KaTeX (Math) ────────────────────────────────────────────
┌─────────┬──────────┬──────────┬────────────┬────────────┬─────────────┬────────────┬─────────┐
│ (index) │ Task     │ ops/sec  │ Avg (ms)   │ Min (ms)   │ Max (ms)    │ P99 (ms)   │ Samples │
├─────────┼──────────┼──────────┼────────────┼────────────┼─────────────┼────────────┼─────────┤
│ 0       │ 'inline' │ '8.913'  │ '113.5582' │ '105.5000' │ '1274.1670' │ '220.1219' │ 8807    │
│ 1       │ 'block'  │ '22.913' │ '44.7187'  │ '41.4170'  │ '356.2080'  │ '61.8395'  │ 22363   │
└─────────┴──────────┴──────────┴────────────┴────────────┴─────────────┴────────────┴─────────┘

── Adversarial / Stress ────────────────────────────────────
┌─────────┬────────────────────────┬─────────┬────────────┬────────────┬─────────────┬────────────┬─────────┐
│ (index) │ Task                   │ ops/sec │ Avg (ms)   │ Min (ms)   │ Max (ms)    │ P99 (ms)   │ Samples │
├─────────┼────────────────────────┼─────────┼────────────┼────────────┼─────────────┼────────────┼─────────┤
│ 0       │ 'adversarial emphasis' │ '6.379' │ '157.9280' │ '151.5000' │ '400.7920'  │ '279.7595' │ 12664   │
│ 1       │ 'adversarial mixed'    │ '1.539' │ '653.4250' │ '625.6670' │ '1171.2500' │ '904.8582' │ 3061    │
│ 2       │ 'repeated specials'    │ '1.543' │ '650.5730' │ '627.5830' │ '1017.4170' │ '910.2189' │ 3075    │
│ 3       │ 'long with formatting' │ '2.806' │ '358.5738' │ '344.6250' │ '732.8340'  │ '547.4978' │ 5578    │
└─────────┴────────────────────────┴─────────┴────────────┴────────────┴─────────────┴────────────┴─────────┘

── Real-World Messages ─────────────────────────────────────
┌─────────┬───────────┬──────────┬────────────┬────────────┬────────────┬────────────┬─────────┐
│ (index) │ Task      │ ops/sec  │ Avg (ms)   │ Min (ms)   │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼───────────┼──────────┼────────────┼────────────┼────────────┼────────────┼─────────┤
│ 0       │ 'simple'  │ '10.240' │ '99.1179'  │ '93.9170'  │ '362.0830' │ '224.9019' │ 10090   │
│ 1       │ 'medium'  │ '6.714'  │ '151.2868' │ '143.3750' │ '477.0420' │ '355.5912' │ 6610    │
│ 2       │ 'complex' │ '5.345'  │ '190.7712' │ '175.3340' │ '685.8330' │ '416.4897' │ 5242    │
└─────────┴───────────┴──────────┴────────────┴────────────┴────────────┴────────────┴─────────┘

── Timestamps ──────────────────────────────────────────────
┌─────────┬───────────────┬──────────┬───────────┬───────────┬────────────┬───────────┬─────────┐
│ (index) │ Task          │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)  │ Samples │
├─────────┼───────────────┼──────────┼───────────┼───────────┼────────────┼───────────┼─────────┤
│ 0       │ 'unix format' │ '24.237' │ '42.7208' │ '39.0000' │ '596.7500' │ '82.0323' │ 23408   │
└─────────┴───────────────┴──────────┴───────────┴───────────┴────────────┴───────────┴─────────┘

========================================================================
  Done.
========================================================================

after

➜  message-parser git:(chore/grammar) yarn bench
========================================================================
  @rocket.chat/message-parser — Performance Benchmark Suite
========================================================================

── Plain Text ──────────────────────────────────────────────
┌─────────┬──────────┬──────────┬────────────┬────────────┬────────────┬────────────┬─────────┐
│ (index) │ Task     │ ops/sec  │ Avg (ms)   │ Min (ms)   │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼──────────┼──────────┼────────────┼────────────┼────────────┼────────────┼─────────┤
│ 0       │ 'short'  │ '31.328' │ '32.5552'  │ '27.7080'  │ '189.7910' │ '43.9929'  │ 30718   │
│ 1       │ 'medium' │ '14.798' │ '68.6805'  │ '57.4580'  │ '341.2090' │ '141.2334' │ 14561   │
│ 2       │ 'long'   │ '2.671'  │ '377.5606' │ '327.5840' │ '799.7910' │ '588.4722' │ 2649    │
└─────────┴──────────┴──────────┴────────────┴────────────┴────────────┴────────────┴─────────┘

── Emphasis / Formatting ───────────────────────────────────
┌─────────┬────────────────┬──────────┬───────────┬───────────┬────────────┬────────────┬─────────┐
│ (index) │ Task           │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼────────────────┼──────────┼───────────┼───────────┼────────────┼────────────┼─────────┤
│ 0       │ 'bold'         │ '32.169' │ '31.6814' │ '27.0410' │ '207.8750' │ '46.2914'  │ 31565   │
│ 1       │ 'italic'       │ '30.592' │ '33.2980' │ '28.5420' │ '283.5420' │ '47.4873'  │ 30032   │
│ 2       │ 'strike'       │ '31.625' │ '32.2927' │ '27.2910' │ '246.9170' │ '48.3207'  │ 30967   │
│ 3       │ 'nested'       │ '27.691' │ '36.7411' │ '31.4170' │ '250.6660' │ '56.8269'  │ 27218   │
│ 4       │ 'deep nesting' │ '24.281' │ '42.4788' │ '35.2910' │ '930.7500' │ '106.2500' │ 23542   │
│ 5       │ 'multiple'     │ '20.359' │ '49.7510' │ '42.9160' │ '211.7080' │ '85.1670'  │ 20101   │
└─────────┴────────────────┴──────────┴───────────┴───────────┴────────────┴────────────┴─────────┘

── URLs & Links ────────────────────────────────────────────
┌─────────┬─────────────────────┬──────────┬───────────┬───────────┬────────────┬────────────┬─────────┐
│ (index) │ Task                │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼─────────────────────┼──────────┼───────────┼───────────┼────────────┼────────────┼─────────┤
│ 0       │ 'single'            │ '24.926' │ '41.2133' │ '35.5410' │ '268.3330' │ '62.0267'  │ 24264   │
│ 1       │ 'multiple'          │ '21.999' │ '46.0478' │ '39.3330' │ '259.0830' │ '106.3750' │ 21717   │
│ 2       │ 'markdown link'     │ '29.555' │ '34.5999' │ '29.8330' │ '410.6250' │ '47.8322'  │ 28902   │
│ 3       │ 'autolinked domain' │ '23.351' │ '43.3978' │ '37.1660' │ '218.8340' │ '61.2426'  │ 23043   │
│ 4       │ 'with path'         │ '24.283' │ '41.7284' │ '35.9580' │ '231.2920' │ '59.9031'  │ 23965   │
└─────────┴─────────────────────┴──────────┴───────────┴───────────┴────────────┴────────────┴─────────┘

── Emoji ───────────────────────────────────────────────────
┌─────────┬───────────────────────────────┬──────────┬───────────┬───────────┬────────────┬────────────┬─────────┐
│ (index) │ Task                          │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼───────────────────────────────┼──────────┼───────────┼───────────┼────────────┼────────────┼─────────┤
│ 0       │ 'single shortcode'            │ '39.535' │ '25.8905' │ '22.0830' │ '248.2090' │ '33.1468'  │ 38625   │
│ 1       │ 'triple shortcode (BigEmoji)' │ '41.803' │ '24.4768' │ '20.9580' │ '196.6250' │ '30.7914'  │ 40856   │
│ 2       │ 'single unicode'              │ '39.383' │ '25.9376' │ '22.1660' │ '209.8330' │ '32.6029'  │ 38555   │
│ 3       │ 'triple unicode (BigEmoji)'   │ '39.296' │ '26.2804' │ '21.3340' │ '635.0000' │ '34.5415'  │ 38052   │
│ 4       │ 'in text'                     │ '22.390' │ '45.6065' │ '39.5420' │ '248.7090' │ '112.5311' │ 21927   │
│ 5       │ 'mixed'                       │ '21.707' │ '46.8180' │ '40.1670' │ '326.3750' │ '110.5006' │ 21360   │
└─────────┴───────────────────────────────┴──────────┴───────────┴───────────┴────────────┴────────────┴─────────┘

── Mentions ────────────────────────────────────────────────
┌─────────┬──────────────────┬──────────┬───────────┬───────────┬─────────────┬───────────┬─────────┐
│ (index) │ Task             │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)    │ P99 (ms)  │ Samples │
├─────────┼──────────────────┼──────────┼───────────┼───────────┼─────────────┼───────────┼─────────┤
│ 0       │ 'single user'    │ '35.712' │ '28.9189' │ '24.7500' │ '1785.3750' │ '42.7256' │ 34580   │
│ 1       │ 'multiple users' │ '33.485' │ '30.4172' │ '25.8330' │ '242.5000'  │ '38.4568' │ 32877   │
│ 2       │ 'channel'        │ '35.992' │ '28.4930' │ '24.6670' │ '326.8750'  │ '36.9177' │ 35097   │
│ 3       │ 'mixed'          │ '28.076' │ '36.2055' │ '30.9160' │ '247.0830'  │ '50.5668' │ 27621   │
└─────────┴──────────────────┴──────────┴───────────┴───────────┴─────────────┴───────────┴─────────┘

── Code ────────────────────────────────────────────────────
┌─────────┬────────────────┬──────────┬───────────┬───────────┬────────────┬───────────┬─────────┐
│ (index) │ Task           │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)  │ Samples │
├─────────┼────────────────┼──────────┼───────────┼───────────┼────────────┼───────────┼─────────┤
│ 0       │ 'inline'       │ '28.307' │ '36.4314' │ '30.0410' │ '352.7910' │ '74.7315' │ 27449   │
│ 1       │ 'block'        │ '36.003' │ '28.5580' │ '23.9170' │ '270.4580' │ '42.9933' │ 35017   │
│ 2       │ 'multi inline' │ '29.104' │ '35.2041' │ '29.7910' │ '303.5410' │ '64.8289' │ 28406   │
└─────────┴────────────────┴──────────┴───────────┴───────────┴────────────┴───────────┴─────────┘

── Structured Blocks ───────────────────────────────────────
┌─────────┬───────────────────────────┬──────────┬───────────┬───────────┬─────────────┬────────────┬─────────┐
│ (index) │ Task                      │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)    │ P99 (ms)   │ Samples │
├─────────┼───────────────────────────┼──────────┼───────────┼───────────┼─────────────┼────────────┼─────────┤
│ 0       │ 'ordered list'            │ '22.497' │ '45.7348' │ '37.9160' │ '307.8330'  │ '122.1193' │ 21866   │
│ 1       │ 'unordered list'          │ '23.240' │ '44.2787' │ '37.5830' │ '359.1250'  │ '90.6716'  │ 22585   │
│ 2       │ 'task list'               │ '22.752' │ '44.9961' │ '37.8750' │ '296.3750'  │ '106.8570' │ 22225   │
│ 3       │ 'blockquote'              │ '21.405' │ '48.9327' │ '40.0410' │ '1177.7920' │ '154.5201' │ 20437   │
│ 4       │ 'heading'                 │ '35.555' │ '30.3965' │ '23.4170' │ '1407.0000' │ '96.4672'  │ 32899   │
│ 5       │ 'heading multi-level'     │ '34.379' │ '30.7523' │ '24.4170' │ '1013.2500' │ '89.6990'  │ 32518   │
│ 6       │ 'spoiler'                 │ '26.286' │ '39.7276' │ '32.0830' │ '441.1250'  │ '118.0998' │ 25181   │
│ 7       │ 'spoiler with formatting' │ '23.752' │ '45.0810' │ '35.1250' │ '482.3330'  │ '148.4937' │ 22183   │
└─────────┴───────────────────────────┴──────────┴───────────┴───────────┴─────────────┴────────────┴─────────┘

── KaTeX (Math) ────────────────────────────────────────────
┌─────────┬──────────┬──────────┬───────────┬───────────┬─────────────┬────────────┬─────────┐
│ (index) │ Task     │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)    │ P99 (ms)   │ Samples │
├─────────┼──────────┼──────────┼───────────┼───────────┼─────────────┼────────────┼─────────┤
│ 0       │ 'inline' │ '18.953' │ '57.2044' │ '43.7080' │ '4187.1250' │ '244.7578' │ 17482   │
│ 1       │ 'block'  │ '30.855' │ '35.1737' │ '26.8330' │ '2978.5000' │ '158.5626' │ 28431   │
└─────────┴──────────┴──────────┴───────────┴───────────┴─────────────┴────────────┴─────────┘

── Adversarial / Stress ────────────────────────────────────
┌─────────┬────────────────────────┬─────────┬────────────┬────────────┬─────────────┬────────────┬─────────┐
│ (index) │ Task                   │ ops/sec │ Avg (ms)   │ Min (ms)   │ Max (ms)    │ P99 (ms)   │ Samples │
├─────────┼────────────────────────┼─────────┼────────────┼────────────┼─────────────┼────────────┼─────────┤
│ 0       │ 'adversarial emphasis' │ '7.785' │ '132.4414' │ '108.3750' │ '904.4170'  │ '367.8625' │ 15102   │
│ 1       │ 'adversarial mixed'    │ '5.328' │ '193.2182' │ '157.9580' │ '3082.6250' │ '475.0625' │ 10351   │
│ 2       │ 'repeated specials'    │ '4.613' │ '220.2706' │ '183.7090' │ '631.5420'  │ '442.0701' │ 9080    │
│ 3       │ 'long with formatting' │ '5.550' │ '183.9862' │ '152.0420' │ '576.6670'  │ '404.4875' │ 10871   │
└─────────┴────────────────────────┴─────────┴────────────┴────────────┴─────────────┴────────────┴─────────┘

── Real-World Messages ─────────────────────────────────────
┌─────────┬───────────┬──────────┬───────────┬───────────┬────────────┬────────────┬─────────┐
│ (index) │ Task      │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)   │ Samples │
├─────────┼───────────┼──────────┼───────────┼───────────┼────────────┼────────────┼─────────┤
│ 0       │ 'simple'  │ '23.891' │ '43.6171' │ '35.0830' │ '385.1670' │ '107.5617' │ 22927   │
│ 1       │ 'medium'  │ '18.046' │ '57.4665' │ '46.8750' │ '648.1250' │ '169.7442' │ 17402   │
│ 2       │ 'complex' │ '11.471' │ '90.2881' │ '72.8750' │ '542.2080' │ '259.7923' │ 11076   │
└─────────┴───────────┴──────────┴───────────┴───────────┴────────────┴────────────┴─────────┘

── Timestamps ──────────────────────────────────────────────
┌─────────┬───────────────┬──────────┬───────────┬───────────┬────────────┬───────────┬─────────┐
│ (index) │ Task          │ ops/sec  │ Avg (ms)  │ Min (ms)  │ Max (ms)   │ P99 (ms)  │ Samples │
├─────────┼───────────────┼──────────┼───────────┼───────────┼────────────┼───────────┼─────────┤
│ 0       │ 'unix format' │ '31.239' │ '33.4720' │ '26.7920' │ '385.3340' │ '67.0935' │ 29876   │
└─────────┴───────────────┴──────────┴───────────┴───────────┴────────────┴───────────┴─────────┘

========================================================================
  Done.
========================================================================

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced optimized message format with bidirectional conversion capabilities for efficient storage and display.
    • Added validation and testing tools to measure compression efficiency and verify round-trip format integrity.
    • Enhanced message parsing with streamlined grammar rules and improved handling of inline formatting elements.
  • Tests

    • Added comprehensive test suite covering format conversion, expansion, and round-trip validation scenarios.

@dionisio-bot
Copy link
Contributor

dionisio-bot bot commented Feb 24, 2026

Looks like this PR is not ready to merge, because of the following issues:

  • This PR is missing the 'stat: QA assured' label

Please fix the issues and try again

If you have any trouble, please check the PR guidelines

@changeset-bot
Copy link

changeset-bot bot commented Feb 24, 2026

⚠️ No Changeset found

Latest commit: e870177

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 24, 2026

Walkthrough

A new compact message-parser format module is introduced with bidirectional conversion APIs (expand, compactify, validateRoundtrip). The grammar is refactored to simplify parsing by removing skip-state flags and standardizing value returns. Comprehensive tests verify round-trip integrity and size reduction.

Changes

Cohort / File(s) Summary
Compact Format Implementation
packages/message-parser/src/compact.ts
New module providing bidirectional conversion between verbose AST and compact span-based formats. Exports 8 types (Span, CInline, CInlineNode, CBlock, CLI, CTK, CBigEmoji, CRoot) and 3 public APIs: expand converts compact to verbose AST; compactify converts verbose to compact format; validateRoundtrip performs round-trip validation with size metrics. Includes extensive inline/block element handling and error management.
Grammar Refactoring
packages/message-parser/src/grammar.pegjs
Removes emoji-skip state flags (skipBoldEmoji, skipItalicEmoji, skipInlineEmoji) and simplifies parsing logic. Unifies time/date parsing outputs to use direct string captures instead of array joins. Reworks code chunk and heading parsing with explicit character classes. Reorganizes inline/item parsing alternatives, replaces skip-flag branching with direct returns, and introduces PlainRun helpers. Refactors URL/link handling to use head/tail composition over array concatenation.
Test Suite
packages/message-parser/tests/compact.test.ts
New comprehensive test suite covering expand, compactify, and round-trip functionality. Includes tests for plain text, mentions, formatting (bold/italic/strike), inline code, emoji, headings, code blocks, line breaks, lists, KATEX, timestamps, and real-world message scenarios. Defines helper functions (expectRoundtrip, expectExpand) and verifies size reduction metrics.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested labels

type: feature

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title 'chore: reorganize grammar' is vague and does not accurately represent the main changes. The PR introduces significant new functionality (compact AST format with expand/compactify functions) alongside grammar reorganization. Revise the title to reflect the primary change, such as: 'feat: add compact AST format with bidirectional conversion' or 'feat: add compact message parser format and grammar refactor'.
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 83.23864% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.79%. Comparing base (86af4e2) to head (e870177).

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop   #38959      +/-   ##
===========================================
+ Coverage    70.73%   70.79%   +0.05%     
===========================================
  Files         3195     3196       +1     
  Lines       113105   113535     +430     
  Branches     20510    20663     +153     
===========================================
+ Hits         80010    80376     +366     
- Misses       31049    31113      +64     
  Partials      2046     2046              
Flag Coverage Δ
e2e 60.37% <ø> (-0.04%) ⬇️
e2e-api 47.75% <ø> (-0.10%) ⬇️
unit 71.46% <83.23%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ggazzo ggazzo added this to the 8.3.0 milestone Feb 27, 2026
@ggazzo ggazzo changed the title chore: grammar test chore: reorganize grammar Feb 27, 2026
@ggazzo ggazzo marked this pull request as ready for review February 27, 2026 20:31
@coderabbitai coderabbitai bot added the type: feature Pull requests that introduces new feature label Feb 27, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
packages/message-parser/src/compact.ts (2)

91-93: Consider adding validation for span bounds.

The type guard correctly identifies spans but doesn't validate that indices are non-negative or that start <= end. Invalid spans could cause msg.slice() to return unexpected results (empty string for inverted spans, or from end of string for negative indices).

This is low risk since spans are generated by internal code, but defensive validation could help catch bugs earlier.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 91 - 93, The isSpan type
guard currently checks array shape but not bounds; update isSpan (and any Span
type checks) to also validate that both elements are integers >= 0 and that the
first element is <= the second (start <= end) so downstream uses like
msg.slice() won't receive negative or inverted indices; locate the function
isSpan and add these numeric/ordering checks (and consider using
Number.isInteger on v[0]/v[1]) to return false for invalid spans.

121-131: Type narrowing could be cleaner.

The current approach uses (node as any).c which loses type safety. Consider using a type guard or checking the discriminated property more explicitly.

♻️ Suggested improvement
 case 'a': {
-  if ('c' in node && Array.isArray((node as any).c)) {
-    const rich = node as { t: 'a'; c: CInline[]; s: string };
+  if ('c' in node) {
+    const rich = node as Extract<CInlineNode, { t: 'a'; c: CInline[] }>;
     return {
       type: 'LINK',
       value: {
         src: { type: 'PLAIN_TEXT', value: rich.s } as Plain,
         label: rich.c.map((c) => expandInline(msg, c)) as Markup[],
       },
     } as Link;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 121 - 131, The case 'a'
branch uses (node as any).c which bypasses type safety; replace that check with
a proper type guard or narrower type assertion so TypeScript can infer c exists
and is CInline[]—for example add a predicate like isAnchorNode(node): node is {
t: 'a'; c: CInline[]; s: string } and use it in the if, then cast node to rich
(or let the guard narrow it) before mapping with expandInline to construct the
Link/Plain/Markup objects; update the branch that returns Link to use the
narrowed type instead of any.
packages/message-parser/tests/compact.test.ts (1)

322-339: Consider making benchmark logging conditional.

The console.log statement on line 338 will output on every test run, which may clutter CI logs. Consider gating this behind an environment variable or using a test reporter.

♻️ Suggested improvement
     expect(result.ok).toBe(true);
     expect(result.sizeNew).toBeLessThan(result.sizeOld);

-    console.log(`  ${_label}: ${result.sizeOld}B → ${result.sizeNew}B (${result.reduction})`);
+    if (process.env.DEBUG) {
+      console.log(`  ${_label}: ${result.sizeOld}B → ${result.sizeNew}B (${result.reduction})`);
+    }
   });
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/compact.test.ts` around lines 322 - 339, The
console.log inside the 'size reduction' test is unconditionally printing on
every run; guard that diagnostic output by checking an env var (e.g.,
process.env.SHOW_BENCHMARK) or a test-specific flag before calling console.log
so CI stays clean; locate the console.log in the size reduction describe/it
block in compact.test.ts and wrap it with the conditional check (or replace with
a test reporter/debug logger) so the message only prints when the env var/flag
is set.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/message-parser/src/compact.ts`:
- Around line 268-271: The function advancePast(ctx: Ctx, str: string) currently
silently does nothing if str isn't found; change its signature to return a
boolean (true if found and ctx.pos advanced, false otherwise) and update callers
to check the return value and handle the failure (either propagate an error or
throw at the appropriate parse boundary). Locate advancePast and all call sites
that pass ctx.msg/ctx.pos, implement the boolean return, and add explicit error
handling where a missing delimiter would desynchronize parsing (or throw from
the caller if preferred).

In `@packages/message-parser/tests/compact.test.ts`:
- Around line 24-26: The size-reduction assertion in the test is too strict for
tiny inputs: update the expectation in
packages/message-parser/tests/compact.test.ts to allow equality for edge cases
by replacing expect(result.sizeNew).toBeLessThan(result.sizeOld) with
expect(result.sizeNew).toBeLessThanOrEqual(result.sizeOld), or alternatively add
a minimum message-length guard so the check only runs when the original message
length (or result.sizeOld) exceeds a small threshold; ensure you update the
assertion around the same `result` variable used in the test.

---

Nitpick comments:
In `@packages/message-parser/src/compact.ts`:
- Around line 91-93: The isSpan type guard currently checks array shape but not
bounds; update isSpan (and any Span type checks) to also validate that both
elements are integers >= 0 and that the first element is <= the second (start <=
end) so downstream uses like msg.slice() won't receive negative or inverted
indices; locate the function isSpan and add these numeric/ordering checks (and
consider using Number.isInteger on v[0]/v[1]) to return false for invalid spans.
- Around line 121-131: The case 'a' branch uses (node as any).c which bypasses
type safety; replace that check with a proper type guard or narrower type
assertion so TypeScript can infer c exists and is CInline[]—for example add a
predicate like isAnchorNode(node): node is { t: 'a'; c: CInline[]; s: string }
and use it in the if, then cast node to rich (or let the guard narrow it) before
mapping with expandInline to construct the Link/Plain/Markup objects; update the
branch that returns Link to use the narrowed type instead of any.

In `@packages/message-parser/tests/compact.test.ts`:
- Around line 322-339: The console.log inside the 'size reduction' test is
unconditionally printing on every run; guard that diagnostic output by checking
an env var (e.g., process.env.SHOW_BENCHMARK) or a test-specific flag before
calling console.log so CI stays clean; locate the console.log in the size
reduction describe/it block in compact.test.ts and wrap it with the conditional
check (or replace with a test reporter/debug logger) so the message only prints
when the env var/flag is set.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86af4e2 and e870177.

📒 Files selected for processing (3)
  • packages/message-parser/src/compact.ts
  • packages/message-parser/src/grammar.pegjs
  • packages/message-parser/tests/compact.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: cubic · AI code reviewer
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js}

📄 CodeRabbit inference engine (.cursor/rules/playwright.mdc)

**/*.{ts,tsx,js}: Write concise, technical TypeScript/JavaScript with accurate typing in Playwright tests
Avoid code comments in the implementation

Files:

  • packages/message-parser/tests/compact.test.ts
  • packages/message-parser/src/compact.ts
🧠 Learnings (11)
📓 Common learnings
Learnt from: gabriellsh
Repo: RocketChat/Rocket.Chat PR: 37419
File: packages/i18n/src/locales/en.i18n.json:918-921
Timestamp: 2025-11-19T18:20:07.720Z
Learning: Repo: RocketChat/Rocket.Chat — i18n/formatting
Learning: This repository uses a custom message formatting parser in UI blocks/messages; do not assume standard Markdown rules. For keys like Call_ended_bold, Call_not_answered_bold, Call_failed_bold, and Call_transferred_bold in packages/i18n/src/locales/en.i18n.json, retain the existing single-asterisk emphasis unless maintainers request otherwise.
Learnt from: ahmed-n-abdeltwab
Repo: RocketChat/Rocket.Chat PR: 0
File: :0-0
Timestamp: 2026-02-24T19:05:56.710Z
Learning: In Rocket.Chat PRs, keep feature PRs free of unrelated lockfile-only dependency bumps; prefer reverting lockfile drift or isolating such bumps into a separate "chore" commit/PR, and always use yarn install --immutable with the Yarn version pinned in package.json via Corepack.
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to apps/meteor/tests/e2e/**/*.spec.ts : Utilize Playwright fixtures (`test`, `page`, `expect`) for consistency in test files

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2025-12-10T21:00:54.909Z
Learnt from: KevLehman
Repo: RocketChat/Rocket.Chat PR: 37091
File: ee/packages/abac/jest.config.ts:4-7
Timestamp: 2025-12-10T21:00:54.909Z
Learning: Rocket.Chat monorepo: Jest testMatch pattern '<rootDir>/src/**/*.spec.(ts|js|mjs)' is valid in this repo and used across multiple packages (e.g., packages/tools, ee/packages/omnichannel-services). Do not flag it as invalid in future reviews.

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to apps/meteor/tests/e2e/**/*.spec.ts : Use `expect` matchers for assertions (`toEqual`, `toContain`, `toBeTruthy`, `toHaveLength`, etc.) instead of `assert` statements in Playwright tests

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to apps/meteor/tests/e2e/**/*.spec.ts : Use `test.step()` for complex test scenarios to improve organization in Playwright tests

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to apps/meteor/tests/e2e/**/*.spec.ts : Ensure tests run reliably in parallel without shared state conflicts

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to **/*.spec.ts : Use descriptive test names that clearly communicate expected behavior in Playwright tests

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to apps/meteor/tests/e2e/**/*.spec.ts : Group related tests in the same file

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to **/*.{ts,tsx,js} : Write concise, technical TypeScript/JavaScript with accurate typing in Playwright tests

Applied to files:

  • packages/message-parser/tests/compact.test.ts
📚 Learning: 2026-02-26T19:25:44.063Z
Learnt from: gabriellsh
Repo: RocketChat/Rocket.Chat PR: 38778
File: packages/ui-voip/src/providers/useMediaSession.ts:192-192
Timestamp: 2026-02-26T19:25:44.063Z
Learning: In the Rocket.Chat repository, do not reference Biome lint rules in code review feedback. Biome is not used even if biome.json exists; only reference Biome rules if there is explicit, project-wide usage documented. For TypeScript files, review lint implications without Biome guidance unless the project enables Biome rules.

Applied to files:

  • packages/message-parser/tests/compact.test.ts
  • packages/message-parser/src/compact.ts
📚 Learning: 2026-02-26T19:25:44.063Z
Learnt from: gabriellsh
Repo: RocketChat/Rocket.Chat PR: 38778
File: packages/ui-voip/src/providers/useMediaSession.ts:192-192
Timestamp: 2026-02-26T19:25:44.063Z
Learning: In this repository (RocketChat/Rocket.Chat), Biome lint rules are not used even if a biome.json exists. When reviewing TypeScript files (e.g., packages/ui-voip/src/providers/useMediaSession.ts), ensure lint suggestions do not reference Biome-specific rules. Rely on general ESLint/TypeScript lint rules and project conventions instead.

Applied to files:

  • packages/message-parser/tests/compact.test.ts
  • packages/message-parser/src/compact.ts
🧬 Code graph analysis (1)
packages/message-parser/tests/compact.test.ts (1)
packages/message-parser/src/compact.ts (4)
  • validateRoundtrip (595-615)
  • CRoot (85-85)
  • expand (241-248)
  • compactify (549-569)
🔇 Additional comments (23)
packages/message-parser/src/compact.ts (9)

1-9: LGTM!

Clear module documentation explaining the bidirectional conversion purpose and the three main APIs.


49-86: LGTM!

Compact type definitions are well-structured with appropriate discriminated unions. The short property names effectively reduce serialized size.


177-239: LGTM!

The expandBlock function comprehensively handles all block types with correct reconstruction logic for code blocks and nested structures.


241-248: LGTM!

Clean entry point that correctly handles the BigEmoji special case and delegates to expandBlock.


303-432: LGTM!

The compactInline function correctly handles all inline node types with appropriate delimiter detection and span calculation.


436-468: LGTM with observation.

The block compactification handles all cases correctly. The code block handling assumes the grammar-enforced structure with newlines around content.


549-569: LGTM!

Clean entry point that properly handles the BigEmoji case and block iteration with appropriate newline handling.


575-593: LGTM!

The deepEqual implementation is appropriate for comparing AST structures. Key sorting ensures consistent comparison regardless of property order.


595-615: LGTM!

The validation function provides comprehensive roundtrip testing with useful size metrics for analysis.

packages/message-parser/tests/compact.test.ts (3)

39-150: LGTM!

Comprehensive test coverage for the expand function covering all major inline and block types.


152-207: LGTM!

Good basic coverage for compactify. More complex cases are covered in the roundtrip tests.


209-273: LGTM!

Excellent roundtrip test coverage spanning plain text, emphasis, mentions, code, emoji, links, structured blocks, katex, timestamps, emoticons, and real-world message scenarios.

packages/message-parser/src/grammar.pegjs (11)

97-97: LGTM!

The simplified Unixtime rule with exact 10 digits is appropriate for current and near-future timestamps.


132-134: LGTM!

The CodeChunkChar refactor using character classes is a sound optimization that reduces per-character lookahead while correctly handling backtick sequences.


150-150: LGTM!

Simplified heading text capture is cleaner and functionally equivalent.


253-291: LGTM!

The InlineItemPattern reorganization improves parsing clarity. The PlainRunBeforeEmoticon rule correctly handles the edge case for emoticons like -_-.


298-307: LGTM!

Spoiler parsing simplified by removing unnecessary state flags. The direct return pattern is cleaner.


336-337: LGTM!

The head+tail URL composition pattern is efficient and consistent across MarkdownLinkURL, URL, and AutoLinkURL rules.


439-440: LGTM!

The PlainUnderscoreThenDomain rule correctly handles edge cases where underscore-prefixed domains should be plain text rather than italic markers.


530-593: LGTM!

The simplified emphasis content rules with specialized PlainRun variants correctly exclude characters needed for nested formatting while removing state flag complexity.


625-631: LGTM!

The UserMention rule update correctly excludes usernames ending with __ using a semantic predicate, preventing conflicts with emphasis parsing.


781-781: LGTM!

The InlineCode rule using [^\n]+` character class is more efficient than per-character lookahead.


808-808: LGTM!

The Any rule change to match any non-EndOfLine character is appropriate as a fallback for characters not captured by specialized rules.

Comment on lines +268 to +271
function advancePast(ctx: Ctx, str: string) {
const i = ctx.msg.indexOf(str, ctx.pos);
if (i >= 0) ctx.pos = i + str.length;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Silent failure when string not found could mask bugs.

advancePast silently does nothing when the target string isn't found. This could cause the parser position to become desynchronized, leading to incorrect spans or errors later. Consider throwing or returning a boolean to indicate success.

🛡️ Suggested improvement
 function advancePast(ctx: Ctx, str: string) {
   const i = ctx.msg.indexOf(str, ctx.pos);
-  if (i >= 0) ctx.pos = i + str.length;
+  if (i === -1) {
+    throw new Error(`Expected "${str}" at pos ${ctx.pos} in msg`);
+  }
+  ctx.pos = i + str.length;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function advancePast(ctx: Ctx, str: string) {
const i = ctx.msg.indexOf(str, ctx.pos);
if (i >= 0) ctx.pos = i + str.length;
}
function advancePast(ctx: Ctx, str: string) {
const i = ctx.msg.indexOf(str, ctx.pos);
if (i === -1) {
throw new Error(`Expected "${str}" at pos ${ctx.pos} in msg`);
}
ctx.pos = i + str.length;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 268 - 271, The function
advancePast(ctx: Ctx, str: string) currently silently does nothing if str isn't
found; change its signature to return a boolean (true if found and ctx.pos
advanced, false otherwise) and update callers to check the return value and
handle the failure (either propagate an error or throw at the appropriate parse
boundary). Locate advancePast and all call sites that pass ctx.msg/ctx.pos,
implement the boolean return, and add explicit error handling where a missing
delimiter would desynchronize parsing (or throw from the caller if preferred).

Comment on lines +24 to +26
expect(result.ok).toBe(true);
expect(result.sizeNew).toBeLessThan(result.sizeOld);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Size reduction assertion may be too strict for edge cases.

The assertion expect(result.sizeNew).toBeLessThan(result.sizeOld) assumes compact is always smaller. For very short messages (e.g., single character), the compact format's structural overhead may actually be larger. Consider using toBeLessThanOrEqual or adding a minimum message length threshold.

🛡️ Suggested fix
   expect(result.ok).toBe(true);
-  expect(result.sizeNew).toBeLessThan(result.sizeOld);
+  // For very short messages, compact overhead may equal or slightly exceed verbose
+  expect(result.sizeNew).toBeLessThanOrEqual(result.sizeOld);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
expect(result.ok).toBe(true);
expect(result.sizeNew).toBeLessThan(result.sizeOld);
}
expect(result.ok).toBe(true);
expect(result.sizeNew).toBeLessThanOrEqual(result.sizeOld);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/compact.test.ts` around lines 24 - 26, The
size-reduction assertion in the test is too strict for tiny inputs: update the
expectation in packages/message-parser/tests/compact.test.ts to allow equality
for edge cases by replacing expect(result.sizeNew).toBeLessThan(result.sizeOld)
with expect(result.sizeNew).toBeLessThanOrEqual(result.sizeOld), or
alternatively add a minimum message-length guard so the check only runs when the
original message length (or result.sizeOld) exceeds a small threshold; ensure
you update the assertion around the same `result` variable used in the test.

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/message-parser/src/compact.ts">

<violation number="1" location="packages/message-parser/src/compact.ts:270">
P2: `advancePast` silently no-ops when the target string is not found (`indexOf` returns -1), which leaves `ctx.pos` unchanged. This can silently desynchronize the cursor position, causing all subsequent `spanFor` / `advance` calls to produce incorrect spans. Unlike `spanFor` which throws on missing text, `advancePast` masks the error. Consider throwing an error (consistent with `spanFor`'s behavior) so that bugs in delimiter expectations surface immediately rather than producing subtly corrupted compact output.</violation>

<violation number="2" location="packages/message-parser/src/compact.ts:426">
P1: Bug: `COLOR` case in `compactInline` does not advance `ctx.pos` past the color source text (e.g., `color:#ff0000`), which corrupts span tracking for all subsequent sibling nodes in the same paragraph. Every other inline case advances the position, but this one is missing.</violation>
</file>

<file name="packages/message-parser/src/grammar.pegjs">

<violation number="1" location="packages/message-parser/src/grammar.pegjs:133">
P1: `CodeChunkChar` rejects code lines ending with one or two backticks, causing valid fenced code blocks to fail parsing.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


CodeChunk = text:$(!EndOfLine !"```" .)+ { return plain(text); }
// Charclass avoids per-char lookahead; never consume start of "```"
CodeChunkChar = [^\r\n`] / "`" [^`\r\n] / "`" "`" [^`\r\n]
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: CodeChunkChar rejects code lines ending with one or two backticks, causing valid fenced code blocks to fail parsing.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/message-parser/src/grammar.pegjs, line 133:

<comment>`CodeChunkChar` rejects code lines ending with one or two backticks, causing valid fenced code blocks to fail parsing.</comment>

<file context>
@@ -132,7 +129,9 @@ CodeLine
 
-CodeChunk = text:$(!EndOfLine !"```" .)+ { return plain(text); }
+// Charclass avoids per-char lookahead; never consume start of "```"
+CodeChunkChar = [^\r\n`] / "`" [^`\r\n] / "`" "`" [^`\r\n]
+CodeChunk = text:$(CodeChunkChar)+ { return plain(text); }
 
</file context>
Fix with Cubic

return { t: 'ts', r, f: node.value.format };
}

case 'COLOR':
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Bug: COLOR case in compactInline does not advance ctx.pos past the color source text (e.g., color:#ff0000), which corrupts span tracking for all subsequent sibling nodes in the same paragraph. Every other inline case advances the position, but this one is missing.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/message-parser/src/compact.ts, line 426:

<comment>Bug: `COLOR` case in `compactInline` does not advance `ctx.pos` past the color source text (e.g., `color:#ff0000`), which corrupts span tracking for all subsequent sibling nodes in the same paragraph. Every other inline case advances the position, but this one is missing.</comment>

<file context>
@@ -0,0 +1,615 @@
+			return { t: 'ts', r, f: node.value.format };
+		}
+
+		case 'COLOR':
+			return { t: 'c', v: [node.value.r, node.value.g, node.value.b, node.value.a] };
+
</file context>
Fix with Cubic


function advancePast(ctx: Ctx, str: string) {
const i = ctx.msg.indexOf(str, ctx.pos);
if (i >= 0) ctx.pos = i + str.length;
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: advancePast silently no-ops when the target string is not found (indexOf returns -1), which leaves ctx.pos unchanged. This can silently desynchronize the cursor position, causing all subsequent spanFor / advance calls to produce incorrect spans. Unlike spanFor which throws on missing text, advancePast masks the error. Consider throwing an error (consistent with spanFor's behavior) so that bugs in delimiter expectations surface immediately rather than producing subtly corrupted compact output.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/message-parser/src/compact.ts, line 270:

<comment>`advancePast` silently no-ops when the target string is not found (`indexOf` returns -1), which leaves `ctx.pos` unchanged. This can silently desynchronize the cursor position, causing all subsequent `spanFor` / `advance` calls to produce incorrect spans. Unlike `spanFor` which throws on missing text, `advancePast` masks the error. Consider throwing an error (consistent with `spanFor`'s behavior) so that bugs in delimiter expectations surface immediately rather than producing subtly corrupted compact output.</comment>

<file context>
@@ -0,0 +1,615 @@
+
+function advancePast(ctx: Ctx, str: string) {
+	const i = ctx.msg.indexOf(str, ctx.pos);
+	if (i >= 0) ctx.pos = i + str.length;
+}
+
</file context>
Fix with Cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type: feature Pull requests that introduces new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant