Commit 393c0e8

Add comparison with GitHub Flavored Markdown spec (#1550)
I compiled the list with help from Claude. I tested some common elements manually but not all of them. That said, I think this would still be a good comparison reference to have.
1 parent 345c840 commit 393c0e8

2 files changed: +185 -3 lines changed


doc/markup_reference/markdown.md

Lines changed: 104 additions & 3 deletions
@@ -98,7 +98,9 @@ Use triple backticks with an optional language identifier:
 end
 ```
 
-Supported language for syntax highlighting: `ruby`, `rb` (alias to `ruby`), and `c`.
+Supported languages for syntax highlighting: `ruby` (and `rb` alias) with server-side
+highlighting, and `c`, `bash`/`sh`/`shell`/`console` with client-side JavaScript highlighting.
+Other info strings are accepted and added as a CSS class but receive no highlighting.
 
 ### Blockquotes
 
@@ -420,6 +422,9 @@ For example:
 * [Link to Blockquotes](#blockquotes)
 * [Link to Anchor Links](#anchor-links)
 
+When multiple headings produce the same anchor, RDoc appends `-1`, `-2`, etc.
+to subsequent duplicates, matching GitHub's behavior.
+
 ## Footnotes
 
 ### Reference Footnotes
@@ -535,7 +540,7 @@ See [rdoc.rdoc](rdoc.rdoc) for complete directive documentation.
 | Headings | `= Heading` | `# Heading` |
 | Bold | `*word*` | `**word**` |
 | Italic | `_word_` | `*word*` |
-| Monospace | `+word+` | `` `word` `` |
+| Monospace | `+word+` or `` `word` `` | `` `word` `` |
 | Links | `{text}[url]` | `[text](url)` |
 | Code blocks | Indent beyond margin | Indent 4 spaces or fence |
 | Block quotes | `>>>` | `>` |
@@ -551,8 +556,104 @@ See [rdoc.rdoc](rdoc.rdoc) for complete directive documentation.
 
 3. **Footnotes are collapsed** - Multiple paragraphs in a footnote become a single paragraph.
 
-4. **Syntax highlighting** - Only `ruby` and `c` are supported for fenced code blocks.
+4. **Syntax highlighting** - Only `ruby`/`rb` (server-side) and `c`, `bash`/`sh`/`shell`/`console` (client-side) receive syntax highlighting. Other info strings are accepted but not highlighted.
 
 5. **Fenced code blocks** - Only triple backticks are supported. Tilde fences (`~~~`) are not supported as they conflict with strikethrough syntax. Four or more backticks for nesting are also not supported.
 
 6. **Auto-linking** - RDoc automatically links class and method names in output, even without explicit link syntax.
+
+## Comparison with GitHub Flavored Markdown (GFM)
+
+This section compares RDoc's Markdown implementation with the
+[GitHub Flavored Markdown Spec](https://github.github.com/gfm/) (Version 0.29-gfm, 2019-04-06).
+
+### Block Elements
+
+| Feature | GFM | RDoc | Notes |
+|---------|:---:|:----:|-------|
+| ATX Headings (`#`) | ✅ | ✅ | Both support levels 1-6, optional closing `#` |
+| Setext Headings | ✅ | ✅ | `=` for H1, `-` for H2 |
+| Paragraphs | ✅ | ✅ | Full match |
+| Indented Code Blocks | ✅ | ✅ | 4 spaces or 1 tab |
+| Fenced Code (backticks) | ✅ 3+ | ⚠️ 3 only | RDoc doesn't support 4+ backticks for nesting |
+| Fenced Code (tildes) | ✅ `~~~` | ❌ | Conflicts with strikethrough syntax |
+| Info strings (language) | ✅ any | ⚠️ limited | `ruby`/`rb`, `c`, and `bash`/`sh`/`shell`/`console` highlighted; others accepted as CSS class |
+| Blockquotes | ✅ | ✅ | Full match, nested supported |
+| Lazy Continuation | ✅ | ⚠️ | Continuation text is included in blockquote but line break is lost (becomes a space) |
+| Bullet Lists | ✅ | ✅ | `*`, `+`, `-` supported |
+| Ordered Lists | ✅ `.` `)` | ⚠️ `.` only | RDoc doesn't support `)` delimiter; numbers are always renumbered from 1 |
+| Nested Lists | ✅ | ✅ | 4-space indentation |
+| Tables | ✅ | ✅ | Full alignment support |
+| Thematic Breaks | ✅ | ✅ | `---`, `***`, `___` |
+| HTML Blocks | ✅ 7 types | ⚠️ | See below |
+
+#### HTML Blocks
+
+GFM defines 7 types of HTML blocks:
+
+| Type | Description | GFM | RDoc | Notes |
+|------|-------------|:---:|:----:|-------|
+| 1 | `<script>`, `<pre>` | ✅ | | |
+| 1 | `<style>` | ✅ | ❌ | Available via `css` extension (disabled by default) |
+| 2 | HTML comments `<!-- -->` | ✅ | | |
+| 3 | Processing instructions `<? ?>` | ✅ | | |
+| 4 | Declarations `<!DOCTYPE>` | ✅ | | |
+| 5 | CDATA `<![CDATA[ ]]>` | ✅ | | |
+| 6 | Block-level tags | ✅ | ⚠️ | |
+| 7 | Any complete open/close tag | ✅ | | |
+
+RDoc uses a whitelist of block-level tags defined in
+[lib/rdoc/markdown.kpeg](https://github.com/ruby/rdoc/blob/master/lib/rdoc/markdown.kpeg)
+(see `HtmlBlockInTags`). HTML5 semantic elements like `<article>`, `<section>`,
+`<nav>`, `<header>`, `<footer>` are not supported.
+
+### Inline Elements
+
+| Feature | GFM | RDoc | Notes |
+|---------|:---:|:----:|-------|
+| Emphasis `*text*` `_text_` | ✅ | ⚠️ | Intraword emphasis not supported (see [Notes](#notes-and-limitations)) |
+| Strong `**text**` `__text__` | ✅ | ✅ | Full match |
+| Combined `***text***` | ✅ | ✅ | Full match |
+| Code spans | ✅ | ✅ | Multiple backticks supported |
+| Inline links | ✅ | ✅ | Full match |
+| Reference links | ✅ | ✅ | Full match |
+| Link titles | ✅ | ⚠️ | Parsed but not rendered |
+| Images | ✅ | ✅ | Full match |
+| Autolinks `<url>` | ✅ | ✅ | Full match |
+| Hard line breaks | ✅ | ⚠️ | 2+ trailing spaces only; backslash `\` at EOL not supported |
+| Backslash escapes | ✅ | ⚠️ | Subset of GFM's escapable characters (e.g., `~` not escapable) |
+| HTML entities | ✅ | ✅ | Named, decimal, hex |
+| Inline HTML | ✅ | ⚠️ | `<b>` converted to `<strong>`, `<i>` to `<em>`; `<strong>` itself is escaped |
+
+### GFM Extensions
+
+| Feature | GFM | RDoc | Notes |
+|---------|:---:|:----:|-------|
+| Strikethrough `~~text~~` | ✅ | ✅ | Full match |
+| Task Lists `[ ]` `[x]` | ✅ | ❌ | Not supported |
+| Extended Autolinks | ✅ | ⚠️ | See below |
+| Disallowed Raw HTML | ✅ | ❌ | No security filtering |
+
+#### GFM Extended Autolinks
+
+GFM automatically converts certain text patterns into links without requiring
+angle brackets (`<>`). RDoc also auto-links URLs and `www.` prefixes through
+its cross-reference system, but the behavior differs from GFM.
+
+GFM recognizes these patterns:
+
+- `www.example.com` — text starting with `www.` followed by a valid domain
+- `https://example.com` — URLs starting with `http://` or `https://`
+- `user@example.com` — valid email addresses
+
+RDoc auto-links `www.` prefixes and `http://`/`https://` URLs similarly to GFM.
+However, bare email addresses like `user@example.com` are not auto-linked;
+use `<user@example.com>` instead.
+
+### RDoc-Specific Features (not in GFM)
+
+- [Definition Lists](#definition-lists)
+- [Footnotes](#footnotes)
+- [Cross-references](#cross-references)
+- [Anchor Links](#anchor-links)
+- [Directives](#directives)
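The duplicate-anchor rule this diff documents (the first heading keeps its anchor, later duplicates get `-1`, `-2`, and so on) can be illustrated with a short Ruby sketch. This is not RDoc's implementation: `dedupe_anchors` is a hypothetical helper, and it assumes the anchors have already been computed as plain strings.

```ruby
# Hypothetical helper, not part of RDoc: illustrates the documented rule that
# the first heading keeps its anchor and later duplicates get -1, -2, ...
def dedupe_anchors(anchors)
  seen = Hash.new(0) # anchor => number of times it has already been emitted
  anchors.map do |anchor|
    count = seen[anchor]
    seen[anchor] += 1
    count.zero? ? anchor : "#{anchor}-#{count}"
  end
end

p dedupe_anchors(%w[setup usage setup setup])
# => ["setup", "usage", "setup-1", "setup-2"]
```

The sketch deliberately ignores the case where a heading's own anchor already looks like `setup-1`; the commit does not say how RDoc resolves that collision.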

test/rdoc/rdoc_markdown_test.rb

Lines changed: 81 additions & 0 deletions
@@ -1411,4 +1411,85 @@ def parse(text)
     @parser.parse text
   end
 
+  def render(markdown_source)
+    @to_html.convert(parse(markdown_source))
+  end
+
+  def test_atx_heading_closing_hashes_stripped
+    html = render("## Heading ##\n")
+    assert_match(%r{<h2.*>.*Heading.*</h2>}, html)
+    assert_not_match(/##/, html.gsub(/<[^>]+>/, "").strip)
+  end
+
+  def test_fenced_code_4_backticks_not_supported
+    html = render("````\ncode\n````\n")
+    assert_not_match(%r{<pre>code\n</pre>}, html)
+  end
+
+  def test_tilde_is_strikethrough_not_fence
+    html = render("~~~\ncode\n~~~\n")
+    assert_not_match(%r{<pre>code\n</pre>}, html)
+
+    html = render("~~strike~~\n")
+    assert_match(%r{<del>strike</del>}, html)
+  end
+
+  def test_info_string_css_classes
+    assert_match(/class="ruby"/, render("```rb\ndef hello; end\n```\n"))
+    assert_match(/class="c"/, render("```c\nint main() {}\n```\n"))
+    assert_match(/class="bash"/, render("```bash\necho hello\n```\n"))
+    assert_match(/class="python"/, render("```python\nprint('hi')\n```\n"))
+  end
+
+  def test_lazy_continuation_in_blockquote
+    html = render("> Foo\nBar\n")
+    assert_match(%r{<blockquote>.*Foo.*Bar.*</blockquote>}m, html)
+    assert_match(%r{Foo Bar}, html)
+  end
+
+  def test_ordered_list_paren_delimiter_not_supported
+    html = render("1) first\n2) second\n")
+    assert_not_match(%r{<ol>}, html)
+  end
+
+  def test_style_block_not_supported
+    html = render("<style>body { color: red; }</style>\n")
+    assert_not_match(%r{<style>}, html)
+  end
+
+  def test_inline_html_tag_conversion
+    assert_match(%r{<strong>bold</strong>}, render("This has <b>bold</b> HTML.\n"))
+    assert_match(%r{<em>emphasized</em>}, render("This has <em>emphasized</em> HTML.\n"))
+
+    html = render("This has <strong>bold</strong> HTML.\n")
+    assert_match(/&lt;strong&gt;/, html)
+  end
+
+  def test_link_title_not_rendered
+    html = render('[text](https://example.com "My Title")' + "\n")
+    assert_match(%r{<a href="https://example.com">text</a>}, html)
+    assert_not_match(/My Title/, html)
+  end
+
+  def test_task_list_not_supported
+    html = render("- [ ] unchecked\n- [x] checked\n")
+    assert_not_match(%r{<input}, html)
+  end
+
+  def test_autolinks
+    assert_match(%r{<a href.*www\.example\.com}, render("Visit www.example.com for help.\n"))
+    assert_match(%r{<a href="https://example\.com"}, render("Visit https://example.com for help.\n"))
+    assert_not_match(%r{<a href="mailto:user@example\.com"}, render("Contact user@example.com for help.\n"))
+  end
+
+  def test_backslash_line_break_not_supported
+    html = render("Line one\\\nLine two\n")
+    assert_not_match(%r{<br>}, html)
+  end
+
+  def test_escape_tilde_not_supported
+    html = render("\\~not escaped\n")
+    assert_match(/\\~/, html)
+  end
+
 end
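The info-string behaviour asserted in `test_info_string_css_classes`, together with the highlighting split described in the documentation change, can be summarised as a small lookup. The following is only a sketch: `css_class_for` and `highlighting_for` are hypothetical names, not RDoc methods, and the alias table covers just the `rb` to `ruby` case the test exercises. The commit does not state which CSS class `sh`, `shell`, or `console` receive, so the pass-through for them is an assumption.

```ruby
# Hypothetical summary of the documented behaviour; this is not RDoc's code.
CSS_CLASS_ALIASES = { "rb" => "ruby" }.freeze   # only alias confirmed by the test
SERVER_SIDE = %w[ruby rb].freeze                # highlighted when HTML is generated
CLIENT_SIDE = %w[c bash sh shell console].freeze # highlighted by JavaScript in the browser

# CSS class placed on the rendered code block (pass-through unless aliased).
def css_class_for(info_string)
  CSS_CLASS_ALIASES.fetch(info_string, info_string)
end

# Which highlighter, if any, handles the given info string.
def highlighting_for(info_string)
  return :server_side if SERVER_SIDE.include?(info_string)
  return :client_side if CLIENT_SIDE.include?(info_string)
  :none
end

%w[rb c bash python].each do |lang|
  puts "#{lang}: class=#{css_class_for(lang)} highlighting=#{highlighting_for(lang)}"
end
```

The `python` case mirrors the last assertion in the test: the info string is kept as a CSS class even though nothing highlights it.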
