@davesnx
Contributor

@davesnx davesnx commented May 1, 2025

Hi,

This PR might come out of the blue, but there's a good reason for it, which I explain in the "Why" section below.

Also, when I was halfway done, I discovered a stale PR (#791) targeting the same problem, so I thought this work could be helpful to other people too.

Why

We use odoc in Melange's documentation to generate API references for Melange libraries such as Belt, Js, and Stdlib. The rest of the documentation is handled by vitepress, a static site generator that works with Markdown files and has a lot of interesting features.

This works reasonably well, but creates a big barrier between the documentation sites:

  • Articles in the melange.re docs can't be referenced from inside mld
  • Search on the melange.re docs doesn't find anything inside the generated HTML
  • Users are moved from one experience to a different one: melange.re is an SPA, while the generated HTML is an MPA
  • Navigation in the header and sidebar isn't the same and can't be unified
  • and a few more small details, such as the favicon, tracking scripts, consistent syntax highlighting, and all the other features a static site generator may provide

For this reason, I wanted to generate Markdown from odoc and (in the Melange case) have another process that manipulates those Markdown files to expand the documentation.

Another big use case for Markdown generation is GitHub, or really any platform that renders Markdown.
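
To make the "another process" idea concrete, here is a minimal sketch of a post-processing step that prepends vitepress-style YAML front matter to a generated page. The `title` field and the names `front_matter` and `add_front_matter` are illustrative assumptions, not part of odoc or of the Melange docs pipeline.

```ocaml
(* Hypothetical post-processing step: prepend YAML front matter to a
   Markdown page produced by odoc. Nothing here is an odoc API. *)

(* Build a front matter block with a single [title] field. A real step
   would need to quote titles containing YAML special characters. *)
let front_matter ~title = Printf.sprintf "---\ntitle: %s\n---\n\n" title

(* Return the page contents with the front matter placed before them. *)
let add_front_matter ~title markdown = front_matter ~title ^ markdown

let () = print_string (add_front_matter ~title:"Belt" "# Module `Belt`\n")
```

Working on plain strings keeps the step independent of how the Markdown files were produced.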

How

I forked the HTML backend into a new markdown2 backend, removed what's unnecessary, and implemented Markdown construction with cmarkit.

I currently named it markdown2, since odoc_md and doc_of_md already live under the markdown folder. I suggest renaming markdown -> odoc-md (or just md) and markdown2 -> markdown in a new PR.

Notes about implementation

  • References aren't a thing in Markdown headings, so I removed them. I could always render HTML and keep the references, but it would be better to have an option for this. WDYT about an --allow-html flag that enables outputting HTML?
  • The same applies to video/audio, which pure Markdown doesn't support but which can be written as raw HTML, which the CommonMark spec allows. We could fall back to HTML
  • I left a few TODOs in the code; they are mostly doubts/questions about odoc that would benefit from a second pair of eyes
  • Testing has been done in a few cram tests (mostly markdown-with-belt.t and markdown.t). I could probably split them or organise them differently.
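
As a rough illustration of the --allow-html idea for headings, the sketch below renders a heading either as plain Markdown or with an inline HTML anchor. The `allow_html` parameter mirrors the proposed flag, which does not exist in odoc at this point; `render_heading` and the `<a id=...>` form are assumptions made for the example, not the code in this PR.

```ocaml
(* Hypothetical heading renderer. [allow_html] stands in for the proposed
   --allow-html flag; the function name and anchor format are assumptions. *)
let render_heading ~allow_html ~level ~anchor title =
  (* [level] is expected to be between 1 and 6. *)
  let hashes = String.make level '#' in
  match anchor with
  | Some id when allow_html ->
      (* Keep the reference target by emitting a raw HTML anchor. *)
      Printf.sprintf "%s <a id=\"%s\"></a>%s" hashes id title
  | _ ->
      (* Pure Markdown: the anchor is dropped. *)
      Printf.sprintf "%s %s" hashes title

let () =
  print_endline
    (render_heading ~allow_html:true ~level:2 ~anchor:(Some "types") "Types");
  print_endline
    (render_heading ~allow_html:false ~level:2 ~anchor:(Some "types") "Types")
```

With the flag off, the output stays valid for renderers that strip raw HTML, at the cost of losing the link target.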

Example

melange-re/melange-re.github.io#224

Future work

davesnx added 28 commits March 7, 2025 13:36
* 'master' of github.com:/ocaml/odoc: (31 commits)
  Update docs
  Parser: Ensure parser can be called concurrently
  Parser: Fixes following PR review
  Parser: more tests
  Parser: Allow \ddd escape sequence in code-block metadata
  Parser: Use lexer for quoted strings in code block metadata
  Simplify code-block tag types and parser
  Warn on escaped character that does not need escaping
  Tag parsing: Unescape everything
  Make ocamlformat ignore cppo file
  Fix odoc_of_md wrt new tag type
  Fix typo
  Disable extract-code on OCaml < 4.10
  Extract code: handle error
  Add location to the whole tag block
  Add location to binding key and value
  Add location for tag/binding
  Fix unescaping of quoted strings
  Parse code block tags
  OCaml 4.14 compatibility
  ...
davesnx added 5 commits June 6, 2025 13:23
…-cmarkit

* 'markdown-output' of github.com:davesnx/odoc:
  Remove copyright from markdown generation
  Use cppo to hide Generator from dune build
  remove 'tree' from cram
  Install cmarkit on 4.14
  Expose Config and Generator
Implement our own markdown rendering
@davesnx
Contributor Author

davesnx commented Jun 6, 2025

I updated this PR with davesnx#1, which removes the dependency on the cmarkit library. I worked on a separate branch to make the review easier. I added an attribution comment, but should any further attribution be added to the LICENSE?

Comment on lines +43 to +45
```ocaml
let x = 42
```
Contributor Author

I'm not very familiar with source-impl, but I wanted to add the implementation in case it was helpful somehow. It would be nice to ensure this is correct.

@jonludlam
Member

I was playing around with this and found that the includes don't make it to the output. For example, with the file foo.mli containing:

module type X = sig
  type t = int
end

module type T = sig
  include X
end

the output file Foo-module-type-T.md just contains:

# Module type `Foo.T`
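
For context on what the page for T would be expected to show: `include X` splices the items of X into T, so T contains `type t = int`. The sketch below only checks this at the language level; the module `M` and the value `answer` are made up for the illustration, and it says nothing about how odoc renders the include.

```ocaml
module type X = sig
  type t = int
end

module type T = sig
  include X
end

(* [M] matches [T] only because [T] carries [type t = int] from [X]. *)
module M : T = struct
  type t = int
end

(* [M.t] is known to be [int] through the included item. *)
let answer : M.t = 42

let () = Printf.printf "%d\n" answer
```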

@jonludlam
Member

I've also been playing with Claude Code, so there's a commit fixing it on my fork here: jonludlam@926cca1 - do with that what you will :-)

@davesnx
Contributor Author

davesnx commented Jul 2, 2025

I can't reproduce the issue, even though the code might make sense. Pushed a cram update here: df7f5f1

@jonludlam
Member

I've dumped the contents of the two new modules in the test - do you see what I mean now?

@davesnx
Contributor Author

davesnx commented Jul 3, 2025

Gotcha, includes on their respective pages

@jonludlam
Member

Merged!

@jonludlam jonludlam closed this Jul 9, 2025
@jonludlam jonludlam mentioned this pull request Jul 9, 2025
7 tasks
jonludlam added a commit to jonludlam/opam-repository that referenced this pull request Jul 10, 2025
CHANGES:

### Added
- Exposed sherlodoc libraries for use in other projects (@jonludlam, ocaml/odoc#1349)
- OCaml 5.4.0 support (@Octachron, ocaml/odoc#1355)
- New arguments to LaTeX generator, --shorten-beyond-depth and
  --remove-functor-arg-link (@Octachron, ocaml/odoc#1337)
- New experimental markdown generator (@davesnx, ocaml/odoc#1341)

### Changed
- Remove cmdliner compatibility layer, no longer needed (@dbuenzli, ocaml/odoc#1328)
- Drop support for OCaml < 4.08 (@jonludlam, ocaml/odoc#1300)
- Allow referencing libraries from package added in `odoc-config.sexp`
  (@panglesd, ocaml/odoc#1343)
- Use full path in heading labels in LaTeX backend (@Octachron, ocaml/odoc#1332)
- Separate page from anchor in LaTeX labels to prevent collisions (@Octachron,
  ocaml/odoc#1337)

### Fixed
- Fix bug in parsing META files when there are no dependencies (@jonludlam, ocaml/odoc#1352)
- Fix ocaml/odoc#1335 - incorrect rendering when on medium screen size with no global
  sidebar (@lukemaurer, ocaml/odoc#1361)
- Fixed generation of occurrences for docs CI (@jonludlam, ocaml/odoc#1362)
@@ -0,0 +1,608 @@
(* This module is based on cmarkit (https://github.com/dbuenzli/cmarkit) which is distributed under the ISC License. *)

Contributor

Could this please be properly attributed? It's a bit baffling: I publish the code of my projects under one of the simplest licenses to abide by, and people still fail to comply with it when they cut and paste or modify the code. The terms here seem pretty clear (emphasis is mine):

Copyright (c) 2020 The cmarkit programmers

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

Is it that complicated?

Now I'm extra nice and won't complain if you add a simple SPDX license identifier rather than the full permission notice.

Basically, just retain the simple header you can find in every one of the sources you copy from. So that would be, at your formatting convenience:

(*---------------------------------------------------------------------------
   Copyright (c) 2020 The cmarkit programmers. All rights reserved.
   SPDX-License-Identifier: ISC
  ---------------------------------------------------------------------------*)

Or if you prefer:

(* Part of this code is:
   Copyright (c) 2020 The cmarkit programmers. All rights reserved.
   SPDX-License-Identifier: ISC *)

Member

I'm really sorry, Daniel, you're right. I'll fix this.

jonludlam added a commit to jonludlam/opam-repository that referenced this pull request Jul 15, 2025
CHANGES:

### Added
- Exposed sherlodoc libraries for use in other projects (@jonludlam, ocaml/odoc#1349)
- OCaml 5.4.0 support (@Octachron, ocaml/odoc#1355)
- New arguments to LaTeX generator, --shorten-beyond-depth and
  --remove-functor-arg-link (@Octachron, ocaml/odoc#1337)
- New experimental markdown generator (@davesnx, ocaml/odoc#1341)

### Changed
- Remove cmdliner compatibility layer, no longer needed (@dbuenzli, ocaml/odoc#1328)
- Drop support for OCaml < 4.08 (@jonludlam, ocaml/odoc#1300)
- Allow referencing libraries from package added in `odoc-config.sexp`
  (@panglesd, ocaml/odoc#1343)
- Use full path in heading labels in LaTeX backend (@Octachron, ocaml/odoc#1332)
- Separate page from anchor in LaTeX labels to prevent collisions (@Octachron,
  ocaml/odoc#1337)

### Fixed
- Fix bug in parsing META files when there are no dependencies (@jonludlam, ocaml/odoc#1352)
- Fix ocaml/odoc#1335 - incorrect rendering when on medium screen size with no global
  sidebar (@lukemaurer, ocaml/odoc#1361)
- Fixed generation of occurrences for docs CI (@jonludlam, ocaml/odoc#1362)
- Partial fix for ocaml/odoc#1369 - ensure that we never create a link to a hidden page
  (@jonludlam, ocaml/odoc#1370)
@dbuenzli dbuenzli mentioned this pull request Oct 15, 2025