feat: add configurable output formats for inspection reports#655
feat: add configurable output formats for inspection reports#655bendichter wants to merge 3 commits intodevfrom
Conversation
Add support for multiple report output formats based on file extension: - .md files use Markdown format with #/##/### headings - .html/.htm files use styled HTML with color-coded importance levels - All other extensions default to RST format Introduce MessageFormatter base class with RstFormatter, MarkdownFormatter, and HtmlFormatter implementations for programmatic use. Update CLI and library documentation with usage examples. Closes #153
for more information, see https://pre-commit.ci
h-mayorquin
left a comment
There was a problem hiding this comment.
Hi, this took me a while to look at because I did not know this section of the code base well.
My first concern is about purpose. Why are we adding HTML and Markdown as output formats? Currently, the Markdown and HTML formatters are mostly just RST embedded into those two formats. The section headers change per format, but the actual message bodies (the important part of the report) come out as RST-style plain text in all three. If we just need a functional version of those formats, I think we can use pypandoc as an optional extra in pyproject.toml and delegate the conversion from RST to pandoc. This is actually what the original issue (#153) suggested. It produces more idiomatic Markdown and HTML and costs us nothing in terms of maintenance. If we want something more specific and tailored, we can do that in fhe future but we take this as a baseline. What do you think?
My second line of thinking is about class design. Currently, the base class has the method _add_subsection, whose name is rather obscure and whose responsibilities are overloaded. It does the recursive tree traversal to enumerate sections, but it also handles all leaf-level message formatting and the RST-specific grouping/collapsing logic ("and N other files"). I think the only thing the base class needs is the tree traversal to enumerate sections. The rest (how messages are rendered at the leaves, whether grouping happens) can be deferred to the child classes via an overridable method like _format_leaf_messages. I don't think the refactoring is necessary for this PR, but I wanted to discuss it with you. I can do it in another PR.
To make the point about non-idiomatic HTML concrete, here is a list of issues with the current HTML formatter as extracted by an agent:
- Messages render as plain text inside HTML.
_add_subsectionoutputs lines like0.0.1 sub-01.nwb: check_name - 'TimeSeries'...with no HTML tags wrapping them. The CSS defines.message,.message-header,.message-contentclasses but they are never used because the base class emits plain text, not HTML elements. - The summary is styled but the content is not.
_format_report_summaryproduces proper HTML with CSS classes, but the actual inspection messages come from the inherited_add_subsectionas plain text. The result is a nicely styled header followed by a wall of unstyled text. - It inherits RST collapsing. The "and N other files" logic runs for HTML output. HTML should either show all messages or use collapsible
<details>elements, not RST-style collapsed text wrapped in heading tags. - 30+ lines of hardcoded inline CSS in
_get_report_prefixwith no way to customize without subclassing and overriding the entire prefix.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev #655 +/- ##
==========================================
+ Coverage 79.87% 83.67% +3.80%
==========================================
Files 48 48
Lines 1873 1930 +57
==========================================
+ Hits 1496 1615 +119
+ Misses 377 315 -62
🚀 New features to boost your workflow:
|
|
ok, fair point that we can offload this to pypandoc. I'm not so much interested in .doc. I'm actually much more interested in html, because I want to render inspection reports and potentially include data visualizations like the neurosift time alignment vis |
Add support for multiple report output formats based on file extension:
Introduce MessageFormatter base class with RstFormatter, MarkdownFormatter, and HtmlFormatter implementations for programmatic use.
Update CLI and library documentation with usage examples.
Closes #153