Skip to content

feat: add configurable output formats for inspection reports#655

Open
bendichter wants to merge 3 commits intodevfrom
output-md
Open

feat: add configurable output formats for inspection reports#655
bendichter wants to merge 3 commits intodevfrom
output-md

Conversation

@bendichter
Copy link
Copy Markdown
Contributor

Add support for multiple report output formats based on file extension:

  • .md files use Markdown format with #/##/### headings
  • .html/.htm files use styled HTML with color-coded importance levels
  • All other extensions default to RST format

Introduce MessageFormatter base class with RstFormatter, MarkdownFormatter, and HtmlFormatter implementations for programmatic use.

Update CLI and library documentation with usage examples.

Closes #153

Add support for multiple report output formats based on file extension:
- .md files use Markdown format with #/##/### headings
- .html/.htm files use styled HTML with color-coded importance levels
- All other extensions default to RST format

Introduce MessageFormatter base class with RstFormatter, MarkdownFormatter,
and HtmlFormatter implementations for programmatic use.

Update CLI and library documentation with usage examples.

Closes #153
Copy link
Copy Markdown
Collaborator

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, this took me a while to look at because I did not know this section of the code base well.

My first concern is about purpose. Why are we adding HTML and Markdown as output formats? Currently, the Markdown and HTML formatters are mostly just RST embedded into those two formats. The section headers change per format, but the actual message bodies (the important part of the report) come out as RST-style plain text in all three. If we just need a functional version of those formats, I think we can use pypandoc as an optional extra in pyproject.toml and delegate the conversion from RST to pandoc. This is actually what the original issue (#153) suggested. It produces more idiomatic Markdown and HTML and costs us nothing in terms of maintenance. If we want something more specific and tailored, we can do that in fhe future but we take this as a baseline. What do you think?

My second line of thinking is about class design. Currently, the base class has the method _add_subsection, whose name is rather obscure and whose responsibilities are overloaded. It does the recursive tree traversal to enumerate sections, but it also handles all leaf-level message formatting and the RST-specific grouping/collapsing logic ("and N other files"). I think the only thing the base class needs is the tree traversal to enumerate sections. The rest (how messages are rendered at the leaves, whether grouping happens) can be deferred to the child classes via an overridable method like _format_leaf_messages. I don't think the refactoring is necessary for this PR, but I wanted to discuss it with you. I can do it in another PR.

To make the point about non-idiomatic HTML concrete, here is a list of issues with the current HTML formatter as extracted by an agent:

  • Messages render as plain text inside HTML. _add_subsection outputs lines like 0.0.1 sub-01.nwb: check_name - 'TimeSeries'... with no HTML tags wrapping them. The CSS defines .message, .message-header, .message-content classes but they are never used because the base class emits plain text, not HTML elements.
  • The summary is styled but the content is not. _format_report_summary produces proper HTML with CSS classes, but the actual inspection messages come from the inherited _add_subsection as plain text. The result is a nicely styled header followed by a wall of unstyled text.
  • It inherits RST collapsing. The "and N other files" logic runs for HTML output. HTML should either show all messages or use collapsible <details> elements, not RST-style collapsed text wrapped in heading tags.
  • 30+ lines of hardcoded inline CSS in _get_report_prefix with no way to customize without subclassing and overriding the entire prefix.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 80.51948% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.67%. Comparing base (60bfe7f) to head (7068496).
⚠️ Report is 1 commits behind head on dev.

Files with missing lines Patch % Lines
src/nwbinspector/_formatting.py 88.57% 8 Missing ⚠️
src/nwbinspector/_nwbinspector_cli.py 0.00% 7 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##              dev     #655      +/-   ##
==========================================
+ Coverage   79.87%   83.67%   +3.80%     
==========================================
  Files          48       48              
  Lines        1873     1930      +57     
==========================================
+ Hits         1496     1615     +119     
+ Misses        377      315      -62     
Files with missing lines Coverage Δ
src/nwbinspector/_nwbinspector_cli.py 0.00% <0.00%> (ø)
src/nwbinspector/_formatting.py 82.63% <88.57%> (+12.28%) ⬆️

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bendichter
Copy link
Copy Markdown
Contributor Author

ok, fair point that we can offload this to pypandoc. I'm not so much interested in .doc. I'm actually much more interested in html, because I want to render inspection reports and potentially include data visualizations like the neurosift time alignment vis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: specifiable output format

3 participants