Hi!
In the Text variant of a Node, the text is stored as-is from the source code of the HTML file. This means that a source such as a &gt; b would be represented as Node::Text("a &gt; b"), rather than Node::Text("a > b"). While this does make sense for performance reasons, I feel like this might be unintuitive for users. The Node data type is for manipulating HTML after it has been parsed into an abstract syntax tree, but here the Text variant stores the text unprocessed from the file, rather than storing what the text represents.
Additionally, this means that one could easily construct a Node::Text by mistake that contains HTML fragments which, when serialized, yield either invalid HTML or something that parses to a different tree structure (for example, Node::Text("a > b") or Node::Text("a <img> b")).
From what I can see, a solution to this problem would simply be to add a dependency such as html-escape, call decode_html_entities in the parse function, and call encode_html_entities in the Htmlifiable::html implementation.
(All of this applies to attribute values as well.)
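To make the proposal concrete, here is a minimal sketch of the round trip I have in mind. It is not the crate's actual code: the `Node` enum below is a simplified stand-in, and `escape_text`, `unescape_text`, and `parse_text` are hypothetical helpers written by hand so the example has no dependencies. It only handles `&`, `<`, and `>`; a real implementation would need the full set of named and numeric character references (which is why I suggested a crate), plus quote escaping for attribute values.

```rust
/// Hypothetical, simplified stand-in for the crate's `Node` type.
#[derive(Debug, PartialEq)]
enum Node {
    Text(String),
}

/// Escape the characters that are significant in HTML text content.
fn escape_text(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for c in raw.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(c),
        }
    }
    out
}

/// Decode only the three entities that `escape_text` produces.
fn unescape_text(src: &str) -> String {
    // `&amp;` is replaced last so that "&amp;lt;" decodes to "&lt;", not "<".
    src.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&amp;", "&")
}

/// Parsing side (assumption): store what the text represents, not the raw source.
fn parse_text(src: &str) -> Node {
    Node::Text(unescape_text(src))
}

impl Node {
    /// Serializing side (assumption): escape on the way out so the output
    /// always parses back to the same text node.
    fn html(&self) -> String {
        match self {
            Node::Text(t) => escape_text(t),
        }
    }
}
```

With this, `parse_text("a &gt; b")` yields `Node::Text("a > b")`, and a hand-built `Node::Text("a <img> b")` serializes to `a &lt;img&gt; b` instead of producing an `<img>` element.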