I've seen some documents in the wild that have a <!-- ... --> comment block before the first html node (but after <!doctype>). I'm not sure whether that's "valid", but it's annoying that it causes an error. Since it's only a comment, I could definitely see the error being given a restart that ignores any nodes before the actual start of the document. Alternatively, those nodes could be stored and handled once the document has been created?
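
To make the second option concrete, here is a rough sketch of the "store them and handle them later" idea. It's in Python purely for illustration and isn't tied to this parser's internals. The name `split_prologue` and the regex-based approach are my own assumptions, not an existing API. It peels off the doctype and any leading comments so they can be kept aside, and leaves the rest of the input starting at the first real node.

```python
import re

# Hypothetical helper, not part of any library: matches an optional doctype
# followed by any run of comments and whitespace at the very start of the input.
_PROLOGUE = re.compile(
    r"\s*(?P<doctype><!doctype[^>]*>)?(?P<misc>(?:\s*<!--.*?-->)*)\s*",
    re.IGNORECASE | re.DOTALL,
)
_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def split_prologue(source):
    """Return (doctype, leading_comments, rest).

    `leading_comments` holds the text of each comment found before the first
    real node, so a caller can attach them once the document exists, or
    simply drop them.
    """
    match = _PROLOGUE.match(source)  # the pattern can match empty, so never None
    doctype = match.group("doctype") or ""
    comments = _COMMENT.findall(match.group("misc"))
    rest = source[match.end():]
    return doctype, comments, rest


if __name__ == "__main__":
    page = "<!doctype html>\n<!-- generated by some tool -->\n<html><body></body></html>"
    doctype, comments, rest = split_prologue(page)
    print(doctype)
    print(comments)
    print(rest)
```

The ignoring option would be the same thing with the `comments` list thrown away. A real fix would presumably live inside the parser rather than in a preprocessing step like this.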