Docx Renderer Extension
flexmark-java Docx-Renderer extension
Renders the parsed Markdown AST to docx format using the docx4j library.
See the DocxConverterCommonMark Sample for code and Customizing Docx Rendering for an overview and information on customizing the styles.
Pegdown version can be found in DocxConverterPegdown Sample
Set `EmojiExtension.USE_SHORTCUT_TYPE` to `EmojiShortcutType.GITHUB` or
`EmojiShortcutType.ANY_GITHUB_PREFERRED`, which causes GitHub-provided images to be used.
Renders AST generated by flexmark-java parser. No special syntax is implemented by this extension.
- `.className` on paragraph elements will set the docx styleId to `className` if the style id is found. This allows using specific style ids to change formatting for paragraphs. The special classes `pagebreak` and `tab` are excluded.
- page break via `{.pagebreak}` attributes
- tab via `{.tab}` attributes
- Use `{style=""}` to set attributes on text or block elements. Only the following are processed:
  - `color` - text color
  - `background-color` - shade fill color, pattern always solid
  - `font-family` - not implemented
  - `font-size` - in pt, rounded to nearest 1/2 pt. The `pt` unit is optional.
  - `font-weight` - set/clear bold (if using numeric weights then >= 550 sets bold, less clears it)
  - `font-style` - set/clear italic
- inline image alignment with `{align=}`:
  - `left` - left align, wrap text to right
  - `right` - right align, wrap text to left
  - `center` - center align, wrap text to left and right
  - else no wrapping around image, image inserted into text
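The `font-size` and `font-weight` rules above can be sketched in plain Java. This is only an illustration of the stated rules, not the extension's source: the class and method names are hypothetical, and the handling of the `bold` and `normal` keywords is an assumption. The size is returned in half-points because docx run sizes are stored in that unit.

```java
import java.util.Locale;

/** Illustrative sketch of the style attribute rules; not taken from flexmark-java. */
final class StyleAttributeSketch {
    /** Parses "11.3pt" or "11.3" and returns the size in half-points, rounded to the nearest 1/2 pt. */
    static int fontSizeHalfPoints(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.endsWith("pt")) {
            v = v.substring(0, v.length() - 2).trim(); // the pt unit is optional
        }
        double points = Double.parseDouble(v);
        return (int) Math.round(points * 2.0);
    }

    /** Returns true when the weight should set bold, false when it should clear it. */
    static boolean isBold(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.equals("bold")) return true;     // assumption: keyword sets bold
        if (v.equals("normal")) return false;  // assumption: keyword clears bold
        return Integer.parseInt(v) >= 550;     // numeric weights: >= 550 sets bold
    }

    public static void main(String[] args) {
        System.out.println(fontSizeHalfPoints("11.3pt") / 2.0 + " pt");
        System.out.println(isBold("600"));
    }
}
```

Under these assumptions, `{style="font-size: 11.3pt; font-weight: 600"}` would correspond to an 11.5 pt bold run.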
artifact: flexmark-docx-converter
The following options are available:
Defined in DocxRenderer class:
- `CODE_HIGHLIGHT_SHADING` default `""`, when non-empty will use this color as a highlight, also overrides `NO_CHARACTER_STYLES` to true, see NOTE on Highlight Colors.
- `CUSTOM_PROPERTIES` default `Collections.emptyMap()`, set to a `Map<String, String>` of property name to property value for custom properties to be set in the document. Needed in some cases for post processing.
- `DEFAULT_LINK_RESOLVER` default `true`, use the default link resolver, which uses the `DOC_RELATIVE_URL` and `DOC_ROOT_URL` options
- `DEFAULT_TEMPLATE_RESOURCE` default `"/empty.xml"`, default template resource path
- `DOC_EMOJI_IMAGE_VERT_OFFSET` default `-0.10`, vertical offset of emoji image as a factor of line height at point of insertion. The final value is rounded to the nearest pt, so small changes of this value can cause jumps of 1 pt.
- `DOC_EMOJI_IMAGE_VERT_SIZE` default `1.05`, size of emoji image as a factor of line height at point of insertion.
- `DOC_RELATIVE_URL` default `""`, the prefix to use for all relative URLs: those not starting with a protocol or `/`
- `DOC_ROOT_URL` default `""`, the prefix to use for all absolute URLs: those starting with `/`
- `ERROR_SOURCE_FILE` default `""`, name of source file to use in error logs
- `ERRORS_TO_STDERR` default `false`, log errors to stderr
- `FORM_CONTROLS` default `""`, set to the name of the form control reference to generate form controls, with the name given by this key: `[name]{.type attributes}`
- `LINEBREAK_ON_INLINE_HTML_BR` default `true`, convert inline HTML `<br>` to a line break in the docx
- `LOCAL_HYPERLINK_MISSING_FORMAT` default `"Missing target id: #%s"`, when non-empty uses `String.format()` on the given string with the missing ref anchor as the argument to generate a tooltip for unresolved hyperlinks
- `LOCAL_HYPERLINK_MISSING_HIGHLIGHT` default `"red"`, when non-empty will highlight unresolved hyperlinks local to the document with this color, see NOTE on Highlight Colors.
- `LOCAL_HYPERLINK_SUFFIX` default `""`, appends this suffix to in-document hyperlink anchor references. Needed in some cases for post processing.
- `LOG_IMAGE_PROCESSING` default `false`, log image processing errors
- `MAX_IMAGE_WIDTH` default `0`, maximum image width, `0` for no maximum
- `NO_CHARACTER_STYLES` default `false`, when true will not set the character style but explicitly set the run values from the style
- `NUMBERING_XML` default `getResourceString("/numbering.xml")`, default numbering section if missing in the wordprocessing package
- `PREFIX_WWW_LINKS` default `true`, controls whether links starting with `www.` will be prefixed with `https://`
- `RENDER_BODY_ONLY` default `false`, when rendering to string will only output the body of the document part. Used for tests.
- `STYLES_XML` default `getResourceString("/styles.xml")`, default styles section if missing in the wordprocessing package
- `TABLE_CAPTION_BEFORE_TABLE` default `false`, insert caption before table
- `TABLE_CAPTION_TO_PARAGRAPH` default `true`, convert table captions to paragraphs, styled with the `TableCaption` style id
- `TABLE_LEFT_INDENT` default `120`, table left indent in twips
- `TABLE_PREFERRED_WIDTH_PCT` default `0`, preferred table width
- `TABLE_STYLE` default `""`, table font style
- `TOC_GENERATE` default `false`, whether to generate a TOC, even if no TOC Markdown element is present in the file
- `TOC_INSTRUCTION` default `"TOC \\o \"1-3\" \\h \\z \\u "`, defines the instruction string used for the TOC element
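As a rough illustration of how the URL prefix options and the missing-link tooltip format described above fit together, here is a plain-Java sketch. It is not the extension's link resolver: the class and method names are hypothetical, and it assumes simple string concatenation with no slash normalization and that `#` anchors are left untouched.

```java
/** Illustrative sketch of the link prefix rules; not the flexmark-java link resolver. */
final class LinkPrefixSketch {
    static String resolve(String url, String docRelativeUrl, String docRootUrl, boolean prefixWwwLinks) {
        if (prefixWwwLinks && url.startsWith("www.")) {
            return "https://" + url;                 // PREFIX_WWW_LINKS
        }
        if (url.startsWith("#")) {
            return url;                              // assumption: in-document anchors are not prefixed
        }
        if (url.startsWith("/")) {
            return docRootUrl + url;                 // DOC_ROOT_URL for URLs starting with /
        }
        if (url.matches("[A-Za-z][A-Za-z0-9+.-]*:.*")) {
            return url;                              // already has a protocol
        }
        return docRelativeUrl + url;                 // DOC_RELATIVE_URL for everything else
    }

    /** Tooltip for an unresolved local hyperlink, or "" when the format is empty. */
    static String missingTooltip(String format, String anchor) {
        return format.isEmpty() ? "" : String.format(format, anchor);
    }

    public static void main(String[] args) {
        System.out.println(resolve("images/a.png", "file:///docs/", "file:///root", true));
        System.out.println(missingTooltip("Missing target id: #%s", "intro"));
    }
}
```

Because the sketch concatenates verbatim, a relative prefix would need its own trailing `/`, while a root prefix would not.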
NOTE on Highlight Colors: the docx format requires a named color for highlights. Any color provided that does not match a named color will be converted to the closest named color.
When `CODE_HIGHLIGHT_SHADING` is set to `"shade"`, the closest named color is taken
from the `SourceText` style's shade fill color, if available.
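To make "closest named color" concrete, the sketch below maps an arbitrary RGB value to one of the sixteen WordprocessingML highlight color names. The distance metric (squared Euclidean distance in RGB) is an assumption chosen for illustration; the extension may measure closeness differently. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch: nearest docx highlight color by squared RGB distance (assumed metric). */
final class HighlightColorSketch {
    private static final Map<String, Integer> NAMED = new LinkedHashMap<>();
    static {
        NAMED.put("black", 0x000000);     NAMED.put("blue", 0x0000FF);
        NAMED.put("cyan", 0x00FFFF);      NAMED.put("green", 0x00FF00);
        NAMED.put("magenta", 0xFF00FF);   NAMED.put("red", 0xFF0000);
        NAMED.put("yellow", 0xFFFF00);    NAMED.put("white", 0xFFFFFF);
        NAMED.put("darkBlue", 0x000080);  NAMED.put("darkCyan", 0x008080);
        NAMED.put("darkGreen", 0x008000); NAMED.put("darkMagenta", 0x800080);
        NAMED.put("darkRed", 0x800000);   NAMED.put("darkYellow", 0x808000);
        NAMED.put("darkGray", 0x808080);  NAMED.put("lightGray", 0xC0C0C0);
    }

    /** Accepts "RRGGBB" or "#RRGGBB" and returns the nearest highlight color name. */
    static String closest(String hex) {
        int rgb = Integer.parseInt(hex.startsWith("#") ? hex.substring(1) : hex, 16);
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        String best = null;
        long bestDistance = Long.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : NAMED.entrySet()) {
            int c = entry.getValue();
            long dr = r - ((c >> 16) & 0xFF);
            long dg = g - ((c >> 8) & 0xFF);
            long db = b - (c & 0xFF);
            long distance = dr * dr + dg * dg + db * db;
            if (distance < bestDistance) {   // first match wins on ties
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(closest("#FE0101"));
    }
}
```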
Element styles:
- `ASIDE_BLOCK_STYLE` default `"AsideBlock"`, style to use for aside blocks
- `BLOCK_QUOTE_STYLE` default `"Quotations"`, style to use for block quotes
- `BOLD_STYLE` default `"StrongEmphasis"`, style to use for the markdown element
- `BULLET_LIST_STYLE` default `"BulletList"`, numbering list style to use for bullet list item paragraph
- `DEFAULT_STYLE` default `"Normal"`, style to use for the markdown element
- `ENDNOTE_ANCHOR_STYLE` default `"EndnoteReference"`, style to use for the markdown element
- `FOOTER` default `"Footer"`, style to use for the markdown element
- `FOOTNOTE_ANCHOR_STYLE` default `"FootnoteReference"`, style to use for the markdown element
- `FOOTNOTE_STYLE` default `"Footnote"`, style to use for footnote text
- `FOOTNOTE_TEXT` default `"FootnoteText"`, style to use for the markdown element
- `HEADER` default `"Header"`, style to use for the markdown element
- `HEADING_1` default `"Heading1"`, style to use for the markdown element
- `HEADING_2` default `"Heading2"`, style to use for the markdown element
- `HEADING_3` default `"Heading3"`, style to use for the markdown element
- `HEADING_4` default `"Heading4"`, style to use for the markdown element
- `HEADING_5` default `"Heading5"`, style to use for the markdown element
- `HEADING_6` default `"Heading6"`, style to use for the markdown element
- `HORIZONTAL_LINE_STYLE` default `"HorizontalLine"`, style to use for thematic breaks
- `HYPERLINK_STYLE` default `"Hyperlink"`, style to use for the markdown element
- `INLINE_CODE_STYLE` default `"SourceText"`, style to use for the markdown element
- `INS_STYLE` default `"Underlined"`, style to use for the markdown element
- `ITALIC_STYLE` default `"Emphasis"`, style to use for the markdown element
- `LOOSE_PARAGRAPH_STYLE` default `"ParagraphTextBody"`, style to use for loose list type items
- `NUMBERED_LIST_STYLE` default `"NumberedList"`, numbering list style to use for numbered list item paragraph
- `PARAGRAPH_BULLET_LIST_STYLE` default `"ListBullet"`, style to use for tight list type items
- `PARAGRAPH_NUMBERED_LIST_STYLE` default `"ListNumber"`, style to use for tight list type items
- `PREFORMATTED_TEXT_STYLE` default `"PreformattedText"`, style to use for fenced code and indented code
- `STRIKE_THROUGH_STYLE` default `"Strikethrough"`, style to use for the markdown element
- `SUBSCRIPT_STYLE` default `"Subscript"`, style to use for the markdown element
- `SUPERSCRIPT_STYLE` default `"Superscript"`, style to use for the markdown element
- `TABLE_CAPTION` default `"TableCaption"`, style to use for table captions
- `TABLE_CONTENTS` default `"TableContents"`, style to use for table bodies
- `TABLE_GRID` default `"TableGrid"`, style to use for the markdown element
- `TABLE_HEADING` default `"TableHeading"`, style to use for table headings
- `TIGHT_PARAGRAPH_STYLE` default `"BodyText"`, style to use for tight list type items
List Element Styles
Unordered lists use numbering list style named BulletList while ordered lists use
NumberedList. If these are not present then default numbering style (id = 2) is used for
unordered lists and default numbering style (id = 3) is used for ordered lists.
The following are equivalent to Renderer properties of the same name. Included in
DocxRenderer for convenience.
For the TOC_INSTRUCTION string see
Docx4j GettingStarted under the
heading TOC Content Control
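The `TOC_INSTRUCTION` default is written as a Java string literal, so each doubled backslash is a single backslash in the actual field instruction. In Word's TOC field, `\o "1-3"` selects heading levels 1 to 3, `\h` makes the entries hyperlinks, `\z` hides tab leaders and page numbers in web layout view, and `\u` uses the applied paragraph outline level. The helper below is a hypothetical sketch for building a variant with a different level range; it is not part of the extension.

```java
/** Illustrative helper for building a TOC field instruction; not part of flexmark-java. */
final class TocInstructionSketch {
    /** Returns an instruction with the same switches as the default, for the given heading level range. */
    static String instruction(int fromLevel, int toLevel) {
        if (fromLevel < 1 || toLevel > 9 || fromLevel > toLevel) {
            throw new IllegalArgumentException("levels must satisfy 1 <= from <= to <= 9");
        }
        // Doubled backslashes in the literal produce single backslashes in the resulting string.
        return "TOC \\o \"" + fromLevel + "-" + toLevel + "\" \\h \\z \\u ";
    }

    public static void main(String[] args) {
        // Prints the instruction as it would appear in the document field.
        System.out.println(instruction(1, 3));
    }
}
```

The resulting string could then be supplied as the value of the `TOC_INSTRUCTION` option.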
NOTE: Word does not handle inserted HTML very well. Any HTML not suppressed will be escaped, i.e.
it will render into the document as text. The exception is the `<br>` tag which, if enabled
via `LINEBREAK_ON_INLINE_HTML_BR`, will be rendered as a line break.
HTML rendering options available in DocxRenderer for convenience:
- `ESCAPE_HTML_BLOCKS` default value of `ESCAPE_HTML`, escape html blocks found in the document
- `ESCAPE_HTML_COMMENT_BLOCKS` default value of `ESCAPE_HTML_BLOCKS`, escape html comment blocks found in the document
- `ESCAPE_HTML` default `false`, escape all html found in the document
- `ESCAPE_INLINE_HTML_COMMENTS` default value of `ESCAPE_HTML_BLOCKS`, escape inline html comments found in the document
- `ESCAPE_INLINE_HTML` default value of `ESCAPE_HTML`, escape inline html found in the document
- `PERCENT_ENCODE_URLS` default `false`, percent encode urls
- `RECHECK_UNDEFINED_REFERENCES` default `false`, recheck the existence of references in `Parser.REFERENCES` for link and image refs marked undefined. Used when new references are added after parsing.
- `SUPPRESS_HTML_BLOCKS` default value of `SUPPRESS_HTML`, suppress html output for html blocks
- `SUPPRESS_HTML_COMMENT_BLOCKS` default value of `SUPPRESS_HTML_BLOCKS`, suppress html output for html comment blocks
- `SUPPRESS_HTML` default `false`, suppress html output for all html
- `SUPPRESS_INLINE_HTML_COMMENTS` default value of `SUPPRESS_INLINE_HTML`, suppress html output for inline html comments
- `SUPPRESS_INLINE_HTML` default value of `SUPPRESS_HTML`, suppress html output for inline html
- `HEADER_ID_GENERATOR_NO_DUPED_DASHES` default `false`, when `true` duplicate `-` in id will be replaced by a single `-`
- `HEADER_ID_GENERATOR_RESOLVE_DUPES` default `true`, when `true` will add an incrementing integer to duplicate ids to make them unique
- `HEADER_ID_GENERATOR_TO_DASH_CHARS` default `"_"`, set of characters to convert to `-` in text used to generate id; non-alphanumeric chars not in the set will be removed
- `HEADER_ID_GENERATOR_NON_ASCII_TO_LOWERCASE` default `true`, when set to `false` changes the default header id generator to not convert non-ascii alphabetic characters to lowercase. Needed for GitHub id compatibility.
- `HEADER_ID_REF_TEXT_TRIM_LEADING_SPACES` default `true`, when set to `false` leading spaces in link reference text in a heading are not trimmed from the text used to generate the id
- `HEADER_ID_REF_TEXT_TRIM_TRAILING_SPACES` default `true`, when set to `false` trailing spaces in link reference text in a heading are not trimmed from the text used to generate the id
- `HEADER_ID_ADD_EMOJI_SHORTCUT` default `false`, when set to `true` emoji shortcut nodes add the shortcut to the collected text used to generate the heading id
- `RENDER_HEADER_ID` default `false`, render a header id attribute for headers using the configured `HtmlIdGenerator`
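The header id generator options interact in a way that is easier to see in code. The following is a minimal plain-Java sketch of the rules as described above, not the library's `HtmlIdGenerator`: the class name is hypothetical, the `-N` suffix format for duplicates is an assumption, and the set of characters converted to `-` is passed explicitly rather than taken from any default.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch of the header id rules; not the flexmark-java id generator. */
final class HeaderIdSketch {
    private final String toDashChars;      // like HEADER_ID_GENERATOR_TO_DASH_CHARS
    private final boolean noDupedDashes;   // like HEADER_ID_GENERATOR_NO_DUPED_DASHES
    private final boolean resolveDupes;    // like HEADER_ID_GENERATOR_RESOLVE_DUPES
    private final Map<String, Integer> seen = new HashMap<>();

    HeaderIdSketch(String toDashChars, boolean noDupedDashes, boolean resolveDupes) {
        this.toDashChars = toDashChars;
        this.noDupedDashes = noDupedDashes;
        this.resolveDupes = resolveDupes;
    }

    String idFor(String headingText) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < headingText.length(); i++) {
            char c = headingText.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                sb.append(Character.toLowerCase(c));
            } else if (toDashChars.indexOf(c) >= 0) {
                boolean lastIsDash = sb.length() > 0 && sb.charAt(sb.length() - 1) == '-';
                if (!(noDupedDashes && lastIsDash)) {
                    sb.append('-');
                }
            }
            // any other character is dropped
        }
        String base = sb.toString();
        if (!resolveDupes) {
            return base;
        }
        Integer count = seen.get(base);
        if (count == null) {
            seen.put(base, 0);
            return base;
        }
        seen.put(base, count + 1);
        return base + "-" + (count + 1);   // assumption: suffix format for duplicates
    }

    public static void main(String[] args) {
        HeaderIdSketch ids = new HeaderIdSketch(" -_", true, true);
        System.out.println(ids.idFor("Hello, World!"));
        System.out.println(ids.idFor("Hello, World!"));
    }
}
```

One instance is meant to be used per document so that duplicate tracking covers all headings in that document.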