Markdown support #274

gkreitz · 2025-05-13T08:28:34Z

`guess` is a legacy-format problem, and the legacy format does not (and should not) support Markdown. I'd like our examples to follow the standard.

One way to fix that would be to convert `guess` and/or `oddecho` (where you also added a Markdown statement) to 2023-07-draft.

-Original file line number
+Diff line change
@@ Expand Up / @@ -29,6 +29,7 @@ jobs: @@
             python -m pip install --upgrade pip
             pip install mypy ruff pytest
             if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+            sudo apt-get install pandoc tidy ghostscript python3 texlive-fonts-recommended texlive-lang-cyrillic texlive-latex-extra texlive-plain-generic
         - name: Lint with ruff
           run: ruff check --output-format=github
         - name: Check ruff formatting
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -14,7 +14,9 @@ RUN apt-get update && \ @@
                 libgmp10 \
                 libgmpxx4ldbl \
                 openjdk-8-jdk \
+                pandoc \
                 python3-minimal \
+                python-nh3 \
                 python3-pip \
                 python3-plastex \
                 python3-yaml \
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up @@
     And the dependencies needed to *run* problemtools can be installed with:
-        sudo apt install ghostscript python3 texlive-fonts-recommended texlive-lang-cyrillic texlive-latex-extra texlive-plain-generic tidy dvisvgm
+        sudo apt install ghostscript pandoc python3 texlive-fonts-recommended texlive-lang-cyrillic texlive-latex-extra texlive-plain-generic tidy dvisvgm
     ### Fedora
     On Fedora, these dependencies can be installed with:
-        sudo dnf install boost-regex gcc gmp-devel gmp-c++ python3 python3-pyyaml texlive-latex texlive-collection-fontsrecommended texlive-fancyhdr texlive-subfigure texlive-wrapfig texlive-import texlive-ulem texlive-xifthen texlive-overpic texlive-pbox tidy ghostscript
+        sudo dnf install boost-regex gcc gmp-devel gmp-c++ pandoc python3 python3-pyyaml texlive-latex texlive-collection-fontsrecommended texlive-fancyhdr texlive-subfigure texlive-wrapfig texlive-import texlive-ulem texlive-xifthen texlive-overpic texlive-pbox tidy ghostscript
     Followed by:
-        pip3 install --user plastex
+        pip3 install --user plastex nh3
     ### Arch
     Package is available on the AUR [kattis-problemtools-git](https://aur.archlinux.org/packages/kattis-problemtools-git). Use your favorite AUR helper or follow the installation instructions found [here](https://wiki.archlinux.org/title/Arch_User_Repository#Installing_and_upgrading_packages).
@@ Expand Down @@
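
Since this change adds `pandoc` to the run-time dependencies, a quick check that the external tools are on `PATH` can save a confusing failure later. The following is a minimal sketch, not part of problemtools; the function name `missing_tools` and the list of executable names are assumptions (Ghostscript is typically installed as `gs`).

```python
import shutil
from typing import Iterable, List

# Assumed executable names for the run-time dependencies listed above.
# Ghostscript is usually installed as "gs".
RUNTIME_TOOLS = ('pandoc', 'tidy', 'gs', 'dvisvgm')


def missing_tools(tools: Iterable[str] = RUNTIME_TOOLS) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools()` returns an empty list when every assumed executable is available, and otherwise the names that still need to be installed.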

-Original file line number
+Diff line change
@@ Expand Up / @@ -25,6 +25,7 @@ RUN apt update && \ @@
             libgmp-dev \
             libgmp10 \
             libgmpxx4ldbl \
+            pandoc \
             python3 \
             python3-pytest \
             python3-setuptools \
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -23,6 +23,7 @@ RUN apt-get update && \ @@
                 mono-complete \
                 nodejs \
                 ocaml-nox \
+                pandoc \
                 php-cli \
                 pypy \
                 rustc \
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -8,7 +8,7 @@ Homepage: https://github.com/Kattis/problemtools @@
     Package: kattis-problemtools
     Architecture: any
-    Depends: ${shlibs:Depends}, ${misc:Depends}, python3, texlive-plain-generic, texlive-fonts-recommended, texlive-latex-extra, texlive-lang-cyrillic, tidy, ghostscript, dvisvgm
+    Depends: ${shlibs:Depends}, ${misc:Depends}, pandoc, python3, texlive-plain-generic, texlive-fonts-recommended, texlive-latex-extra, texlive-lang-cyrillic, tidy, ghostscript, dvisvgm
     Recommends: gcc, g++
     Description: Kattis Problem Tools
      These are tools to manage and verify problem packages in the
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -24,4 +24,5 @@ more than one language. @@
     ## oddecho
     This is an example of a *scoring* problem where submissions can get
-    different scores depending on which test groups they solve. It also demonstrates how an input validator might check different constraints for different test groups.
+    different scores depending on which test groups they solve. It also demonstrates how an input validator might check different constraints for different test groups. The Swedish statement showcases how to use images, footnotes
+    and tables in Markdown.

-Original file line number
+Diff line change
@@ Expand Up / @@ -5,6 +5,8 @@ @@
     ## Author of the problem (default: null)
     # author:
+    name: A Different Problem
     ## Where the problem was first used (default: null)
     source: Kattis
     # source_url:
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -2,6 +2,7 @@ source: Kattis @@
     license: cc by-sa
     validation: custom interactive
+    name: Guess the Number
     # Override standard limits: say that the TLE solutions provided should
     # be at least 4 times above the time limit in order for us to be
@@ Expand Down @@

-Original file line number
+Diff line change
@@ -0,0 +1,20 @@
+    Jag tänker på ett hemligt tal mellan $1$ och $1000$, kan du gissa vilket?
+    Givet en gissning kommer jag att berätta om din gissning
+    var för stor, för liten eller rätt. Du får bara $10$ gissningar, använd
+    dem klokt!
+    ## Interaktion
+    Ditt program ska skriva ut gissningar om talet.
+    En gissning är en rad som enbart innehåller ett heltal mellan $1$ och $1000$.
+    Efter varje gissning måste du flusha standard out.
+    Efter varje gissning kan du läsa svaret på standard in.
+    Detta svar är ett av tre ord:
+    - `lower` om talet jag tänker på är lägre än din gissning,
+    - `higher` om talet jag tänker på är högre än din gissning, eller
+    - `correct` om din gissning är korrekt.
+    Efter att ha gissat rätt ska du avsluta ditt program.
+    Om du gissar fel $10$ gånger får du inga fler chanser och ditt program kommer avbrytas.
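
The interaction described in this statement can be solved with binary search. Below is a minimal sketch, assuming the secret number lies between 1 and 1000 as the Interaktion section states; each reply halves the remaining range, so 10 guesses suffice. The names `play` and `ask_judge` are hypothetical and not part of the problem package.

```python
import sys
from typing import Callable


def play(ask: Callable[[int], str]) -> int:
    """Guess a number in [1, 1000] by binary search; return the number of guesses used."""
    low, high = 1, 1000
    guesses = 0
    while low <= high:
        guess = (low + high) // 2
        guesses += 1
        answer = ask(guess)
        if answer == 'correct':
            break
        if answer == 'lower':
            # The secret number is lower than our guess.
            high = guess - 1
        else:
            # 'higher': the secret number is higher than our guess.
            low = guess + 1
    return guesses


def ask_judge(guess: int) -> str:
    """Print one guess, flush standard out, and read the reply from standard in."""
    print(guess, flush=True)
    return sys.stdin.readline().strip()
```

A submission would call `play(ask_judge)`. Passing the judge as a function keeps the search logic testable without real standard input.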

-Original file line number
+Diff line change
@@ -1,5 +1,6 @@
     source: Kattis
     license: public domain
+    name: Hello World!
     # Fix memory limit at 512 MB.  (Note that for most problems, this
     # should not be done.  It is only done in this case because we include
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -2,6 +2,6 @@ license: cc by-sa @@
     author: Johan Sannemo
     source: Principles of Algorithmic Problem Solving
     type: scoring
-    name: Echo
+    name: Odd Echo
     grading:
         show_test_data_groups: true

-Original file line number
+Diff line change
@@ -0,0 +1,33 @@
+    **EKO! Eko! Ek...**
+    ![CC-BY-SA 2.0 By William Craig on wikimedia.org](cave.jpg)
+    Du älskar att skrika i grottor för att höra dina ord ekade tillbaka till dig. Tyvärr, som en hårt arbetande mjukvaruingenjör, har du
+    inte tid för att komma ut och skrika i grottor så ofta. Istället skulle du vilja implementera ett program som fungerar som en ersättning för en grotta.
+    Ibland vill du mata in några ord i programmet och få dem ekade tillbaka till dig. Men, som det är välkänt, om du skriker för snabbt i en grotta kan ekot störa de nya ord du säger. [^1] Mer specifikt, vartannat ord du säger kommer att störa ekot av ditt tidigare ord. Därför kommer endast det första, tredje, femte och så vidare ordet faktiskt att producera ett eko.
+    Din uppgift är att skriva ett program som simulerar detta beteende.
+    ## Indata
+    Den första raden av indata innehåller ett heltal $N$ ($1 \le N \le 10$).
+    De följande $N$ raderna innehåller vardera ett ord. Varje ord är högst $100$ bokstäver långt och innehåller endast bokstäverna `a-z`.
+    ## Utdata
+    Skriv ut de ord som har udda index (dvs. första, tredje, femte och så vidare) i inmatningen.
+    ## Poängsättning
+    Din lösning kommer att testas på en mängd testfallsgrupper.
+    För att få poäng för en grupp så måste du klara alla testfall i gruppen.
+    | Grupp | Poäng | Begränsningar            |
+    |-------|-------|--------------------------|
+    | 1     | 1     | $N$ är alltid $5$        |
+    | 2     | 1     | Inga ytterligare begränsningar |
+    [^1]: [https://sv.wikipedia.org/wiki/Interferens](https://sv.wikipedia.org/wiki/Interferens)
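
The footnote in this statement exercises the sanitizer added by this PR. Pandoc's HTML output gives the footnote reference and the footnote itself ids of the form `fnref1` and `fn1`, and the `is_fn_id` helper in the diff keeps only `id` attributes of that shape. A minimal standalone sketch of the same check follows; the name `is_footnote_id` and the combined pattern are illustrative, not the PR's code.

```python
import re

# Matches "fn<digits>" (the footnote) and "fnref<digits>" (the reference to it).
FOOTNOTE_ID = re.compile(r'fn(?:ref)?\d+')


def is_footnote_id(value: str) -> bool:
    """Return True only if the whole attribute value is a footnote id."""
    return FOOTNOTE_ID.fullmatch(value) is not None
```

Using `fullmatch` means values that merely contain a footnote id, such as `fn1 extra`, are rejected.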

-Original file line number
+Diff line change
@@ -0,0 +1,169 @@
+    #! /usr/bin/env python3
+    # -*- coding: utf-8 -*-
+    import argparse
+    import html
+    import os
+    from pathlib import Path
+    import re
+    import shutil
+    import string
+    import subprocess
+    import nh3
+    from . import statement_util
+    def convert(problem: str, options: argparse.Namespace) -> bool:
+        """Convert a Markdown statement to HTML
+        Args:
+            problem: path to problem directory
+            options: command-line arguments. See problem2html.py
+        """
+        problembase = os.path.splitext(os.path.basename(problem))[0]
+        destfile = string.Template(options.destfile).safe_substitute(problem=problembase)
+        statement_path = statement_util.find_statement(problem, extension='md', language=options.language)
+        if statement_path is None:
+            raise FileNotFoundError('No markdown statement found')
+        if not os.path.isfile(statement_path):
+            raise FileNotFoundError(f'Error! {statement_path} does not exist')
+        command = ['pandoc', statement_path, '-t', 'html', '--mathjax']
+        statement_html = subprocess.run(command, capture_output=True, text=True, shell=False, check=True).stdout
+        statement_html = sanitize_html(problem, statement_html)
+        templatepaths = [
+            os.path.join(os.path.dirname(__file__), 'templates/markdown_html'),
+            '/usr/lib/problemtools/templates/markdown_html',
+        ]
+        templatepath = next(
+            (p for p in templatepaths if os.path.isdir(p) and os.path.isfile(os.path.join(p, 'default-layout.html'))), None
+        )
+        if templatepath is None:
+            raise FileNotFoundError('Could not find directory with markdown templates')
+        with open(Path(templatepath) / 'default-layout.html', 'r', encoding='utf-8') as template_file:
+            template = template_file.read()
+        problem_name = statement_util.get_yaml_problem_name(problem, options.language)
+        substitution_params = {
+            'statement_html': statement_html,
+            'language': options.language,
+            'title': html.escape(problem_name) if problem_name else 'Missing problem name',
+            'problemid': html.escape(problembase),
+        }
+        statement_html = template % substitution_params
+        samples = statement_util.format_samples(problem)
+        # Insert samples at {{nextsample}} and {{remainingsamples}}
+        statement_html, remaining_samples = statement_util.inject_samples(statement_html, samples)
+        # Insert the remaining samples at the bottom
+        # However, footnotes should be below samples
+        sample_insertion_position = statement_util.find_footnotes(statement_html)
+        if sample_insertion_position is None:
+            # No footnotes, so insert at the end
+            sample_insertion_position = statement_html.rfind('</body>')
+        statement_html = (
+            statement_html[:sample_insertion_position] + ''.join(remaining_samples) + statement_html[sample_insertion_position:]
+        )
+        with open(destfile, 'w', encoding='utf-8', errors='xmlcharrefreplace') as output_file:
+            output_file.write(statement_html)
+        if options.css:
+            shutil.copyfile(os.path.join(templatepath, 'problem.css'), 'problem.css')
+        return True
+    def sanitize_html(problem: str, statement_html: str):
+        # Allow footnote ids (the anchor points you jump to)
+        def is_fn_id(s):
+            pattern_id_top = r'^fn\d+$'
+            pattern_id_bottom = r'^fnref\d+$'
+            return bool(re.fullmatch(pattern_id_top, s)) or bool(re.fullmatch(pattern_id_bottom, s))
+        allowed_classes = ('sample', 'problemheader', 'problembody', 'sampleinteractionwrite', 'sampleinteractionread')
+        def is_image_valid(problem_root: str, img_src: str) -> str | None:
+            # Check that the image exists and uses an allowed extension
+            extension = Path(img_src).suffix
+            # TODO: fix svg sanitization and allow svg
+            if extension not in statement_util.ALLOWED_IMAGE_EXTENSIONS:
+                return f'Unsupported image extension {extension} for image {img_src}'
+            source_file = Path(problem_root) / 'statement' / img_src
+            if not source_file.exists():
+                return f'Resource file {img_src} not found in statement'
+            return None
+        # Annoying: nh3 will ignore exceptions in attribute_filter
+        image_fail_reason: str | None = None
+        def attribute_filter(tag, attribute, value):
+            if attribute == 'class' and value in allowed_classes:
+                return value
+            # Newer versions of Pandoc will give class="footnotes footnotes-end-of-document"
+            # We don't want to blindly allow any class with footnotes in it, so only allow footnotes
+            if attribute == 'class' and 'footnotes' in value:
+                return 'footnotes'
+            if tag == 'a' and attribute == 'href':
+                return value
+            if tag in ('li', 'a') and attribute == 'id' and is_fn_id(value):
+                return value
+            if tag == 'img' and attribute == 'src':
+                fail = is_image_valid(problem, value)
+                if fail:
+                    nonlocal image_fail_reason
+                    image_fail_reason = fail
+                    return None
+                copy_image(problem, value)
+                return value
+            return None
+        statement_html = nh3.clean(
+            statement_html,
+            link_rel='noopener nofollow noreferrer',
+            attribute_filter=attribute_filter,
+            tags=nh3.ALLOWED_TAGS | {'img', 'a', 'section'},
+            attributes={
+                'table': {'class'},
+                'aside': {'class'},
+                'div': {'class'},
+                'section': {'class'},
+                'img': {'src'},
+                'a': {'href', 'id'},
+                'li': {'id'},
+            },
+        )
+        if image_fail_reason:
+            assert isinstance(image_fail_reason, str)
+            if 'Unsupported' in image_fail_reason:
+                raise ValueError(image_fail_reason)
+            raise FileNotFoundError(image_fail_reason)
+        return statement_html
+    def copy_image(problem_root: str, img_src: str) -> None:
+        """Copy image to output directory
+        Args:
+            problem_root: the root of the problem directory
+            img_src: the image source as in the Markdown statement
+        """
+        source_name = os.path.join(problem_root, 'statement', img_src)
+        if os.path.isfile(img_src):  # already copied
+            return
+        shutil.copyfile(source_name, img_src)
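
The placement rule in `convert` (remaining samples go before the footnotes, or before `</body>` when there are none) can be sketched in isolation. The diff does not show `statement_util.find_footnotes`, so `find_footnotes_sketch` below is a hypothetical stand-in, and its regular expression is an assumption about what the sanitized footnote container looks like.

```python
import re
from typing import List, Optional

# Assumption: after sanitization the footnote container is a <section>, <aside>
# or <div> whose class attribute is exactly "footnotes".
FOOTNOTES_OPEN = re.compile(r'<(?:section|aside|div)\b[^>]*\bclass="footnotes"')


def find_footnotes_sketch(statement_html: str) -> Optional[int]:
    """Return the index where the footnote container starts, or None if absent."""
    match = FOOTNOTES_OPEN.search(statement_html)
    return match.start() if match else None


def insert_remaining_samples(statement_html: str, remaining_samples: List[str]) -> str:
    """Insert samples before the footnotes, else before </body>, else at the end."""
    position = find_footnotes_sketch(statement_html)
    if position is None:
        position = statement_html.rfind('</body>')
    if position == -1:
        # No </body> either: append rather than slicing at index -1.
        position = len(statement_html)
    return statement_html[:position] + ''.join(remaining_samples) + statement_html[position:]
```

The extra guard for a missing `</body>` is an addition in this sketch: `str.rfind` returns `-1` in that case, which would otherwise place the samples before the last character.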

-Original file line number
+Diff line change
@@ Expand Up / @@ -113,7 +113,7 @@ class Metadata2023_07(BaseModel): @@
         problem_format_version: str
         name: dict[str, str] | str
-        uuid: UUID
+        uuid: UUID | None = None  # UUID *is* mandatory, but we deal with that in verifyproblem for better UX
         type: list[ProblemType] | ProblemType = ProblemType.PASS_FAIL
         version: str | None = None
         credits: dict | str | None = None
@@ Expand Down @@
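
The comment on `uuid` describes a two-stage approach: parsing accepts a missing value, and a later verification step reports it with a friendlier message. A minimal sketch of that idea using only the standard library follows; `MetadataSketch` and `check_uuid` are hypothetical names, and the actual check in verifyproblem is not shown in this diff.

```python
from dataclasses import dataclass
from typing import List, Optional
from uuid import UUID


@dataclass
class MetadataSketch:
    """Hypothetical stand-in for the metadata model: uuid is optional at parse time."""
    name: str
    uuid: Optional[UUID] = None


def check_uuid(metadata: MetadataSketch) -> List[str]:
    """Return verification errors; a missing uuid is reported here, not at parse time."""
    if metadata.uuid is None:
        return ['uuid is mandatory but missing; generate one and add it to the metadata']
    return []
```

Deferring the check lets the tool load an otherwise valid problem and report the missing field alongside its other findings, instead of failing during parsing.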
