Merged
61 commits
cb3ea10
Add markdown support
Matistjati Aug 7, 2024
868eb39
Added display math
Matistjati Aug 7, 2024
6a01b1c
Add dependencies for markdown
Matistjati Aug 7, 2024
05f6372
Style markdown tables
Matistjati Aug 8, 2024
673773e
Remove temp files
Matistjati Aug 8, 2024
1c64085
Statement fix
Matistjati Aug 8, 2024
48d18c7
Some refactoring
Matistjati Aug 8, 2024
08645f5
Added image support in markdown
Matistjati Aug 9, 2024
a6a1933
Added footnote support
Matistjati Aug 9, 2024
7627c58
Code cleanup
Matistjati Aug 9, 2024
1b222ac
md -> html works
Matistjati Aug 13, 2024
712ce3e
Make md styling more constistent with latex
Matistjati Aug 16, 2024
11a2e4c
md->pdf and Reorganize code
Matistjati Aug 17, 2024
480e0ea
Better md->pdf tables
Matistjati Aug 17, 2024
e9b3f8e
Interactive samples for pdf
Matistjati Aug 18, 2024
ad3e801
Remove bplusa
Matistjati Aug 18, 2024
30d9603
PDF problem name
Matistjati Aug 18, 2024
efc5c9e
Add dependencies
Matistjati Aug 18, 2024
762599f
Add problem names
Matistjati Aug 18, 2024
2bba9d4
Added problem name to test hello package
Matistjati Aug 18, 2024
cdd1804
Improve security by running pandoc without shell capabilities
Matistjati Aug 18, 2024
194c7b1
Refactoring
Matistjati Aug 18, 2024
554892a
Even more refactoring
Matistjati Aug 18, 2024
d8a4c3e
Remove python3-markdown dependency
Matistjati Aug 18, 2024
7390fb8
Add problem id to pdf and small fixes
Matistjati Aug 18, 2024
46a7003
Disable html
Matistjati Mar 11, 2025
770d5da
Change to wikimedia example image
Matistjati Mar 12, 2025
11b6a13
Sanitize image sources
Matistjati Mar 12, 2025
bfd4703
Remove SVG dependency
Matistjati Mar 12, 2025
d935771
Better markdown styling
Matistjati Mar 12, 2025
d55df47
Better sample styling
Matistjati Mar 12, 2025
a0b3f9f
Add \nextsample and \remainingsamples
Matistjati Mar 12, 2025
cc5f26e
Better pdf error handling
Matistjati Mar 12, 2025
608fe13
Use {{nextsample}} instead of \nextsample
Matistjati Mar 13, 2025
c3dc3c9
Relax image checking (implied by global regex on filenames)
Matistjati Mar 13, 2025
6f1698e
Add svg dependency
Matistjati Mar 13, 2025
c6b57c8
Start sanitization + apply feedback
Matistjati Apr 5, 2025
cfc285c
Better sanitization + lots of tests
Matistjati Apr 7, 2025
5f5d59d
problem_statement -> statement
Matistjati Apr 8, 2025
213f9ac
Better md -> pdf sample rendering
Matistjati Apr 8, 2025
d745f6e
Another escape
Matistjati Apr 8, 2025
d4e27a2
More careful with images
Matistjati Apr 8, 2025
fdde1a4
Make samplexss more focused
Matistjati Apr 8, 2025
3ded4a4
Experimentally reuse normal LaTeX rendering
Matistjati Apr 9, 2025
9134f30
Merge remote-tracking branch 'problemtools/develop' into pandoc
Matistjati Apr 9, 2025
79b5a5d
Use problemtools problem2pdf to handle md -> pdf
Matistjati Apr 9, 2025
fcda106
Cleanup
Matistjati Apr 9, 2025
47bda29
librsvg out of focus for this PR
Matistjati Apr 9, 2025
054448e
Ensure nh3
Matistjati Apr 9, 2025
ecdb6c4
Remove ghostscript sanitization. If it wasn't used before, it probabl…
Matistjati Apr 9, 2025
690215f
Add nh3 to deb build
Matistjati Apr 13, 2025
77cb2c9
Linting
Matistjati Apr 13, 2025
2e7653f
Add back ghostscript sanitization
Matistjati Apr 13, 2025
51f5539
Remove unnecessary test
Matistjati Apr 13, 2025
898a786
Merged with developed
Matistjati May 12, 2025
63fd2e8
Add nh3 as dependency
Matistjati May 12, 2025
5f96852
Fix test import path
Matistjati May 12, 2025
754f468
Apply ruff formatting
Matistjati May 12, 2025
3063cb0
More robust footnote finding
Matistjati May 12, 2025
a6be656
Don't double-escape HTML in samples
Matistjati May 14, 2025
14e0c84
Ghostscript fixes and tests
Matistjati May 14, 2025
1 change: 1 addition & 0 deletions .github/workflows/python-app.yml
@@ -29,6 +29,7 @@ jobs:
python -m pip install --upgrade pip
pip install mypy ruff pytest
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
sudo apt-get install pandoc tidy ghostscript python3 texlive-fonts-recommended texlive-lang-cyrillic texlive-latex-extra texlive-plain-generic
- name: Lint with ruff
run: ruff check --output-format=github
- name: Check ruff formatting
2 changes: 2 additions & 0 deletions Dockerfile
@@ -14,7 +14,9 @@ RUN apt-get update && \
libgmp10 \
libgmpxx4ldbl \
openjdk-8-jdk \
pandoc \
python3-minimal \
python-nh3 \
python3-pip \
python3-plastex \
python3-yaml \
6 changes: 3 additions & 3 deletions README.md
@@ -205,17 +205,17 @@ The dependencies needed to *build/install* problemtools can be installed with:

And the dependencies needed to *run* problemtools can be installed with:

- sudo apt install ghostscript python3 texlive-fonts-recommended texlive-lang-cyrillic texlive-latex-extra texlive-plain-generic tidy dvisvgm
+ sudo apt install ghostscript pandoc python3 texlive-fonts-recommended texlive-lang-cyrillic texlive-latex-extra texlive-plain-generic tidy dvisvgm

### Fedora

On Fedora, these dependencies can be installed with:

- sudo dnf install boost-regex gcc gmp-devel gmp-c++ python3 python3-pyyaml texlive-latex texlive-collection-fontsrecommended texlive-fancyhdr texlive-subfigure texlive-wrapfig texlive-import texlive-ulem texlive-xifthen texlive-overpic texlive-pbox tidy ghostscript
+ sudo dnf install boost-regex gcc gmp-devel gmp-c++ pandoc python3 python3-pyyaml texlive-latex texlive-collection-fontsrecommended texlive-fancyhdr texlive-subfigure texlive-wrapfig texlive-import texlive-ulem texlive-xifthen texlive-overpic texlive-pbox tidy ghostscript

Followed by:

- pip3 install --user plastex
+ pip3 install --user plastex nh3

### Arch
Package is available on the AUR [kattis-problemtools-git](https://aur.archlinux.org/packages/kattis-problemtools-git). Use your favorite AUR helper or follow the installation instructions found [here](https://wiki.archlinux.org/title/Arch_User_Repository#Installing_and_upgrading_packages).
1 change: 1 addition & 0 deletions admin/docker/Dockerfile.build
@@ -25,6 +25,7 @@ RUN apt update && \
libgmp-dev \
libgmp10 \
libgmpxx4ldbl \
pandoc \
python3 \
python3-pytest \
python3-setuptools \
1 change: 1 addition & 0 deletions admin/docker/Dockerfile.full
@@ -23,6 +23,7 @@ RUN apt-get update && \
mono-complete \
nodejs \
ocaml-nox \
pandoc \
php-cli \
pypy \
rustc \
1 change: 1 addition & 0 deletions admin/docker/Dockerfile.minimal
@@ -20,6 +20,7 @@ RUN apt update && \
apt install -y \
ghostscript \
libgmpxx4ldbl \
pandoc \
python-pkg-resources \
python3-minimal \
python3-yaml \
2 changes: 1 addition & 1 deletion debian/control
@@ -8,7 +8,7 @@ Homepage: https://github.com/Kattis/problemtools

Package: kattis-problemtools
Architecture: any
- Depends: ${shlibs:Depends}, ${misc:Depends}, python3, texlive-plain-generic, texlive-fonts-recommended, texlive-latex-extra, texlive-lang-cyrillic, tidy, ghostscript, dvisvgm
+ Depends: ${shlibs:Depends}, ${misc:Depends}, pandoc, python3, texlive-plain-generic, texlive-fonts-recommended, texlive-latex-extra, texlive-lang-cyrillic, tidy, ghostscript, dvisvgm
Recommends: gcc, g++
Description: Kattis Problem Tools
These are tools to manage and verify problem packages in the
3 changes: 2 additions & 1 deletion examples/README.md
@@ -24,4 +24,5 @@ more than one language.
## oddecho

This is an example of a *scoring* problem where submissions can get
- different scores depending on which test groups they solve. It also demonstrates how an input validator might check different constraints for different test groups.
+ different scores depending on which test groups they solve. It also demonstrates how an input validator might check different constraints for different test groups. The Swedish statement showcases how to use images, footnotes
+ and tables in Markdown.
2 changes: 2 additions & 0 deletions examples/different/problem.yaml
@@ -5,6 +5,8 @@
## Author of the problem (default: null)
# author:

name: A Different Problem

## Where the problem was first used (default: null)
source: Kattis
# source_url:
1 change: 1 addition & 0 deletions examples/guess/problem.yaml
@@ -2,6 +2,7 @@ source: Kattis
license: cc by-sa

validation: custom interactive
name: Guess the Number

# Override standard limits: say that the TLE solutions provided should
# be at least 4 times above the time limit in order for us to be
20 changes: 20 additions & 0 deletions examples/guess/problem_statement/problem.sv.md
@@ -0,0 +1,20 @@
Jag tänker på ett hemligt tal mellan $1$ och $1000$, kan du gissa vilket?
Review comment (Contributor):

guess is a legacy problem, which does/should not support markdown. I'd like for our examples to follow the standard.

One way to fix that would be to convert guess and/or oddecho (where you also added a markdown statement) to 2023-07-draft.

Givet en gissning kommer jag att berätta om din gissning
var för stor, för liten eller rätt. Du får bara $10$ gissningar, använd
dem klokt!


## Interaktion
Ditt program ska skriva ut gissningar om talet.
En gissning är en rad som enbart innehåller ett heltal mellan $1$ och $1000$.
Efter varje gissning måste du flusha standard out.

Efter varje gissning kan du läsa svaret på standard in.
Detta svar är ett av tre ord:

- `lower` om talet jag tänker på är lägre än din gissning,
- `higher` om talet jag tänker på är högre än din gissning, eller
- `correct` om din gissning är korrekt.

Efter att ha gissat rätt ska du avsluta ditt program.
Om du gissar fel $10$ gånger får du inga fler chanser och ditt program kommer avbrytas.
1 change: 1 addition & 0 deletions examples/hello/problem.yaml
@@ -1,5 +1,6 @@
source: Kattis
license: public domain
name: Hello World!

# Fix memory limit at 512 MB. (Note that for most problems, this
# should not be done. It is only done in this case because we include
2 changes: 1 addition & 1 deletion examples/oddecho/problem.yaml
@@ -2,6 +2,6 @@ license: cc by-sa
author: Johan Sannemo
source: Principles of Algorithmic Problem Solving
type: scoring
- name: Echo
+ name: Odd Echo
grading:
show_test_data_groups: true
Binary file added examples/oddecho/problem_statement/cave.jpg
33 changes: 33 additions & 0 deletions examples/oddecho/problem_statement/problem.sv.md
@@ -0,0 +1,33 @@
**EKO! Eko! Ek...**

![CC-BY-SA 2.0 By William Craig on wikimedia.org](cave.jpg)

Du älskar att skrika i grottor för att höra dina ord ekade tillbaka till dig. Tyvärr, som en hårt arbetande mjukvaruingenjör, har du
inte tid för att komma ut och skrika i grottor så ofta. Istället skulle du vilja implementera ett program som fungerar som en ersättning för en grotta.

Ibland vill du mata in några ord i programmet och få dem ekade tillbaka till dig. Men, som det är välkänt, om du skriker för snabbt i en grotta kan ekot störa de nya ord du säger. [^1] Mer specifikt, vartannat ord du säger kommer att störa ekot av ditt tidigare ord. Därför kommer endast det första, tredje, femte och så vidare ordet faktiskt att producera ett eko.

Din uppgift är att skriva ett program som simulerar detta beteende.

## Indata

Den första raden av indata innehåller ett heltal $N$ ($1 \le N \le 10$).

De följande $N$ raderna innehåller vardera ett ord. Varje ord är högst $100$ bokstäver långt och innehåller endast bokstäverna `a-z`.

## Utdata

Skriv ut de ord som har udda index (dvs. första, tredje, femte och så vidare) i inmatningen.


## Poängsättning

Din lösning kommer att testas på en mängd testfallsgrupper.
För att få poäng för en grupp så måste du klara alla testfall i gruppen.

| Grupp | Poäng | Begränsningar |
|-------|-------|--------------------------|
| 1 | 1 | $N$ är alltid $5$ |
| 2 | 1 | Inga ytterligare begränsningar |

[^1]: [https://sv.wikipedia.org/wiki/Interferens](https://sv.wikipedia.org/wiki/Interferens)
169 changes: 169 additions & 0 deletions problemtools/md2html.py
@@ -0,0 +1,169 @@
#! /usr/bin/env python3
# -*- coding: utf-8 -*-
import argparse
import html
import os
from pathlib import Path
import re
import shutil
import string
import subprocess

import nh3

from . import statement_util


def convert(problem: str, options: argparse.Namespace) -> bool:
    """Convert a Markdown statement to HTML

    Args:
        problem: path to problem directory
        options: command-line arguments. See problem2html.py
    """
    problembase = os.path.splitext(os.path.basename(problem))[0]
    destfile = string.Template(options.destfile).safe_substitute(problem=problembase)

    statement_path = statement_util.find_statement(problem, extension='md', language=options.language)

    if statement_path is None:
        raise FileNotFoundError('No markdown statement found')

    if not os.path.isfile(statement_path):
        raise FileNotFoundError(f'Error! {statement_path} does not exist')

    command = ['pandoc', statement_path, '-t', 'html', '--mathjax']
    statement_html = subprocess.run(command, capture_output=True, text=True, shell=False, check=True).stdout

    statement_html = sanitize_html(problem, statement_html)

    templatepaths = [
        os.path.join(os.path.dirname(__file__), 'templates/markdown_html'),
        '/usr/lib/problemtools/templates/markdown_html',
    ]
    templatepath = next(
        (p for p in templatepaths if os.path.isdir(p) and os.path.isfile(os.path.join(p, 'default-layout.html'))), None
    )

    if templatepath is None:
        raise FileNotFoundError('Could not find directory with markdown templates')

    with open(Path(templatepath) / 'default-layout.html', 'r', encoding='utf-8') as template_file:
        template = template_file.read()

    problem_name = statement_util.get_yaml_problem_name(problem, options.language)
    substitution_params = {
        'statement_html': statement_html,
        'language': options.language,
        'title': html.escape(problem_name) if problem_name else 'Missing problem name',
        'problemid': html.escape(problembase),
    }

    statement_html = template % substitution_params

    samples = statement_util.format_samples(problem)
    # Insert samples at {{nextsample}} and {{remainingsamples}}
    statement_html, remaining_samples = statement_util.inject_samples(statement_html, samples)

    # Insert the remaining samples at the bottom
    # However, footnotes should be below samples
    sample_insertion_position = statement_util.find_footnotes(statement_html)
    if sample_insertion_position is None:
        # No footnotes, so insert at the end
        sample_insertion_position = statement_html.rfind('</body>')
    statement_html = (
        statement_html[:sample_insertion_position] + ''.join(remaining_samples) + statement_html[sample_insertion_position:]
    )

    with open(destfile, 'w', encoding='utf-8', errors='xmlcharrefreplace') as output_file:
        output_file.write(statement_html)

    if options.css:
        shutil.copyfile(os.path.join(templatepath, 'problem.css'), 'problem.css')

    return True
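
The sample handling at the end of `convert` can be illustrated in isolation. What follows is a minimal sketch, not the code in `statement_util` (which this diff does not show): `inject_samples_sketch`, `find_footnotes_sketch` and `place_remaining_samples` are hypothetical stand-ins, and both the regex used to locate the footnote section and the choice to leave surplus `{{nextsample}}` placeholders untouched are assumptions.

```python
import re
from typing import Optional


def inject_samples_sketch(html_text: str, samples: list[str]) -> tuple[str, list[str]]:
    """Fill {{nextsample}} placeholders in order, then {{remainingsamples}} (assumed behavior)."""
    remaining = list(samples)
    while '{{nextsample}}' in html_text and remaining:
        html_text = html_text.replace('{{nextsample}}', remaining.pop(0), 1)
    if '{{remainingsamples}}' in html_text:
        html_text = html_text.replace('{{remainingsamples}}', ''.join(remaining), 1)
        remaining = []
    return html_text, remaining


def find_footnotes_sketch(html_text: str) -> Optional[int]:
    """Return the index where a footnote section starts, or None (assumed markup)."""
    match = re.search(r'<(section|aside)[^>]*class="footnotes[^"]*"', html_text)
    return match.start() if match else None


def place_remaining_samples(html_text: str, samples: list[str]) -> str:
    """Put leftover samples above the footnotes, else before </body>, else at the end."""
    position = find_footnotes_sketch(html_text)
    if position is None:
        position = html_text.rfind('</body>')
    if position == -1:
        # Neither footnotes nor </body>: append instead of slicing at -1
        position = len(html_text)
    return html_text[:position] + ''.join(samples) + html_text[position:]
```

The last fallback is a deliberate difference from the diff: `str.rfind` returns `-1` when `</body>` is absent, and slicing at `-1` would insert the samples before the final character.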


def sanitize_html(problem: str, statement_html: str):
    # Allow footnote ids (the anchor points you jump to)
    def is_fn_id(s):
        pattern_id_top = r'^fn\d+$'
        pattern_id_bottom = r'^fnref\d+$'
        return bool(re.fullmatch(pattern_id_top, s)) or bool(re.fullmatch(pattern_id_bottom, s))

    allowed_classes = ('sample', 'problemheader', 'problembody', 'sampleinteractionwrite', 'sampleinteractionread')

    def is_image_valid(problem_root: str, img_src: str) -> str | None:
        # Check that the image exists and uses an allowed extension
        extension = Path(img_src).suffix
        # TODO: fix svg sanitization and allow svg
        if extension not in statement_util.ALLOWED_IMAGE_EXTENSIONS:
            return f'Unsupported image extension {extension} for image {img_src}'

        source_file = Path(problem_root) / 'statement' / img_src
        if not source_file.exists():
            return f'Resource file {img_src} not found in statement'
        return None

    # Annoying: nh3 will ignore exceptions in attribute_filter
    image_fail_reason: str | None = None

    def attribute_filter(tag, attribute, value):
        if attribute == 'class' and value in allowed_classes:
            return value
        # Newer versions of Pandoc will give class="footnotes footnotes-end-of-document"
        # We don't want to blindly allow any class with footnotes in it, so only allow footnotes
        if attribute == 'class' and 'footnotes' in value:
            return 'footnotes'
        if tag == 'a' and attribute == 'href':
            return value
        if tag in ('li', 'a') and attribute == 'id' and is_fn_id(value):
            return value
        if tag == 'img' and attribute == 'src':
            fail = is_image_valid(problem, value)
            if fail:
                nonlocal image_fail_reason
                image_fail_reason = fail
                return None
            copy_image(problem, value)
            return value
        return None

    statement_html = nh3.clean(
        statement_html,
        link_rel='noopener nofollow noreferrer',
        attribute_filter=attribute_filter,
        tags=nh3.ALLOWED_TAGS | {'img', 'a', 'section'},
        attributes={
            'table': {'class'},
            'aside': {'class'},
            'div': {'class'},
            'section': {'class'},
            'img': {'src'},
            'a': {'href', 'id'},
            'li': {'id'},
        },
    )

    if image_fail_reason:
        assert isinstance(image_fail_reason, str)
        if 'Unsupported' in image_fail_reason:
            raise ValueError(image_fail_reason)
        raise FileNotFoundError(image_fail_reason)

    return statement_html
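
The decisions made by `attribute_filter` can be exercised without nh3 or a problem directory. The sketch below copies the class, `href` and footnote-id rules from the diff into a standalone function; the name `attribute_filter_sketch` is hypothetical, and the `img`/`src` branch is left out because it needs the file system.

```python
import re
from typing import Optional

ALLOWED_CLASSES = ('sample', 'problemheader', 'problembody', 'sampleinteractionwrite', 'sampleinteractionread')
# Footnote anchors as matched in the diff: fn<digits> and fnref<digits>
FOOTNOTE_ID = re.compile(r'fn(ref)?\d+')


def attribute_filter_sketch(tag: str, attribute: str, value: str) -> Optional[str]:
    """Return the attribute value to keep, or None to drop the attribute."""
    if attribute == 'class' and value in ALLOWED_CLASSES:
        return value
    # Any class list mentioning footnotes is collapsed to the single class 'footnotes'
    if attribute == 'class' and 'footnotes' in value:
        return 'footnotes'
    if tag == 'a' and attribute == 'href':
        return value
    if tag in ('li', 'a') and attribute == 'id' and FOOTNOTE_ID.fullmatch(value):
        return value
    return None


decisions = {
    'known class': attribute_filter_sketch('div', 'class', 'sample'),
    'pandoc footnotes': attribute_filter_sketch('section', 'class', 'footnotes footnotes-end-of-document'),
    'unknown class': attribute_filter_sketch('div', 'class', 'evil'),
    'footnote id': attribute_filter_sketch('a', 'id', 'fnref1'),
    'other id': attribute_filter_sketch('a', 'id', 'fn1x'),
}
```

With these inputs, `decisions` keeps `'sample'`, rewrites the Pandoc footnote class list to `'footnotes'`, keeps `'fnref1'`, and drops the other two attributes by returning `None`.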


def copy_image(problem_root: str, img_src: str) -> None:
    """Copy image to output directory

    Args:
        problem_root: the root of the problem directory
        img_src: the image source as in the Markdown statement
    """

    source_name = os.path.join(problem_root, 'statement', img_src)

    if os.path.isfile(img_src):  # already copied
        return
    shutil.copyfile(source_name, img_src)
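
The image check and copy can be tried against a throwaway directory. This is a sketch under stated assumptions: `ALLOWED_IMAGE_EXTENSIONS` below is a stand-in for `statement_util.ALLOWED_IMAGE_EXTENSIONS`, whose real value is not shown in this diff, `image_problem` and `copy_image_to` are hypothetical names, and the copy takes an explicit output directory where the diff writes relative to the current working directory.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional

# Assumption: placeholder for statement_util.ALLOWED_IMAGE_EXTENSIONS
ALLOWED_IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')


def image_problem(problem_root: str, img_src: str) -> Optional[str]:
    """Return a human-readable reason the image is rejected, or None if it is fine."""
    extension = Path(img_src).suffix
    if extension not in ALLOWED_IMAGE_EXTENSIONS:
        return f'Unsupported image extension {extension} for image {img_src}'
    if not (Path(problem_root) / 'statement' / img_src).exists():
        return f'Resource file {img_src} not found in statement'
    return None


def copy_image_to(problem_root: str, img_src: str, out_dir: str) -> Path:
    """Copy the image next to the generated HTML unless it is already there."""
    target = Path(out_dir) / img_src
    if not target.is_file():
        shutil.copyfile(Path(problem_root) / 'statement' / img_src, target)
    return target


with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as out:
    statement_dir = Path(root) / 'statement'
    statement_dir.mkdir()
    (statement_dir / 'cave.jpg').write_bytes(b'placeholder bytes')
    results = [
        image_problem(root, 'cave.jpg'),
        image_problem(root, 'cave.svg'),
        image_problem(root, 'missing.png'),
    ]
    copied = copy_image_to(root, 'cave.jpg', out).read_bytes()
```

Returning a reason string instead of raising mirrors the diff, where the comment notes that nh3 ignores exceptions raised inside `attribute_filter`; the caller converts the reason into `ValueError` or `FileNotFoundError` afterwards.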
2 changes: 1 addition & 1 deletion problemtools/metadata.py
@@ -113,7 +113,7 @@ class Metadata2023_07(BaseModel):

problem_format_version: str
name: dict[str, str] | str
- uuid: UUID
+ uuid: UUID | None = None # UUID *is* mandatory, but we deal with that in verifyproblem for better UX
type: list[ProblemType] | ProblemType = ProblemType.PASS_FAIL
version: str | None = None
credits: dict | str | None = None
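
The idea behind making `uuid` optional in the model while still treating it as mandatory can be sketched with the standard library. This is an illustration only: the real model is a pydantic `BaseModel`, `MetadataSketch` and `check_uuid` are hypothetical names, and the error text is invented for the example rather than taken from verifyproblem.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID


@dataclass
class MetadataSketch:
    """Stand-in for the metadata model: parsing never fails on a missing uuid."""

    name: str
    uuid: Optional[UUID] = None


def check_uuid(metadata: MetadataSketch) -> list[str]:
    """Verification step that reports a missing uuid as a readable error."""
    errors = []
    if metadata.uuid is None:
        errors.append('Missing uuid in problem.yaml')  # hypothetical wording
    return errors


without_uuid = check_uuid(MetadataSketch(name='Hello World!'))
with_uuid = check_uuid(MetadataSketch(name='Hello World!', uuid=UUID(int=1)))
```

Deferring the check lets the verifier collect the missing uuid alongside other findings instead of stopping at a model validation error.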