Markdown support #274
Merged
61 commits
cb3ea10 Add markdown support (Matistjati)
868eb39 Added display math (Matistjati)
6a01b1c Add dependencies for markdown (Matistjati)
05f6372 Style markdown tables (Matistjati)
673773e Remove temp files (Matistjati)
1c64085 Statement fix (Matistjati)
48d18c7 Some refactoring (Matistjati)
08645f5 Added image support in markdown (Matistjati)
a6a1933 Added footnote support (Matistjati)
7627c58 Code cleanup (Matistjati)
1b222ac md -> html works (Matistjati)
712ce3e Make md styling more constistent with latex (Matistjati)
11a2e4c md->pdf and Reorganize code (Matistjati)
480e0ea Better md->pdf tables (Matistjati)
e9b3f8e Interactive samples for pdf (Matistjati)
ad3e801 Remove bplusa (Matistjati)
30d9603 PDF problem name (Matistjati)
efc5c9e Add dependencies (Matistjati)
762599f Add problem names (Matistjati)
2bba9d4 Added problem name to test hello package (Matistjati)
cdd1804 Improve security by running pandoc without shell capabilities (Matistjati)
194c7b1 Refactoring (Matistjati)
554892a Even more refactoring (Matistjati)
d8a4c3e Remove python3-markdown dependency (Matistjati)
7390fb8 Add problem id to pdf and small fixes (Matistjati)
46a7003 Disable html (Matistjati)
770d5da Change to wikimedia example image (Matistjati)
11b6a13 Sanitize image sources (Matistjati)
bfd4703 Remove SVG dependency (Matistjati)
d935771 Better markdown styling (Matistjati)
d55df47 Better sample styling (Matistjati)
a0b3f9f Add \nextsample and \remainingsamples (Matistjati)
cc5f26e Better pdf error handling (Matistjati)
608fe13 Use {{nextsample}} instead of \nextsample (Matistjati)
c3dc3c9 Relax image checking (implied by global regex on filenames) (Matistjati)
6f1698e Add svg dependency (Matistjati)
c6b57c8 Start sanitization + apply feedback (Matistjati)
cfc285c Better sanitization + lots of tests (Matistjati)
5f5d59d problem_statement -> statement (Matistjati)
213f9ac Better md -> pdf sample rendering (Matistjati)
d745f6e Another escape (Matistjati)
d4e27a2 More careful with images (Matistjati)
fdde1a4 Make samplexss more focused (Matistjati)
3ded4a4 Experimentally reuse normal LaTeX rendering (Matistjati)
9134f30 Merge remote-tracking branch 'problemtools/develop' into pandoc (Matistjati)
79b5a5d Use problemtools problem2pdf to handle md -> pdf (Matistjati)
fcda106 Cleanup (Matistjati)
47bda29 librsvg out of focus for this PR (Matistjati)
054448e Ensure nh3 (Matistjati)
ecdb6c4 Remove ghostscript sanitization. If it wasn't used before, it probabl… (Matistjati)
690215f Add nh3 to deb build (Matistjati)
77cb2c9 Linting (Matistjati)
2e7653f Add back ghostscript sanitization (Matistjati)
51f5539 Remove unnecessary test (Matistjati)
898a786 Merged with developed (Matistjati)
63fd2e8 Add nh3 as dependency (Matistjati)
5f96852 Fix test import path (Matistjati)
754f468 Apply ruff formatting (Matistjati)
3063cb0 More robust footnote finding (Matistjati)
a6be656 Don't double-escape HTML in samples (Matistjati)
14e0c84 Ghostscript fixes and tests (Matistjati)
@@ -23,6 +23,7 @@ RUN apt-get update && \
 mono-complete \
 nodejs \
 ocaml-nox \
+pandoc \
 php-cli \
 pypy \
 rustc \
@@ -0,0 +1,20 @@
+I am thinking of a secret number between $1$ and $1000$, can you guess which one?
+Given a guess, I will tell you whether your guess
+was too high, too low, or correct. You only get $10$ guesses, so use
+them wisely!
+
+
+## Interaction
+Your program should print guesses for the number.
+A guess is a line containing only an integer between $1$ and $1000$.
+After each guess you must flush standard output.
+
+After each guess you can read the response from standard input.
+This response is one of three words:
+
+- `lower` if the number I am thinking of is lower than your guess,
+- `higher` if the number I am thinking of is higher than your guess, or
+- `correct` if your guess is correct.
+
+After guessing correctly you should exit your program.
+If you guess wrong $10$ times you get no more chances and your program will be terminated.
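
The interaction described in this statement can be handled with a binary search, which needs at most 10 guesses for the range 1 to 1000. The following is only a sketch of one possible solution, not code from this PR; the names `find_secret` and `ask_judge` are made up for illustration, and the range is taken from the Interaction section of the statement.

```python
import sys
from typing import Callable


def find_secret(ask: Callable[[int], str], low: int = 1, high: int = 1000) -> int:
    """Binary-search the secret number, calling ask(guess) once per guess."""
    while low <= high:
        guess = (low + high) // 2
        answer = ask(guess)
        if answer == 'correct':
            return guess
        if answer == 'lower':
            # The secret number is lower than our guess
            high = guess - 1
        else:
            # 'higher': the secret number is higher than our guess
            low = guess + 1
    raise RuntimeError('judge answers were inconsistent')


def ask_judge(guess: int) -> str:
    """Print one guess, flush standard output, then read the one-word reply."""
    print(guess, flush=True)
    return sys.stdin.readline().strip()
```

A submission would call `find_secret(ask_judge)` and then exit. Passing the judge in as a callable keeps the search logic testable without a real interactive judge.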
File renamed without changes.
File renamed without changes.
@@ -0,0 +1,33 @@
+**ECHO! Echo! Ech...**
+
+
+
+You love shouting in caves to hear your words echoed back to you. Unfortunately, as a hard-working software engineer, you
+do not have time to get out and shout in caves very often. Instead, you would like to implement a program that serves as a replacement for a cave.
+
+Sometimes you want to feed some words into the program and have them echoed back to you. However, as is well known, if you shout too quickly in a cave, the echo can interfere with the new words you say. [^1] More specifically, every other word you say will interfere with the echo of your previous word. Therefore, only the first, third, fifth, and so on, words will actually produce an echo.
+
+Your task is to write a program that simulates this behavior.
+
+## Input
+
+The first line of input contains an integer $N$ ($1 \le N \le 10$).
+
+The following $N$ lines each contain one word. Each word is at most $100$ letters long and contains only the letters `a-z`.
+
+## Output
+
+Print the words that have odd index (i.e. first, third, fifth, and so on) in the input.
+
+
+## Scoring
+
+Your solution will be tested on a set of test case groups.
+To get the points for a group, you must pass all test cases in the group.
+
+| Group | Points | Constraints                    |
+|-------|--------|--------------------------------|
+| 1     | 1      | $N$ is always $5$              |
+| 2     | 1      | No additional constraints      |
+
+[^1]: [https://sv.wikipedia.org/wiki/Interferens](https://sv.wikipedia.org/wiki/Interferens)
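
The task in this statement amounts to keeping every second word, starting with the first. The following is a minimal sketch of one way to solve it, not code from this PR; the function names `odd_echo` and `solve` are made up for illustration.

```python
def odd_echo(words: list[str]) -> list[str]:
    """Return the 1st, 3rd, 5th, ... words (odd positions, counting from 1)."""
    return words[::2]


def solve(input_text: str) -> str:
    """Parse the input format from the statement and build the expected output."""
    lines = input_text.split('\n')
    n = int(lines[0])
    # The N words follow the first line, one per line
    words = [line.strip() for line in lines[1 : n + 1]]
    return '\n'.join(odd_echo(words))
```

A submission could then print `solve(sys.stdin.read())`. Words at odd 1-based positions sit at even 0-based indices, which is why the slice `[::2]` is enough.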
@@ -0,0 +1,169 @@
+#! /usr/bin/env python3
+# -*- coding: utf-8 -*-
+import argparse
+import html
+import os
+from pathlib import Path
+import re
+import shutil
+import string
+import subprocess
+
+import nh3
+
+from . import statement_util
+
+
+def convert(problem: str, options: argparse.Namespace) -> bool:
+    """Convert a Markdown statement to HTML
+
+    Args:
+        problem: path to problem directory
+        options: command-line arguments. See problem2html.py
+    """
+    problembase = os.path.splitext(os.path.basename(problem))[0]
+    destfile = string.Template(options.destfile).safe_substitute(problem=problembase)
+
+    statement_path = statement_util.find_statement(problem, extension='md', language=options.language)
+
+    if statement_path is None:
+        raise FileNotFoundError('No markdown statement found')
+
+    if not os.path.isfile(statement_path):
+        raise FileNotFoundError(f'Error! {statement_path} does not exist')
+
+    command = ['pandoc', statement_path, '-t', 'html', '--mathjax']
+    statement_html = subprocess.run(command, capture_output=True, text=True, shell=False, check=True).stdout
+
+    statement_html = sanitize_html(problem, statement_html)
+
+    templatepaths = [
+        os.path.join(os.path.dirname(__file__), 'templates/markdown_html'),
+        '/usr/lib/problemtools/templates/markdown_html',
+    ]
+    templatepath = next(
+        (p for p in templatepaths if os.path.isdir(p) and os.path.isfile(os.path.join(p, 'default-layout.html'))), None
+    )
+
+    if templatepath is None:
+        raise FileNotFoundError('Could not find directory with markdown templates')
+
+    with open(Path(templatepath) / 'default-layout.html', 'r', encoding='utf-8') as template_file:
+        template = template_file.read()
+
+    problem_name = statement_util.get_yaml_problem_name(problem, options.language)
+    substitution_params = {
+        'statement_html': statement_html,
+        'language': options.language,
+        'title': html.escape(problem_name) if problem_name else 'Missing problem name',
+        'problemid': html.escape(problembase),
+    }
+
+    statement_html = template % substitution_params
+
+    samples = statement_util.format_samples(problem)
+    # Insert samples at {{nextsample}} and {{remainingsamples}}
+    statement_html, remaining_samples = statement_util.inject_samples(statement_html, samples)
+
+    # Insert the remaining samples at the bottom
+    # However, footnotes should be below samples
+    sample_insertion_position = statement_util.find_footnotes(statement_html)
+    if sample_insertion_position is None:
+        # No footnotes, so insert at the end
+        sample_insertion_position = statement_html.rfind('</body>')
+    statement_html = (
+        statement_html[:sample_insertion_position] + ''.join(remaining_samples) + statement_html[sample_insertion_position:]
+    )
+
+    with open(destfile, 'w', encoding='utf-8', errors='xmlcharrefreplace') as output_file:
+        output_file.write(statement_html)
+
+    if options.css:
+        shutil.copyfile(os.path.join(templatepath, 'problem.css'), 'problem.css')
+
+    return True
+
+
+def sanitize_html(problem: str, statement_html: str):
+    # Allow footnote ids (the anchor points you jump to)
+    def is_fn_id(s):
+        pattern_id_top = r'^fn\d+$'
+        pattern_id_bottom = r'^fnref\d+$'
+        return bool(re.fullmatch(pattern_id_top, s)) or bool(re.fullmatch(pattern_id_bottom, s))
+
+    allowed_classes = ('sample', 'problemheader', 'problembody', 'sampleinteractionwrite', 'sampleinteractionread')
+
+    def is_image_valid(problem_root: str, img_src: str) -> str | None:
+        # Check that the image exists and uses an allowed extension
+        extension = Path(img_src).suffix
+        # TODO: fix svg sanitization and allow svg
+        if extension not in statement_util.ALLOWED_IMAGE_EXTENSIONS:
+            return f'Unsupported image extension {extension} for image {img_src}'
+
+        source_file = Path(problem_root) / 'statement' / img_src
+        if not source_file.exists():
+            return f'Resource file {img_src} not found in statement'
+        return None
+
+    # Annoying: nh3 will ignore exceptions in attribute_filter
+    image_fail_reason: str | None = None
+
+    def attribute_filter(tag, attribute, value):
+        if attribute == 'class' and value in allowed_classes:
+            return value
+        # Newer versions of Pandoc will give class="footnotes footnotes-end-of-document"
+        # We don't want to blindly allow any class with footnotes in it, so only allow footnotes
+        if attribute == 'class' and 'footnotes' in value:
+            return 'footnotes'
+        if tag == 'a' and attribute == 'href':
+            return value
+        if tag in ('li', 'a') and attribute == 'id' and is_fn_id(value):
+            return value
+        if tag == 'img' and attribute == 'src':
+            fail = is_image_valid(problem, value)
+            if fail:
+                nonlocal image_fail_reason
+                image_fail_reason = fail
+                return None
+            copy_image(problem, value)
+            return value
+        return None
+
+    statement_html = nh3.clean(
+        statement_html,
+        link_rel='noopener nofollow noreferrer',
+        attribute_filter=attribute_filter,
+        tags=nh3.ALLOWED_TAGS | {'img', 'a', 'section'},
+        attributes={
+            'table': {'class'},
+            'aside': {'class'},
+            'div': {'class'},
+            'section': {'class'},
+            'img': {'src'},
+            'a': {'href', 'id'},
+            'li': {'id'},
+        },
+    )
+
+    if image_fail_reason:
+        assert isinstance(image_fail_reason, str)
+        if 'Unsupported' in image_fail_reason:
+            raise ValueError(image_fail_reason)
+        raise FileNotFoundError(image_fail_reason)
+
+    return statement_html
+
+
+def copy_image(problem_root: str, img_src: str) -> None:
+    """Copy image to output directory
+
+    Args:
+        problem_root: the root of the problem directory
+        img_src: the image source as in the Markdown statement
+    """
+
+    source_name = os.path.join(problem_root, 'statement', img_src)
+
+    if os.path.isfile(img_src):  # already copied
+        return
+    shutil.copyfile(source_name, img_src)
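
The diff above calls `statement_util.inject_samples` and `statement_util.find_footnotes`, but their bodies are not part of this file. To illustrate the placement rule described in the comments (`{{nextsample}}` takes one sample, `{{remainingsamples}}` takes the rest, and leftovers go above the footnotes or before `</body>`), here is a rough sketch. The functions `inject_samples_sketch` and `append_leftover_samples` are hypothetical stand-ins, and the footnote detection assumes pandoc wraps footnotes in an element whose class starts with `footnotes`; the real helpers may work differently.

```python
import re


def inject_samples_sketch(statement_html: str, samples: list[str]) -> tuple[str, list[str]]:
    """Hypothetical stand-in for statement_util.inject_samples (not shown in the diff)."""
    remaining = list(samples)

    def replace(match: re.Match) -> str:
        if match.group(0) == '{{nextsample}}':
            # Consume exactly one sample; emit nothing if none are left
            return remaining.pop(0) if remaining else ''
        # {{remainingsamples}}: consume everything that is left
        chunk = ''.join(remaining)
        remaining.clear()
        return chunk

    result = re.sub(r'\{\{nextsample\}\}|\{\{remainingsamples\}\}', replace, statement_html)
    return result, remaining


def append_leftover_samples(statement_html: str, leftover: list[str]) -> str:
    """Place leftover samples above the footnotes, else just before </body>."""
    # Assumption: footnotes live in an element with class="footnotes..."
    match = re.search(r'<(section|aside|div)[^>]*class="footnotes', statement_html)
    position = match.start() if match else statement_html.rfind('</body>')
    if position == -1:
        # Neither footnotes nor </body>: append at the very end
        position = len(statement_html)
    return statement_html[:position] + ''.join(leftover) + statement_html[position:]
```

Placeholders are processed left to right, so a `{{nextsample}}` that appears after `{{remainingsamples}}` expands to nothing in this sketch.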
guess is a legacy problem, which does/should not support markdown. I'd like for our examples to follow the standard.
One way to fix that would be to convert guess and/or oddecho (where you also added a markdown statement) to 2023-07-draft.