Provide ability for students to access clean PDFs of Quarto slides #2

@caeyo

For future versions of this class, I'd like to suggest giving students access to a clean PDF of the Quarto slides that we switched to this semester. I definitely appreciate that this slide format is easier for the professors to work with and lets students keep up with revisions to the material. However, I usually take notes by annotating slides live (which I don't think is a particularly uncommon practice), and I found that difficult because Quarto's options for print output are limited. See below how the Reductions slide set gets mangled in PDF export mode; the junk is present on every slide, making the export unusable:

[Screenshot, 2024-12-11: Reductions slide set in PDF export mode]

I recently wrote a script that uses a headless browser and PDF conversion tooling to stitch together screenshots of the slide set, one slide at a time. I've included the code below in case you'd like a baseline to work from. I didn't open a PR because I didn't want to impose a particular method of integration, and the code is fairly hacky. It requires playwright and img2pdf, both installable via pip (pip install playwright img2pdf). Playwright also needs a browser binary, which can be downloaded with: playwright install chromium.

import argparse
import os
from playwright.sync_api import sync_playwright
import img2pdf


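def capture_slides_and_create_pdf(url, output_pdf):
    """Screenshot every slide of the reveal.js deck at url and combine them into output_pdf."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)

        # Wait for the deck to render, then ask reveal.js how many slides it has.
        page.wait_for_selector('.reveal .slides')
        total_slides = page.evaluate('''() => {
            return Reveal.getTotalSlides();
        }''')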

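        screenshots = []
        for i in range(total_slides):
            # Jump to the first slide once, then advance one step per iteration.
            if i == 0:
                page.evaluate('Reveal.slide(0)')
            else:
                page.evaluate('Reveal.next()')

            # Give the slide transition a moment to finish.
            page.wait_for_timeout(100)
            # Force every fragment on the current slide to be visible so the
            # screenshot shows the fully revealed slide.
            page.evaluate('''() => {
                const currentSlide = Reveal.getCurrentSlide();
                const fragments = currentSlide.querySelectorAll('.fragment');
                fragments.forEach(fragment => fragment.classList.add('visible'));
            }''')
            page.wait_for_timeout(100)

            # Temporary PNGs are written to the current working directory.
            screenshot_path = f'slide_{i+1}.png'
            page.screenshot(path=screenshot_path, full_page=True)
            screenshots.append(screenshot_path)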

        browser.close()
    
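    # Combine the screenshots into one PDF, then clean up the temporary PNGs
    # even if the conversion fails.
    try:
        with open(output_pdf, "wb") as f:
            f.write(img2pdf.convert(screenshots))
    finally:
        for screenshot in screenshots:
            os.remove(screenshot)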


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("input", help="HTML filename", type=os.path.abspath)
    parser.add_argument("output", help="PDF filename", type=os.path.abspath)
    args = parser.parse_args()
    capture_slides_and_create_pdf("file://" + args.input, args.output)
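Two parts of the script above are a bit fragile, and I think they could be hardened with only the standard library. Building the URL as "file://" + path does not percent-encode characters such as spaces, and writing slide_N.png into the current working directory means a failure partway through the capture loop can leave stray files behind. The sketch below is only a suggestion: to_file_url and screenshot_paths are hypothetical helper names that are not part of the script, and I have not wired them into it.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


def to_file_url(html_path):
    """Hypothetical helper: turn a local path into a percent-encoded file:// URL."""
    # Path.as_uri() requires an absolute path, so make it absolute first.
    return Path(os.path.abspath(html_path)).as_uri()


@contextmanager
def screenshot_paths(total_slides):
    """Hypothetical helper: yield one PNG path per slide inside a temporary
    directory that is deleted on exit, even if an exception is raised."""
    with tempfile.TemporaryDirectory() as tmp:
        # Zero-padded names keep the files in slide order when sorted.
        yield [
            os.path.join(tmp, f"slide_{i + 1:04d}.png")
            for i in range(total_slides)
        ]
```

If this were adopted, the capture loop and the img2pdf.convert call would both need to run inside the "with screenshot_paths(total_slides) as paths:" block, since the directory and everything in it disappears when the block exits.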

Thanks for the enjoyable semester!
