Add vl-convert-fontsource crate and explicit font install#245
Draft
New standalone crate for downloading, caching, and resolving font files from the Fontsource catalog (which includes Google Fonts). Provides disk-based LRU cache, variant filtering, and batch font loading into fontdb::Database via a source-batch registration API. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add from_parts(), clone_fontdb(), and hinting_enabled() to ResolvedFontConfig so workers can construct resolved configs from existing fontdb databases without rebuilding from FontConfig. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce FontBaselineSnapshot (RwLock) so workers clone a pre-resolved fontdb instead of each re-resolving FontConfig on every version bump. Add install_font() for Fontsource downloads, configure_font_cache() for disk cache sizing, and re-export public types from vl-convert-rs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add WorkerFontState and FontRequest types. Workers now clone a shared FontBaselineSnapshot at startup and apply per-request font overlays for request-scoped fonts. SVG/PNG/PDF render commands carry font sources through the command channel so the worker can install them before rendering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
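The baseline-plus-overlay flow described in the two commits above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `ToyFontDb` is a hypothetical stand-in for `fontdb::Database`, and the methods `publish`, `version`, `snapshot`, `refresh`, and `with_overlay` are invented for this sketch rather than taken from the PR. Only the type names `FontBaselineSnapshot` and `WorkerFontState` come from the commit messages.

```rust
use std::sync::RwLock;

/// Toy stand-in for `fontdb::Database`: just a list of family names.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ToyFontDb {
    pub families: Vec<String>,
}

/// Shared baseline: a version counter plus the resolved database.
#[derive(Default)]
pub struct FontBaselineSnapshot {
    inner: RwLock<(u64, ToyFontDb)>,
}

impl FontBaselineSnapshot {
    /// Replace the baseline and bump its version (hypothetical method).
    pub fn publish(&self, db: ToyFontDb) {
        let mut guard = self.inner.write().unwrap();
        guard.0 += 1;
        guard.1 = db;
    }

    /// Current version, cheap to poll.
    pub fn version(&self) -> u64 {
        self.inner.read().unwrap().0
    }

    /// Clone the version and database together under one read lock.
    pub fn snapshot(&self) -> (u64, ToyFontDb) {
        self.inner.read().unwrap().clone()
    }
}

/// Per-worker copy of the baseline.
pub struct WorkerFontState {
    pub version: u64,
    pub db: ToyFontDb,
}

impl WorkerFontState {
    pub fn new(baseline: &FontBaselineSnapshot) -> Self {
        let (version, db) = baseline.snapshot();
        Self { version, db }
    }

    /// Re-clone only when the shared baseline has moved to a new version.
    pub fn refresh(&mut self, baseline: &FontBaselineSnapshot) {
        if baseline.version() != self.version {
            let (version, db) = baseline.snapshot();
            self.version = version;
            self.db = db;
        }
    }

    /// Apply request-scoped fonts, run `render`, then remove them again.
    pub fn with_overlay<R>(
        &mut self,
        request_fonts: &[String],
        render: impl FnOnce(&ToyFontDb) -> R,
    ) -> R {
        let base_len = self.db.families.len();
        self.db.families.extend_from_slice(request_fonts);
        let result = render(&self.db);
        self.db.families.truncate(base_len);
        result
    }
}
```

In this sketch a worker would call `refresh` before each command and wrap the render in `with_overlay`, so request-scoped fonts never leak into the next request. The sketch does not restore the overlay if `render` panics.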
Global CLI flags to download Fontsource fonts before conversion. --install-font accepts a family name (repeatable), --install-font-variants accepts comma-separated "weight:style" pairs to filter variants. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Expose install_font(family, variants) in both sync and asyncio APIs. Add font_cache_size_mb parameter to configure_converter(). Update type stubs to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace path-based font cache with blob cache (metadata + blob bytes), rename public API to use explicit "fontsource" naming throughout (register_fontsource_font, FontsourceFontRequest, --fontsource-font), simplify CLI to single --fontsource-font flag, and wire fontsource_fonts field through VgOpts/VlOpts for per-request font loading. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ories
Replace separate metadata_cache_dir and blob_cache_dir fields with a single cache_dir: Option<PathBuf> and metadata_dir()/blob_dir() helpers that derive metadata/ and blobs/ subdirectories. Simplifies config and adds VL_CONVERT_FONT_CACHE_DIR env var support (path override or "none" to disable persistent caching).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
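A minimal sketch of the single-root layout described above, using only the standard library. The names `cache_dir`, `metadata_dir()`, `blob_dir()`, and `VL_CONVERT_FONT_CACHE_DIR` come from the commit message. The `CacheConfig` struct, the `resolve_cache_dir` helper, and the exact (case-sensitive) matching of `"none"` are assumptions made for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the config struct described above.
#[derive(Debug, Clone, PartialEq)]
pub struct CacheConfig {
    /// `None` means persistent caching is disabled.
    pub cache_dir: Option<PathBuf>,
}

impl CacheConfig {
    /// Metadata lives under `{cache_dir}/metadata`.
    pub fn metadata_dir(&self) -> Option<PathBuf> {
        self.cache_dir.as_deref().map(|dir| dir.join("metadata"))
    }

    /// Font blobs live under `{cache_dir}/blobs`.
    pub fn blob_dir(&self) -> Option<PathBuf> {
        self.cache_dir.as_deref().map(|dir| dir.join("blobs"))
    }
}

/// Interpret an override such as the value of `VL_CONVERT_FONT_CACHE_DIR`:
/// "none" disables caching, any other value is used as the cache root, and
/// an unset variable falls back to `default` (which may itself be `None`).
pub fn resolve_cache_dir(env_value: Option<&str>, default: Option<&Path>) -> Option<PathBuf> {
    match env_value {
        Some("none") => None,
        Some(path) => Some(PathBuf::from(path)),
        None => default.map(Path::to_path_buf),
    }
}
```

A caller could pass `std::env::var("VL_CONVERT_FONT_CACHE_DIR").ok().as_deref()` as `env_value`. Keeping the environment lookup outside the helper makes the logic easy to test.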
…n fix
- Add with_font_overlay! macro in converter.rs to eliminate 12 copy-pasted apply/clear overlay blocks in handle_command match arms
- Replace hand-rolled retry loops with backon crate's ExponentialBuilder for proper exponential backoff on retryable HTTP errors
- Extract is_ttf_file() helper in cache.rs and apply to evict_blob_lru_until_size so eviction only considers .ttf files (matching calculate_blob_cache_size_bytes)
- Fix pre-existing CI failures: duplicate ..Default::default(), missing fontsource_fonts field in test VlOpts, and formatting in lib.rs/tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
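The eviction rule mentioned above (only `.ttf` files count toward the size and only `.ttf` files are evicted) can be illustrated with an in-memory planner. This is a sketch, not the crate's code: `CacheEntry` and `plan_lru_eviction` are hypothetical, and the real `evict_blob_lru_until_size` presumably works on files on disk. Only the `is_ttf_file` name is taken from the commit message.

```rust
use std::time::SystemTime;

/// Hypothetical in-memory description of one file in the blob directory.
#[derive(Debug, Clone)]
pub struct CacheEntry {
    pub name: String,
    pub size_bytes: u64,
    pub last_used: SystemTime,
}

/// Plays the role of the `is_ttf_file()` helper named in the commit.
fn is_ttf_file(name: &str) -> bool {
    name.ends_with(".ttf")
}

/// Return the names to evict, least recently used first, until the total
/// size of the font files is at most `max_bytes`. Non-font files are
/// neither counted nor evicted.
pub fn plan_lru_eviction(entries: &[CacheEntry], max_bytes: u64) -> Vec<String> {
    let mut fonts: Vec<&CacheEntry> = entries
        .iter()
        .filter(|entry| is_ttf_file(&entry.name))
        .collect();
    let mut total: u64 = fonts.iter().map(|entry| entry.size_bytes).sum();

    // Oldest access time first.
    fonts.sort_by_key(|entry| entry.last_used);

    let mut evicted = Vec::new();
    for entry in fonts {
        if total <= max_bytes {
            break;
        }
        total -= entry.size_bytes;
        evicted.push(entry.name.clone());
    }
    evicted
}
```

Using the same predicate for both the size calculation and the eviction loop is what keeps the two in agreement, which is the point of the fix described in the commit.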
The clone is required because the async closure passed to buffer_unordered needs owned ResolvedTtfFile values. Without it, Rust 1.93 raises "implementation of FnOnce is not general enough" due to higher-ranked lifetime requirements. Suppress the clippy redundant_iter_cloned lint for this call. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ycle
- Blob cache keys are now SHA-256 hashes of the download URL instead of
human-readable {font_id}--{subset}-{weight}-{style}.ttf names, making
collisions impossible and removing the filename field from ResolvedTtfFile.
- Blob files use .blob extension instead of .ttf.
- cache_dir is now Option<PathBuf>; set VL_CONVERT_FONT_CACHE_DIR=none to
disable persistent caching entirely. Fall back to no caching (rather than
temp_dir) when dirs::cache_dir() is unavailable.
- Self-heal corrupt metadata (bad JSON) and blob entries (directory at blob
path) instead of silently returning stale data.
- DownloadGate tracks active_users so gates are pruned from the DashMap when
the last consumer releases, preventing unbounded map growth.
- Async ensure_blobs preserves insertion order via index sorting.
- New tests for gate lifecycle, corrupt entry self-healing, and hash-agnostic
eviction verification.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
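The gate lifecycle in the list above (count active users, prune the entry when the last one releases) can be sketched as follows. This sketch swaps the `DashMap` for a plain `Mutex<HashMap>` so that it needs only the standard library, and it models the bookkeeping only, not the download serialization itself. `GateRegistry`, `GateGuard`, `acquire`, and `gate_count` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical registry: maps a download key to its number of active users.
#[derive(Default)]
pub struct GateRegistry {
    gates: Mutex<HashMap<String, usize>>,
}

/// Held while a caller is using the gate for `key`; releases on drop.
pub struct GateGuard<'a> {
    registry: &'a GateRegistry,
    key: String,
}

impl GateRegistry {
    /// Register one more user of the gate for `key`, creating it if needed.
    pub fn acquire(&self, key: &str) -> GateGuard<'_> {
        let mut gates = self.gates.lock().unwrap();
        *gates.entry(key.to_string()).or_insert(0) += 1;
        GateGuard { registry: self, key: key.to_string() }
    }

    /// Number of gates currently tracked.
    pub fn gate_count(&self) -> usize {
        self.gates.lock().unwrap().len()
    }
}

impl Drop for GateGuard<'_> {
    fn drop(&mut self) {
        let mut gates = self.registry.gates.lock().unwrap();
        let remaining = match gates.get_mut(&self.key) {
            Some(users) => {
                *users -= 1;
                *users
            }
            None => return,
        };
        // The last user removes the entry so the map cannot grow without bound.
        if remaining == 0 {
            gates.remove(&self.key);
        }
    }
}
```

Tying the release to `Drop` means an early return or error in the caller still decrements the count.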
…omments
- Extract triple-nested HashMap into pub type VariantMap.
- VariantsNotAvailable error now holds Vec<VariantRequest> instead of Vec<String>.
- Add Display impl for VariantRequest (formats as "400-normal").
- Remove banner-style section comments in cache.rs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
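A `Display` impl producing the `"400-normal"` form might look like the sketch below. The type names `VariantRequest` and `FontStyle` appear in the PR. The field names, the `u16` weight, and the two style variants shown here are assumptions.

```rust
use std::fmt;

/// Assumed shape of the style enum; the real one may have more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum FontStyle {
    Normal,
    Italic,
}

impl fmt::Display for FontStyle {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            FontStyle::Normal => "normal",
            FontStyle::Italic => "italic",
        })
    }
}

/// Assumed shape of a variant request: a numeric weight plus a style.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct VariantRequest {
    pub weight: u16,
    pub style: FontStyle,
}

impl fmt::Display for VariantRequest {
    /// Formats as "{weight}-{style}", for example "400-normal".
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}-{}", self.weight, self.style)
    }
}
```

Holding typed `VariantRequest` values in the error and formatting them only at display time keeps the error matchable in code while still printing readable messages.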
read_blob now validates TTF/OTF/TTC magic bytes before returning cached data. Corrupt or truncated files are deleted and treated as cache misses, triggering a re-download. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
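The magic-byte check can be sketched as below. The four signatures are the standard leading tags for TrueType, OpenType/CFF, and collection files. Which of them the actual `read_blob` accepts is not stated in the commit, so treat the list and the helper names `has_font_magic` and `validate_cached_blob` as assumptions.

```rust
/// Leading four-byte signatures of common font container formats.
const FONT_MAGICS: [[u8; 4]; 4] = [
    [0x00, 0x01, 0x00, 0x00], // TrueType outlines (sfnt version 1.0)
    *b"OTTO",                 // OpenType with CFF outlines
    *b"true",                 // legacy Apple TrueType
    *b"ttcf",                 // TrueType/OpenType collection
];

/// Returns true when `bytes` begins with a known font signature.
pub fn has_font_magic(bytes: &[u8]) -> bool {
    FONT_MAGICS.iter().any(|magic| bytes.starts_with(magic))
}

/// Treat data without a valid signature as a cache miss (hypothetical helper).
pub fn validate_cached_blob(bytes: Vec<u8>) -> Option<Vec<u8>> {
    if has_font_magic(&bytes) {
        Some(bytes)
    } else {
        None
    }
}
```

A check like this is cheap and catches truncated downloads or HTML error pages saved as blobs, though it does not prove that the rest of the file parses.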
…eature flag
Isolate fontdb dependency behind an optional `fontdb` feature flag so the core Fontsource client (API, caching, variant resolution) compiles without fontdb. The fontdb integration (RegisteredFontBatch, FontsourceDatabaseExt) lives in a feature-gated fontdb_ext module. LoadedFontBatch now holds raw Vec<Arc<Vec<u8>>> bytes; consumers construct fontdb::Source at the call site. Also renames FontsourceFontdbError to FontsourceError.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sion trait
Replace Vec<Arc<Vec<u8>>> with Vec<LoadedFontBatch> in VlConvertCommand variants so the overlay uses register_fontsource_batch/unregister_fontsource_batch directly. This eliminates all manual fontdb::Source::Binary construction from converter.rs; fontdb_ext.rs is now the single fontdb integration seam.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract shared helpers (validate_load_request, try_read_cached_metadata, cache_metadata, parse_metadata_response, try_read_cached_blob, cache_blob) to eliminate duplication between async and blocking code paths. Inline prepare_load_async/prepare_load_blocking into load/load_blocking and delete the redundant PreparedLoad struct. Add brief rustdoc comments to all methods and remove block separator comments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…scope
Replace manual work-stealing thread pool (shared VecDeque queue, Arc<Mutex> results, AtomicUsize counter, first-error propagation) with files.chunks() partitioning. Threads run concurrently on pre-assigned chunks, results stay in order naturally, errors propagate via ?.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
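The chunk-partitioning approach can be sketched generically with `std::thread::scope`. The function name `process_in_chunks` and its signature are hypothetical. The commit only states that `files.chunks()` is used, that results stay in order, and that errors propagate via `?`.

```rust
use std::thread;

/// Run `work` over `items` on up to `max_threads` scoped threads.
/// Each thread gets one contiguous chunk, so joining the handles in spawn
/// order yields results in input order without any sorting.
pub fn process_in_chunks<T, R, E, F>(
    items: &[T],
    max_threads: usize,
    work: F,
) -> Result<Vec<R>, E>
where
    T: Sync,
    R: Send,
    E: Send,
    F: Fn(&T) -> Result<R, E> + Sync,
{
    if items.is_empty() {
        return Ok(Vec::new());
    }
    let threads = max_threads.clamp(1, items.len());
    let chunk_size = items.len().div_ceil(threads);
    let work = &work;

    // Spawn every chunk first, then join, so the chunks run concurrently.
    let per_chunk: Vec<Result<Vec<R>, E>> = thread::scope(|scope| {
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(work).collect()))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    });

    let mut out = Vec::with_capacity(items.len());
    for chunk_result in per_chunk {
        // The first failing chunk (in input order) decides the error.
        out.extend(chunk_result?);
    }
    Ok(out)
}
```

Compared with a shared work queue, pre-assigned chunks can balance load less evenly when items differ in cost, but they remove the shared queue, the result mutex, and the manual error bookkeeping.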
Run cargo fmt and bundle-licenses to fix CI failures after the fontsource-fontdb -> fontsource crate rename. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When blob_dir is None, the gate serializes downloads without benefit since waiters would just re-download anyway. Skip directly to the download; max_parallel_downloads still applies at the caller level. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add FontsourceError::RelativeCacheDir and return it from FontsourceClient::new when cache_dir is not absolute, matching the convention of XDG_CACHE_HOME, CARGO_HOME, etc. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
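The absolute-path check could be as small as the sketch below. `FontsourceError::RelativeCacheDir` is named in the commit. The `PathBuf` payload and the free function `validate_cache_dir` are assumptions, since the real check lives inside `FontsourceClient::new`.

```rust
use std::path::{Path, PathBuf};

/// Reduced, hypothetical version of the crate's error enum.
#[derive(Debug, PartialEq)]
pub enum FontsourceError {
    /// The configured cache directory was not an absolute path.
    RelativeCacheDir(PathBuf),
}

/// Accept `None` (caching disabled) or an absolute path; reject relative ones.
pub fn validate_cache_dir(cache_dir: Option<&Path>) -> Result<(), FontsourceError> {
    match cache_dir {
        Some(dir) if !dir.is_absolute() => {
            Err(FontsourceError::RelativeCacheDir(dir.to_path_buf()))
        }
        _ => Ok(()),
    }
}
```

Rejecting relative paths up front avoids a cache whose location silently depends on the process's working directory.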
Document async vs blocking client initialization, dedupe_variants intent, Drop impl rationale, and ClientConfig field descriptions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verifies that with cache_dir: None, every load re-fetches both metadata and blobs from the network. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove unused ext.rs (duplicate of fontdb_ext.rs), add comment explaining tokio::sync::Mutex in blocking context, and apply fmt. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Add support for downloading, caching, and registering fonts from the Fontsource catalog (which includes Google Fonts and other open-source font families). Fonts are fetched on demand, cached to disk with LRU eviction, and loaded into `fontdb` for use in Vega/Vega-Lite rendering.
Motivation
vl-convert renders charts to static images using server-side text layout, which requires the actual font files. Previously, users had to manually download font files and point vl-convert at the directory. This PR adds automatic font downloading from Fontsource so users can register any of ~1,800 open-source font families by name.
Usage
CLI
Python
Rust
Per-request fonts via `VlOpts`/`VgOpts`:
Per-request fonts don't remain in memory after the conversion, so they are a better fit for long-running server processes. This is also the foundation that automatic font detection will use in the following PR.
Architecture
New crate: `vl-convert-fontsource`
Self-contained crate with no dependencies on other vl-convert workspace crates. The core library (API client, disk cache, variant resolution) has no `fontdb` dependency; `fontdb` integration is behind an optional `fontdb` feature flag. This design means the crate could be spun out of the vl-convert workspace as an independent Fontsource client if there is future interest.
Core functionality (no `fontdb` required):
- … (`/v1/fonts/{id}`), cached to `{cache_dir}/metadata/`
- … `{cache_dir}/blobs/`
- … (`fs4`)
With `fontdb` feature:
- `FontsourceDatabaseExt` trait for batch register/unregister on `fontdb::Database`
- `RegisteredFontBatch` for tracking registered face IDs
Public API: `FontsourceClient`, `ClientConfig`, `LoadedFontBatch`, `VariantRequest`, `FontStyle`, `FontsourceError`. With `fontdb` feature: `FontsourceDatabaseExt`, `RegisteredFontBatch`.
Integration: worker font overlay system
The converter worker pool gains a font overlay mechanism:
- `FontBaselineSnapshot`: Shared `RwLock` snapshot of the resolved `fontdb::Database`, cloned by workers at startup.
- `Vec<LoadedFontBatch>` flows through the `VlConvertCommand` enum. Workers call `register_fontsource_batch` to temporarily load request-scoped fonts, then `unregister_fontsource_batch` after rendering. Fonts specified via the `VlOpts`/`VgOpts` `fontsource_fonts` field are transient.
- `register_fontsource_font()` adds fonts to the global `FONT_CONFIG` and bumps the baseline version, making them available to all subsequent requests.
Other changes in converter.rs
`VlConvertCommand` variants (`VgToJpeg`, `VgToPdf`, `VlToJpeg`, `VlToPdf`, `SvgToPng`, `SvgToJpeg`, `SvgToPdf`) route all conversion paths through the worker command channel. This was necessary so per-request fontsource fonts can be applied consistently before any render.
Environment variables
- `VL_CONVERT_FONT_CACHE_DIR`: … (`"none"` to disable disk caching)
- `VL_CONVERT_FONTSOURCE_API_URL`: …
Default cache location: `{dirs::cache_dir()}/vl-convert/fontsource/` (e.g., `~/Library/Caches/vl-convert/fontsource/` on macOS, `~/.cache/vl-convert/fontsource/` on Linux)
Testing
Integration test suite in `vl-convert-fontsource/tests/load_fontsource.rs` with a custom `TestServer` that tracks per-endpoint hit counts and the maximum number of in-flight concurrent requests.
Coverage includes:
- … `FontsourceDatabaseExt`, idempotent unregister, async/blocking parity
Existing vl-convert-rs end-to-end tests continue to pass.
Known Limitations
- `fontdb` requires TTF/OTF. Fontsource provides TTF for all fonts, so no fonts are excluded by this.
- `--fontsource-font` downloads all available variants. The Python API supports variant selection via the `variants` parameter.