4 changes: 2 additions & 2 deletions Project.toml
@@ -12,7 +12,7 @@ GeometryBasics = "5c1252a2-5f33-56bf-86c9-59e7332b4326"
LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
REPL = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
RelocatableFolders = "05181044-ff0b-4ac5-8273-598c1e38db00"
UnicodeFun = "1cfade01-22cf-5700-b092-accc4b62d6e1"
UnPack = "3a884ed6-31ef-47d7-9d2a-63182c4928ed"

[compat]
AbstractTrees = "0.3, 0.4"
@@ -22,5 +22,5 @@ FreeTypeAbstraction = "0.10"
GeometryBasics = "0.4.1, 0.5"
LaTeXStrings = "1.2"
RelocatableFolders = "0.1, 0.2, 0.3, 1"
UnicodeFun = "0.4"
UnPack = "1.0.2"
julia = "1.6"
21 changes: 18 additions & 3 deletions README.md
@@ -10,8 +10,13 @@ This is a package aimed at providing a pure Julia engine for LaTeX math mode. It

# Fonts

The characters in a math expression come from a variety of fonts depending on their role (most notably italic for variable, regular for functions, math for the symbols).
A set of such font forms a `FontFamily`, and several are predefined here (NewComputerModer, TeXGyreHeros, TeXGyrePagella, and LucioleMath)
When a string is parsed, a character can occur as a text character or within
a math expression, and a font has to be chosen accordingly.
The characters in a text expression come from a variety of fonts,
depending on whether they should be rendered upright or italic or with some other style.
For characters in mathematical expressions, usually a single [mathematical font](https://en.wikipedia.org/wiki/OpenType#Math) is used,
and stylized glyphs are chosen from the [Unicode maths blocks](https://en.wikipedia.org/wiki/Mathematical_operators_and_symbols_in_Unicode).
The set of all fonts forms a `FontFamily`. Several families are predefined here (NewComputerModern, TeXGyreHeros, TeXGyrePagella, and LucioleMath)
and can be accessed with `FontFamily(name)`.

A font family is defined by a dictionary of the paths of the font files:
@@ -173,7 +178,9 @@ The table below contains the list of all supported LaTeX construction and their
| Group | `{ }` | `:group` | `elements...` |
| Inline math | `$ $` | `:inline_math` | `content` |
| Integral | `\int_a^b` | `:integral` | `symbol, low_bound, high_bound` |
| Math fonts | `\mathrm{}` | `:font` | `font_modifier, expr` |
| Math glyph substitution[^1] | `\symit{}` | `:sym` | `font_modifier, expr` |
| Text fonts | `\textit{}` | `:text` | `font_modifier, expr` |
| Math fonts[^2] | `\mathrm{}` | `:mathfont` | `font_modifier, expr` |
| Punctuation | `!` | `:punctuation` |
| Simple delimiter | `(` | `:delimiter` |
| Square root | `\sqrt{2}` | `:sqrt` | `content` |
@@ -182,6 +189,14 @@ The table below contains the list of all supported LaTeX construction and their
| Subscript and superscript | `x_0^2` | `:decorated` | `core, subscript, superscript` |
| Symbol with script under and/or over it | `\sum_i^k` | `:underover` | `symbol, under, over` |

[^1]: By default, glyph substitutions loosely imitate the `math-style=TeX` setting of the LaTeX package [`unicode-math`](https://ctan.org/pkg/unicode-math?lang=en).
The `\symXX` commands are controlled by the fields `unicode_math_substitutions` and `unicode_math_aliases` of a `FontFamily`.
The `FontFamily` constructor accepts the `unicode_math_config` keyword argument to switch to other predefined styling conventions, e.g. `FontFamily(fonts; unicode_math_config=MathTeXEngine.UCMConfig(; math_style_spec=:iso))`.

[^2]: The behavior of the `:mathfont` command can be controlled with the `mathfont_command_mapping` field of a `FontFamily`.
By default, these commands are **not** font switches, but alias the corresponding `:sym` commands, so that glyphs are substituted and the `:math` font is used.
To have `\mathbf` act like `\textbf`, add an entry `:bf => (:text, :bf)` to the `mathfont_command_mapping` dict of the current font family.
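
The remapping described in the second footnote can be pictured with a plain `Dict`. This is only a sketch: the variable `mapping_sketch`, the helper `resolve_mathfont_sketch` and the default entries are assumptions made for illustration, and how the dict is reached on a real `FontFamily` is not shown here.

```julia
# Hypothetical stand-in for the `mathfont_command_mapping` dict of a font family.
# The default entries are assumed: each command aliases the matching `:sym` command.
mapping_sketch = Dict{Symbol,Tuple{Symbol,Symbol}}(
    :bf => (:sym, :bf),
    :it => (:sym, :it),
)

# Hypothetical lookup: fall back to the `:sym` alias when no entry exists.
resolve_mathfont_sketch(mapping, modifier::Symbol) =
    get(mapping, modifier, (:sym, modifier))

# Make `\mathbf` behave like `\textbf`, as the footnote describes.
mapping_sketch[:bf] = (:text, :bf)

resolve_mathfont_sketch(mapping_sketch, :bf)  # now a text-font switch
resolve_mathfont_sketch(mapping_sketch, :it)  # still a glyph substitution
```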

## Parser examples

### Basic examples
6 changes: 5 additions & 1 deletion src/MathTeXEngine.jl
@@ -6,8 +6,12 @@ using AbstractTrees
using Automa
using FreeTypeAbstraction
using LaTeXStrings
using UnicodeFun

include("UnicodeMath/src/UnicodeMath.jl")
import .UnicodeMath as UCM
import .UnicodeMath: UCMConfig

import UnPack: @unpack
using DataStructures: Stack
using GeometryBasics: Point2f, Rect2f
using REPL.REPLCompletions: latex_symbols
21 changes: 21 additions & 0 deletions src/UnicodeMath/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Manuel B. Berkemeier <mmanberk@protonmail.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
141 changes: 141 additions & 0 deletions src/UnicodeMath/README.md
@@ -0,0 +1,141 @@
# UnicodeMath

[![Build Status](https://github.com/manuelbb-upb/UnicodeMath.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/manuelbb-upb/UnicodeMath.jl/actions/workflows/CI.yml?query=branch%3Amain)

A small Julia package inspired by the great LaTeX package
[`unicode-math`](https://ctan.org/pkg/unicode-math?lang=en) that is available under
[the LaTeX Project Public License 1.3c](https://ctan.org/license/lppl1.3c).

This project is not affiliated with `unicode-math`. The authors of `unicode-math` are
not responsible for this code and do not offer support for it.

## About
This module offers configurable Unicode glyph substitutions for Julia `Char`s or `AbstractString`s.
Specifically, the commands
```
symup # upright shape
symit # italic/slanted shape
symbfup # bold upright
symbfit # bold italic
symsfup # sans-serif upright
symsfit # sans-serif italic
symbfsfup # bold sans-serif upright
symtt # monospaced
symbb # blackboard
symbbit # blackboard italic
symcal # calligraphic
symbfcal # bold calligraphic
symfrak # Fraktur
symbffrak # bold Fraktur
```
take as input a `Char` and, if applicable, return the correspondingly styled `Char`, e.g., as defined in the
[Alphanumeric Symbols Unicode block](https://en.wikipedia.org/wiki/Mathematical_operators_and_symbols_in_Unicode#Mathematical_Alphanumeric_Symbols_block).
These commands have direct equivalents in the LaTeX package `unicode-math` and are similar to commands in
[`UnicodeFun.jl`](https://github.com/SimonDanisch/UnicodeFun.jl):
```
to_blackboardbold
to_boldface
to_italic
to_caligraphic
to_frakture
to_latex
```

If there is no symbol defined for the requested style, the input `Char` is returned.
Input strings are parsed one `Char` at a time.
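
The fallback rule and the per-`Char` handling of strings can be illustrated with a minimal sketch that does not use this module's code. `to_bold_sketch` is a hypothetical helper written only for illustration; it covers just the ASCII letters, which map contiguously onto the Mathematical Bold range starting at U+1D400.

```julia
# Hypothetical helper, not part of UnicodeMath: substitute ASCII letters with
# their Mathematical Bold counterparts and return any other Char unchanged.
function to_bold_sketch(c::Char)
    if 'A' <= c <= 'Z'
        return Char(0x1D400 + (c - 'A'))  # bold capitals start at U+1D400
    elseif 'a' <= c <= 'z'
        return Char(0x1D41A + (c - 'a'))  # bold small letters start at U+1D41A
    end
    return c  # no bold glyph known to this sketch: keep the input
end

# Strings are handled one Char at a time.
to_bold_sketch(s::AbstractString) = map(to_bold_sketch, s)

to_bold_sketch("BX 1")  # letters are replaced; the space and the digit are kept
```

Other styles are less regular: the italic range, for example, has a gap at `h`, so a real implementation needs lookup tables rather than a single offset.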

Internally, we define **alphabets**, e.g.
```
:latin # lowercase Latin letters
:Latin # uppercase Latin letters
:greek # lowercase Greek letters
:Greek # uppercase Greek letters
:num # digits 0-9
```
Not every style is available for every character of every alphabet.
Please refer to the `unicode-math` manual for details, especially Table 7, or check the source file `apply_style.jl`.

Besides the direct substitution commands, there are also
```
_sym # normalization
symbf # bold
symsf # sans-serif
symbfsf # bold sans-serif
```
These commands depend on a style configuration.
For example, `symbf` could return bold upright glyphs (`bold_style=:upright`),
bold italic glyphs (`bold_style=:italic`), or bold glyphs whose shape follows
the shape of the input (`bold_style=:literal`).

The configuration is expressed by a `UCMConfig` object and is hierarchical.
The `math_style_spec` value (`:tex` (default), `:iso`, `:french`, `:upright`, `:literal`) determines
preconfigured values for `normal_style_spec`, `bold_style_spec` and `sans_style`, as well as styling
information for the `:nabla` and `:partial` glyphs.
Both `normal_style_spec` and `bold_style_spec` in turn define `normal_style` and `bold_style`
values for the Latin and Greek alphabets.
These defaults can be overridden with the corresponding keyword arguments of the `UCMConfig`
constructor.
The following values are valid:
* `math_style_spec`: `:tex, :iso, :french, :upright, :literal`
* `normal_style_spec`: `:iso, :tex, :french, :upright, :literal`
or a `NamedTuple` with fields `:Greek, :greek, :Latin, :latin` and values `:upright, :italic, :literal`.
* `bold_style_spec`: `:iso, :tex, :upright, :literal`
or a `NamedTuple` with fields `:Greek, :greek, :Latin, :latin` and values `:upright, :italic, :literal`.
* `sans_style`, `partial`, `nabla`: `:upright, :italic, :literal`
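
The idea of presets that can be overridden per alphabet can be sketched as follows. This is not the implementation used by this module: `NORMAL_DEFAULTS_SKETCH` and `resolve_normal_style_sketch` are hypothetical names, and the table entries are read off the example outputs in this README, so they should be treated as assumptions.

```julia
# Illustrative presets for regular-weight letters, keyed by math style.
# Assumed values, not the tables used by UnicodeMath itself.
const NORMAL_DEFAULTS_SKETCH = Dict(
    :tex     => (Latin=:italic,  latin=:italic,  Greek=:upright, greek=:italic),
    :iso     => (Latin=:italic,  latin=:italic,  Greek=:italic,  greek=:italic),
    :french  => (Latin=:upright, latin=:italic,  Greek=:upright, greek=:upright),
    :upright => (Latin=:upright, latin=:upright, Greek=:upright, greek=:upright),
    :literal => (Latin=:literal, latin=:literal, Greek=:literal, greek=:literal),
)

# Hypothetical resolver: start from the preset, then let keyword arguments win.
function resolve_normal_style_sketch(math_style::Symbol; overrides...)
    return merge(NORMAL_DEFAULTS_SKETCH[math_style], (; overrides...))
end

resolve_normal_style_sketch(:tex)                 # preset only
resolve_normal_style_sketch(:tex; Greek=:italic)  # override a single alphabet
```

Merging `NamedTuple`s keeps the preset for every alphabet that is not mentioned explicitly, which mirrors the hierarchy described above.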

For `_sym` and the `sym` commands, a global configuration is set via `global_config!(cfg)`
or `global_config!(; kwargs...)`.
Alternatively, the lower-level `apply_style` function can be called with a configuration directly, as shown in the examples.

## Examples

### Basic Formatting
Use `apply_style` to format a `Char` or an `AbstractString` according to some configuration:
```julia-repl
julia> import UnicodeMath as UCM
julia> src = "BX 𝐵𝑋 ∇ 𝛁 𝜕 𝝏 𝜶𝜷 αβ 𝚪𝚵 𝜵 az 𝑎𝑧 𝛤𝛯 𝛻 ∂ 𝛛 ΓΞ 𝛼𝛽 1 𝜞𝜩 𝛂𝛃"
julia> cfg_tex = UCM.UCMConfig(; math_style_spec=:tex)
julia> UCM.apply_style(src, cfg_tex)
"𝐵𝑋 𝐵𝑋 ∇ 𝛁 𝜕 𝝏 𝜶𝜷 𝛼𝛽 𝚪𝚵 𝛁 𝑎𝑧 𝑎𝑧 ΓΞ ∇ 𝜕 𝝏 ΓΞ 𝛼𝛽 1 𝚪𝚵 𝜶𝜷"
```

The same keyword arguments that define a `UCMConfig` can be passed to `apply_style` directly:
```julia-repl
julia> UCM.apply_style(src; math_style_spec=:iso)
"𝐵𝑋 𝐵𝑋 ∇ 𝛁 𝜕 𝝏 𝜶𝜷 𝛼𝛽 𝜞𝜩 𝛁 𝑎𝑧 𝑎𝑧 𝛤𝛯 ∇ 𝜕 𝝏 𝛤𝛯 𝛼𝛽 1 𝜞𝜩 𝜶𝜷"
julia> UCM.apply_style(src; math_style_spec=:upright)
"BX BX ∇ 𝛁 ∂ 𝛛 𝛂𝛃 αβ 𝚪𝚵 𝛁 az az ΓΞ ∇ ∂ 𝛛 ΓΞ αβ 1 𝚪𝚵 𝛂𝛃"
julia> UCM.apply_style(src; math_style_spec=:french)
"BX BX ∇ 𝛁 ∂ 𝛛 𝛂𝛃 αβ 𝚪𝚵 𝛁 𝑎𝑧 𝑎𝑧 ΓΞ ∇ ∂ 𝛛 ΓΞ αβ 1 𝚪𝚵 𝛂𝛃"
```

### Target Style

A target style can be forced by passing it as the second argument:
```julia-repl
julia> UCM.apply_style(src, :bfup; math_style_spec=:iso)
"𝐁𝐗 𝐁𝐗 𝛁 𝛁 𝛛 𝛛 𝛂𝛃 𝛂𝛃 𝚪𝚵 𝛁 𝐚𝐳 𝐚𝐳 𝚪𝚵 𝛁 𝛛 𝛛 𝚪𝚵 𝛂𝛃 𝟏 𝚪𝚵 𝛂𝛃"
```

The styles `:bf`, `:sf` and `:bfsf` still depend on the configuration:
```julia-repl
julia> UCM.apply_style(src, :bf; math_style_spec=:iso)
"𝑩𝑿 𝑩𝑿 𝛁 𝛁 𝝏 𝝏 𝜶𝜷 𝜶𝜷 𝚪𝚵 𝜵 𝒂𝒛 𝒂𝒛 𝜞𝜩 𝛁 𝝏 𝛛 𝜞𝜩 𝜶𝜷 𝟏 𝜞𝜩 𝛂𝛃"
```
In this example, glyphs that are already bold are left unchanged; all other Latin and Greek letters are replaced by bold italic glyphs.

### Global Commands
Apply the default styling (`math_style_spec=:tex`): regular-weight letters are italic, except for uppercase Greek letters, which are upright; bold letters are upright, except for lowercase Greek letters, which are slanted:
```julia-repl
julia> UCM._sym(src)
"𝐵𝑋 𝐵𝑋 ∇ 𝛁 𝜕 𝝏 𝜶𝜷 𝛼𝛽 𝚪𝚵 𝛁 𝑎𝑧 𝑎𝑧 ΓΞ ∇ 𝜕 𝝏 ΓΞ 𝛼𝛽 1 𝚪𝚵 𝜶𝜷"
```
Change the configuration:
```julia-repl
julia> UCM.global_config!(;normal_style_spec=:upright)
```
Now regular-weight letters are all upright:
```julia-repl
julia> UCM._sym(src)
"BX BX ∇ 𝛁 𝜕 𝝏 𝜶𝜷 αβ 𝚪𝚵 𝛁 az az ΓΞ ∇ 𝜕 𝝏 ΓΞ αβ 1 𝚪𝚵 𝜶𝜷"
```