Commit 78b998e

Vendor wcwidth using python's unicodedata
This commit replaces the wcwidth dependency with a small vendored module that relies on Python's built-in unicodedata.

Notes:

1. The `wcwidth()` function provided by the wcwidth library is already decorated with `lru_cache(100)`. The following line therefore wraps an lru-cached function in a second, redundant lru-cache layer, which may cause significant overhead:

       wcwidth: Callable[[str], int] = lru_cache(maxsize=4096)(_wcwidth)

2. Performance of the vendored `wcwidth()` function is more or less equal to that of the `wcwidth` package.
3. This change turns pyte into a self-contained library.
4. The only possible downside is that the supported Unicode version is bound to that of the Python interpreter in use. That is probably rather minor, as the interpreter wouldn't be able to decode more recent Unicode chars anyway.

Benchmarks:

    >>> from timeit import timeit
    >>> from wcwidth import wcswidth as wcswidth1
    >>> from pyte.wcwidth import wcswidth as wcswidth2
    >>> s = "开源的计算机代数系统 Maxima 是用于操纵符号和数值表达式的系统"
    >>> timeit(lambda: wcswidth1(s))
    7.851543699999999
    >>> timeit(lambda: wcswidth2(s))
    3.857342599999999

Credits: The implementation is borrowed from pytest and slightly tweaked.
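The redundant caching described in note 1 can be illustrated with a minimal sketch. `inner_width` is a hypothetical stand-in for an already-cached function (it is not the wcwidth library's code), and the cache sizes are simply the ones quoted in the commit message.

```python
from functools import lru_cache

calls = 0  # counts how often the real function body runs


@lru_cache(maxsize=100)
def inner_width(ch: str) -> int:
    """Hypothetical stand-in for an already lru-cached width function."""
    global calls
    calls += 1
    return 1


# Wrapping the cached function again adds a second cache in front of it.
outer_width = lru_cache(maxsize=4096)(inner_width)

for _ in range(3):
    outer_width("a")

# The outer cache answers the repeated lookups; the inner cache was
# consulted once, missed, and now stores the same entry a second time.
print(outer_width.cache_info())
print(inner_width.cache_info())
print(calls)  # 1
```

Every distinct character ends up stored in both caches, and each first-time lookup pays for two cache probes, which is the overhead the commit removes by decorating the vendored function only once.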
1 parent 61d0c0c commit 78b998e

3 files changed: +66 -6 lines changed


pyproject.toml

Lines changed: 0 additions & 3 deletions
@@ -46,9 +46,6 @@ classifiers = [
     "Programming Language :: Python :: 3.13",
     "Topic :: Terminals :: Terminal Emulators/X Terminals",
 ]
-dependencies = [
-    "wcwidth",
-]
 
 [project.urls]
 Homepage = "https://github.com/selectel/pyte"

pyte/screens.py

Lines changed: 1 addition & 3 deletions
@@ -38,17 +38,15 @@
 from typing import Any, Dict, List, NamedTuple, Optional, Set, TextIO, TypeVar
 from collections.abc import Callable, Generator, Sequence
 
-from wcwidth import wcwidth as _wcwidth  # type: ignore[import-untyped]
-
 from . import (
     charsets as cs,
     control as ctrl,
     graphics as g,
     modes as mo
 )
 from .streams import Stream
+from .wcwidth import wcwidth
 
-wcwidth: Callable[[str], int] = lru_cache(maxsize=4096)(_wcwidth)
 
 KT = TypeVar("KT")
 VT = TypeVar("VT")

pyte/wcwidth.py

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+from unicodedata import category, east_asian_width, normalize
+from functools import lru_cache
+
+
+@lru_cache(4096)
+def wcwidth(c: str) -> int:
+    """
+    Determine how many columns are needed to display a character in a terminal.
+
+    :param c:
+        A character to determine required columns for.
+
+    :returns:
+        -1 if the character is not printable.
+        0, 1 or 2 for other characters.
+    """
+    o = ord(c)
+
+    # ASCII fast path.
+    if 0x20 <= o < 0x07F:
+        return 1
+
+    # Some Cf/Zp/Zl characters which should be zero-width.
+    if (
+        o == 0x0000
+        or 0x200B <= o <= 0x200F
+        or 0x2028 <= o <= 0x202E
+        or 0x2060 <= o <= 0x2063
+    ):
+        return 0
+
+    cat = category(c)
+
+    # Control characters.
+    if cat == "Cc":
+        return -1
+
+    # Combining characters with zero width.
+    if cat in ("Me", "Mn"):
+        return 0
+
+    # Full/Wide east asian characters.
+    if east_asian_width(c) in ("F", "W"):
+        return 2
+
+    return 1
+
+
+def wcswidth(s: str) -> int:
+    """
+    Determine how many columns are needed to display a string in a terminal.
+
+    :param s:
+        String to determine required columns for.
+
+    :returns:
+        -1 if the string contains non-printable characters.
+    """
+    width = 0
+    for c in normalize("NFC", s):
+        wc = wcwidth(c)
+        if wc < 0:
+            return -1
+        width += wc
+    return width
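
The vendored module decides widths from three `unicodedata` lookups: `category`, `east_asian_width` and `normalize`. The sketch below only illustrates what those standard-library calls return for a few sample characters, which are chosen here for illustration and are not part of the commit. It also prints `unicodedata.unidata_version`, which shows the Unicode version the interpreter is bound to, as mentioned in note 4 of the commit message.

```python
import unicodedata
from unicodedata import category, east_asian_width, normalize

# General category drives the "control" and "combining" branches.
bell_category = category("\x07")         # "Cc": control character
accent_category = category("\u0301")     # "Mn": combining acute accent

# East Asian Width drives the "two columns" branch.
cjk_width_class = east_asian_width("开")  # "W": wide

# NFC normalization composes "e" + combining accent into one code point,
# so the string-level function sees a single character.
decomposed = "e\u0301"
composed = normalize("NFC", decomposed)

print(bell_category, accent_category, cjk_width_class)
print(len(decomposed), len(composed))  # 2 1

# Unicode database version shipped with this interpreter.
print(unicodedata.unidata_version)
```

Characters assigned after the interpreter's Unicode database version fall through to the default branches, so their reported width depends on the Python version in use.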
