feat: add Canada SIN + international patterns (v1.5.0) #26
FelipeMorandini merged 1 commit into main
Add Canadian SIN (XXX-XXX-XXX with Luhn validation, first digit 1-7/9), E.164 international phone numbers (+ country code + 8-15 digits), and SWIFT/BIC codes (8 or 11 chars with ISO 3166-1 country validation). E.164 placed after country-specific phone patterns to avoid overlap. 24 built-in patterns total. 559 tests including 28 new.
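The SIN check described above (nine digits, first digit 1-7 or 9, Luhn checksum) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `luhn_ok` and `sin_validate` are hypothetical, and the real validator in `src/hushlog/_patterns.py` may differ.

```python
import re


def luhn_ok(digits: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    # Walk right to left; double every second digit and fold values above 9.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def sin_validate(text: str) -> bool:
    """Hypothetical SIN validator: 9 digits, first digit 1-7 or 9, Luhn-valid."""
    digits = re.sub(r"\D", "", text)  # drop dashes and spaces
    if len(digits) != 9:
        return False
    if digits[0] not in "12345679":  # 0 and 8 are rejected, per the PR description
        return False
    return luhn_ok(digits)
```

The first-digit check runs before the checksum, so a Luhn-valid number that starts with 0 or 8 is still rejected.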
Pull request overview
Adds new built-in redaction patterns/validators (Canada SIN, E.164 international phones, SWIFT/BIC) and bumps the library version to v1.5.0, updating tests and docs accordingly.
Changes:
- Add SIN/E.164/SWIFT built-in `PatternEntry`s with validators and partial maskers, and wire them into built-in pattern ordering.
- Expand unit tests to cover the new validators/pattern behavior and update built-in pattern count/order assertions.
- Bump package version to `1.5.0` and update README/ROADMAP documentation.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/hushlog/_patterns.py` | Introduces SIN/E.164/SWIFT patterns, validators, partial maskers, and updates built-in ordering. |
| `tests/test_patterns.py` | Adds validator + pattern tests for SIN/E.164/SWIFT and updates built-in count/order expectations. |
| `tests/test_registry.py` | Updates registry size assertions to reflect 24 built-ins. |
| `src/hushlog/__init__.py` | Bumps `__version__` to 1.5.0. |
| `tests/integration/test_logging_pipeline.py` | Updates version assertion to 1.5.0. |
| `pyproject.toml` | Bumps project version to 1.5.0. |
| `uv.lock` | Updates locked package version to 1.5.0. |
| `README.md` | Documents new patterns and updates phone coverage statement. |
| `ROADMAP.md` | Marks SIN/E.164/SWIFT items as completed and clarifies passport deferral. |
    # Show + and first 2 digits (country code approximation)
    cc = "".join(digits[:2])
    return f"+{cc} {mc * 4} {mc * 4} {last4}"
_partial_mask_e164() always exposes the first 2 digits after + as the “country code”. For 1-digit (e.g. +1…) or 3-digit country codes, this leaks a subscriber digit or mis-identifies the country code, which defeats the intent of only preserving the country code. Consider capturing the country-code part in _E164_PHONE_RE (e.g., a named group) and using that exact group in the partial masker, or otherwise deriving the country code from the matched prefix rather than digits[:2].
Suggested change:

    - # Show + and first 2 digits (country code approximation)
    - cc = "".join(digits[:2])
    - return f"+{cc} {mc * 4} {mc * 4} {last4}"
    + # Mask all digits except the last 4 to avoid leaking country/subscriber digits
    + return f"+{mc * 2} {mc * 4} {mc * 4} {last4}"
Valid. The partial mask approximates country code as first 2 digits. For +1 (US/Canada) this shows +1X which includes the first subscriber digit. Acceptable trade-off for simplicity — the exact country code length varies (1-3 digits) and detecting it properly requires a country code table.
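For comparison, a table-based approach to the country code could look roughly like the sketch below. Everything in it is an assumption for illustration: `CALLING_CODES` is a tiny, incomplete subset of calling codes, and `split_country_code` and `partial_mask_e164` are hypothetical names rather than the functions in this PR. When the prefix is not in the table, the sketch falls back to masking everything except the last 4 digits, as in the suggestion above.

```python
import re

# Illustrative subset only; a real implementation would need the full
# table of international calling codes (1 to 3 digits each).
CALLING_CODES = frozenset({"1", "7", "33", "44", "49", "55", "351", "353"})


def split_country_code(digits: str):
    """Return (country_code, rest) if a known prefix matches, else None."""
    # Calling codes are prefix-free, so the first hit is unambiguous.
    for n in (1, 2, 3):
        if digits[:n] in CALLING_CODES:
            return digits[:n], digits[n:]
    return None


def partial_mask_e164(text: str, mask_char: str = "*") -> str:
    """Hypothetical masker: keep a known country code and the last 4 digits."""
    digits = re.sub(r"\D", "", text)
    last4 = digits[-4:]
    split = split_country_code(digits)
    if split is None:
        # Unknown prefix: expose nothing but the last 4 digits.
        return f"+{mask_char * 2} {mask_char * 4} {mask_char * 4} {last4}"
    cc, _ = split
    return f"+{cc} {mask_char * 4} {mask_char * 4} {last4}"
```

The cost is the one noted in the reply: the table has to be maintained, which is the complexity the 2-digit approximation avoids.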
    def _swift_validate(text: str) -> bool:
        """Validate a SWIFT/BIC code: chars 5-6 must be a valid ISO 3166-1 country code."""
_SWIFT_COUNTRY_CODES includes "XK" while the surrounding comments/docstring state ISO 3166-1 alpha-2 validation. XK is commonly used for Kosovo but is not an official ISO 3166-1 code; either remove it to be strictly ISO-compliant or update the documentation to explicitly allow this non-ISO exception.
Suggested change:

    - """Validate a SWIFT/BIC code: chars 5-6 must be a valid ISO 3166-1 country code."""
    + """Validate a SWIFT/BIC code: chars 5-6 must be a valid ISO 3166-1 alpha-2 country code (plus 'XK' for Kosovo)."""
Correct — XK (Kosovo) is a user-assigned code, not officially ISO 3166-1. However, it's widely used in practice (SWIFT, EU, UN) for Kosovo operations. Including it is the pragmatic choice for a PII detection library.
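One way to make the XK exception explicit in code is to keep it in a separate set from the ISO codes, as in the sketch below. This is only an illustration under stated assumptions: `ISO_COUNTRY_CODES` is a small subset standing in for the full ISO 3166-1 alpha-2 list, the regex assumes a letters-only 4-character bank prefix, and `swift_validate` is a hypothetical name, not the `_swift_validate` from this PR.

```python
import re

# Assumed shape: 4-letter bank prefix, 2-letter country, 2-char location,
# optional 3-char branch. The PR's actual regex may differ.
SWIFT_RE = re.compile(r"[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?")

# Illustrative subset; the real set would hold every ISO 3166-1 alpha-2 code.
ISO_COUNTRY_CODES = frozenset({"BR", "CA", "DE", "FR", "GB", "US"})

# User-assigned codes accepted on purpose, kept apart from the ISO set.
NON_ISO_EXCEPTIONS = frozenset({"XK"})  # Kosovo


def swift_validate(text: str) -> bool:
    """Hypothetical validator: 8 or 11 chars, chars 5-6 a known country code."""
    code = text.strip().upper()
    if len(code) not in (8, 11):
        return False
    if SWIFT_RE.fullmatch(code) is None:
        return False
    country = code[4:6]
    return country in ISO_COUNTRY_CODES or country in NON_ISO_EXCEPTIONS
```

Keeping the exception in its own named set documents the non-ISO allowance where it is applied, so the docstring and the data cannot drift apart.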
Summary
- Canadian SIN (`XXX-XXX-XXX`): Luhn validated, first digit 1-7 or 9
- E.164 international phone (`+CC XXXXXXXXX`): international format, 8-15 digits, validator for length
- SWIFT/BIC (`AAAABBCCXXX`): 8 or 11 chars, ISO 3166-1 country code validation

Test plan