A JavaScript/TypeScript library for Hebrew text processing and manipulation. This library provides utilities for working with Hebrew diacritics (niqqud and cantillation marks / te'amim) and converting between Hebrew and Paleo-Hebrew scripts.
- 🔍 Diacritic Identification / Removal: Identify or strip diacritics from Hebrew text.
- 🔄 Script Conversion: Convert Hebrew text to Paleo-Hebrew script
- 📘 TypeScript Support: Full TypeScript definitions included
- ⚡ Zero Dependencies: Lightweight with no external dependencies
- 📦 Multiple Module Formats: Supports CommonJS, ES modules, and UMD
npm install @yehuthi/hebrewOr using pnpm:
pnpm add @yehuthi/hebrewimport * as hebrew from '@yehuthi/hebrew';The library can remove different types of diacritics from Hebrew text:
// Sample Hebrew text with full diacritics
const hebrewText = "בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃";
// Remove all diacritics (niqqud and cantillation)
const stripped = hebrew.remove_diacritics(hebrewText);
// Result: "בראשית ברא אלהים את השמים ואת הארץ׃"
// Remove only niqqud (vowel diacritics)
const withoutNiqqud = hebrew.remove_diacritics(hebrewText, hebrew.char_niqqud);
// Result: "בראש֖ית בר֣א אלה֑ים א֥ת השמ֖ים וא֥ת הארץ׃"
// Remove only cantillation marks
const withoutCantillation = hebrew.remove_diacritics(hebrewText, hebrew.char_taam);
// Result: "בְּרֵאשִׁית בָּרָא אֱלֹהִים אֵת הַשָּׁמַיִם וְאֵת הָאָֽרֶץ׃"Convert modern Hebrew text to Paleo-Hebrew script:
const modernHebrew = "בראשית ברא אלהים את השמים ואת הארץ";
const paleoHebrew = hebrew.to_paleo(modernHebrew);
// Result: "𐤁𐤓𐤀𐤔𐤉𐤕 𐤁𐤓𐤀 𐤀𐤋𐤄𐤉𐤌 𐤀𐤕 𐤄𐤔𐤌𐤉𐤌 𐤅𐤀𐤕 𐤄𐤀𐤓𐤑"Check individual characters for diacritics:
import * as hebrew from '@yehuthi/hebrew';
// Check if a character is a niqqud
hebrew.char_niqqud('ָ'); // true
hebrew.char_niqqud('א'); // false
// Check if a character is a cantillation mark
hebrew.char_taam('֖'); // true
hebrew.char_taam('א'); // false
// Check if a character is any type of diacritic
hebrew.char_diacritic('ָ'); // true
hebrew.char_diacritic('֖'); // true
hebrew.char_diacritic('א'); // falseRemoves diacritics from Hebrew text.
- Parameters:
input: The Hebrew text to processtype: Optional function to determine which diacritics to remove (defaults to removing all diacritics)
- Returns: String with specified diacritics removed
Converts Hebrew text to Paleo-Hebrew script.
- Parameters:
str: Hebrew text to convert
- Returns: Text converted to Paleo-Hebrew script
Checks if a character is a niqqud (vowel diacritic).
Checks if a character is a cantillation mark (ta'am).
Checks if a character is any type of diacritic (niqqud or cantillation).
Checks if a character code represents a niqqud.
Checks if a character code represents a cantillation mark.
Checks if a character code represents any type of diacritic.
Converts a single Hebrew character to Paleo-Hebrew.
Converts a single Hebrew character to Paleo-Hebrew (unchecked version).
Converts a Hebrew character code to Paleo-Hebrew character code.
Converts a Hebrew character code to Paleo-Hebrew character code (unchecked version).
import * as hebrew from '@yehuthi/hebrew';
// Genesis 1:1 with full diacritics
const genesis1_1 = "בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃";
// Remove all diacritics for plain text
const plainText = hebrew.remove_diacritics(genesis1_1);
console.log(plainText); // "בראשית ברא אלהים את השמים ואת הארץ׃"
// Convert to Paleo-Hebrew
const paleoText = hebrew.to_paleo(plainText);
console.log(paleoText); // "𐤁𐤓𐤀𐤔𐤉𐤕 𐤁𐤓𐤀 𐤀𐤋𐤄𐤉𐤌 𐤀𐤕 𐤄𐤔𐤌𐤉𐤌 𐤅𐤀𐤕 𐤄𐤀𐤓𐤑׃"import * as hebrew from '@yehuthi/hebrew';
function analyzeHebrewText(text) {
const chars = [...text];
const niqqudCount = chars.filter(hebrew.char_niqqud).length;
const taamCount = chars.filter(hebrew.char_taam).length;
const totalDiacritics = chars.filter(hebrew.char_diacritic).length;
return {
totalCharacters: chars.length,
niqqudCount,
taamCount,
totalDiacritics,
plainTextRatio: (chars.length - totalDiacritics) / chars.length
};
}
const analysis = analyzeHebrewText("בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים");
console.log(analysis);pnpm buildpnpm testsrc/index.ts- Main library codesrc/index.test.ts- Test suitedist/- Built distribution files (CommonJS, ES modules, UMD)
MIT License - see LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
Hebrew, alphabet, script, letters, phoenician, niqqud, taamim, diacritics, paleo-hebrew, text processing