A Python library for parsing and generating strings from JSGF (Java Speech Grammar Format) grammars. This modernized version supports Python 3.7+ and includes comprehensive testing.
- Parser: Convert JSGF grammar files into abstract syntax trees
- Deterministic Generator: Generate all possible strings from non-recursive grammars
- Probabilistic Generator: Generate random strings using weights and probabilities
- Modern Python: Full Python 3.7+ support with type hints and proper packaging
- Comprehensive Testing: Full test suite with pytest
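To make "all possible strings" concrete, here is a minimal, self-contained sketch of exhaustive expansion for a non-recursive grammar. It does not use the JSGFTools API: the `GRAMMAR` dictionary and the `expand` function are hypothetical names invented for this illustration, and the library's own implementation may work differently.

```python
from itertools import product

# Hypothetical in-memory grammar (not the JSGFTools data model): each rule
# maps to a list of alternatives, and each alternative is a sequence of
# plain tokens or <rule> references.
GRAMMAR = {
    "<greeting>": [["hello"], ["hi"]],
    "<target>": [["world"], ["there"]],
    "<start>": [["<greeting>", "<target>"]],
}


def expand(symbol):
    """Return every string `symbol` can produce (non-recursive grammars only)."""
    if symbol not in GRAMMAR:
        return [symbol]  # a plain token expands to itself
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each element, then combine the pieces in every possible way.
        parts = [expand(element) for element in alternative]
        results.extend(" ".join(combo) for combo in product(*parts))
    return results


print(expand("<start>"))
# ['hello world', 'hello there', 'hi world', 'hi there']
```

Because every rule reference is expanded in full, a rule that refers to itself would make this kind of enumeration recurse forever, which is why exhaustive generation is limited to non-recursive grammars.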
Install from PyPI:
pip install jsgf-tools
Install from source:
git clone https://github.com/syntactic/JSGFTools.git
cd JSGFTools
pip install -e .
Install with development dependencies:
git clone https://github.com/syntactic/JSGFTools.git
cd JSGFTools
pip install -e ".[dev]"
Generate all possible strings from a non-recursive grammar:
python DeterministicGenerator.py IdeasNonRecursive.gram
Generate 20 random strings from a grammar (supports recursive rules):
python ProbabilisticGenerator.py Ideas.gram 20
Use the modules from Python:
import JSGFParser as parser
import DeterministicGenerator as det_gen
import ProbabilisticGenerator as prob_gen
from io import StringIO
# Parse a grammar
grammar_text = """
public <greeting> = hello | hi;
public <target> = world | there;
public <start> = <greeting> <target>;
"""
with StringIO(grammar_text) as f:
    grammar = parser.getGrammarObject(f)
# Generate all possibilities (deterministic)
det_gen.grammar = grammar
rule = grammar.publicRules[2] # <start> rule
all_strings = det_gen.processRHS(rule.rhs)
print("All possible strings:", all_strings)
# Generate random string (probabilistic)
prob_gen.grammar = grammar
random_string = prob_gen.processRHS(rule.rhs)
print("Random string:", random_string)
JSGFTools supports most of the JSGF specification:
// Comments are supported
public <start> = <greeting> <target>;
// Alternatives with optional weights
<greeting> = /5/ hello | /1/ hi | hey;
// Optional elements
<polite> = [ please ];
// Nonterminal references
<target> = world | there;
// Recursive rules (use with ProbabilisticGenerator only)
<recursive> = base | <recursive> more;
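As a rough illustration of how weights such as `/5/` and `/1/` can translate into selection probabilities, the sketch below samples from the `<greeting>` alternatives using the standard library. It is not the JSGFTools implementation. In particular, treating the unweighted `hey` as weight 1 is an assumption made for this example, and `GREETING` and `pick` are hypothetical names.

```python
import random

# Alternatives from the <greeting> rule, as (weight, token) pairs.
# Assumption: an alternative without an explicit weight counts as weight 1.
GREETING = [(5.0, "hello"), (1.0, "hi"), (1.0, "hey")]


def pick(alternatives, rng):
    """Choose one token with probability proportional to its weight."""
    weights = [weight for weight, _ in alternatives]
    tokens = [token for _, token in alternatives]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so repeated runs draw the same sample
sample = [pick(GREETING, rng) for _ in range(7000)]
counts = {token: sample.count(token) for _, token in GREETING}
print(counts)
```

Under these assumed weights, `hello` should be drawn roughly five times as often as either `hi` or `hey`.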
Supported features:
- Rule definitions and nonterminal references
- Alternatives (|) with optional weights (/weight/)
- Optional elements ([...])
- Grouping with parentheses
- Comments (// and /* */)
- Public and private rules
- Unicode support for 10+ major language scripts
JSGFTools fully supports Unicode characters in both tokens and rule names, covering:
- Latin scripts (English, Spanish, French, etc.)
- CJK (Chinese, Japanese Kanji, Korean Hanja)
- Arabic (Arabic, Persian, Urdu)
- Cyrillic (Russian, Ukrainian, Bulgarian)
- Devanagari (Hindi, Sanskrit, Marathi)
- Hangul (Korean)
- Hebrew
- Greek
- Thai
Example:
public <greeting> = hello | 你好 | こんにちは | مرحبا | привет | שלום;
public <问候> = 您好 | 欢迎;
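The sketch below shows why Unicode tokens and rule names need no special handling in Python 3: `str` operations work on code points, so splitting a rule line behaves the same for every script. This is only an illustration, not the JSGFTools parser. `split_rule` is a hypothetical helper, and it handles only single-line rules made of plain alternatives.

```python
def split_rule(line):
    """Split a simple 'public <name> = a | b;' line into (name, alternatives)."""
    head, _, body = line.partition("=")
    name = head.split()[-1]  # the <rule name> is the last item before '='
    alternatives = [alt.strip() for alt in body.rstrip().rstrip(";").split("|")]
    return name, alternatives


lines = [
    "public <greeting> = hello | 你好 | こんにちは | مرحبا | привет | שלום;",
    "public <问候> = 您好 | 欢迎;",
]
for line in lines:
    name, alternatives = split_rule(line)
    print(name, len(alternatives), alternatives)
```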
Not yet supported:
- Kleene operators (* and +)
- Import statements
- Tags
- DeterministicGenerator: Only use with non-recursive grammars to avoid infinite loops
- ProbabilisticGenerator: Can safely handle recursive grammars through probabilistic termination
Example of recursive rule:
<sentence> = <noun> <verb> | <sentence> and <sentence>;
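The idea of probabilistic termination can be sketched without the library: at every step one alternative is picked at random, so the recursive branch is taken only some of the time and each expansion eventually bottoms out in the base case. The code below is a self-contained illustration, not the JSGFTools implementation. The `<noun>` and `<verb>` expansions, the weights, and the names `GRAMMAR` and `generate` are all hypothetical.

```python
import random

# Toy version of the recursive <sentence> rule. The <noun> and <verb>
# expansions and all weights are hypothetical, chosen only for illustration.
GRAMMAR = {
    "<sentence>": [
        (3.0, ["<noun>", "<verb>"]),                 # base case, favoured
        (1.0, ["<sentence>", "and", "<sentence>"]),  # recursive case
    ],
    "<noun>": [(1.0, ["dogs"]), (1.0, ["cats"])],
    "<verb>": [(1.0, ["sleep"]), (1.0, ["run"])],
}


def generate(symbol, rng):
    """Expand `symbol` by picking one weighted alternative at each step."""
    if symbol not in GRAMMAR:
        return [symbol]  # a plain token expands to itself
    weights = [weight for weight, _ in GRAMMAR[symbol]]
    bodies = [body for _, body in GRAMMAR[symbol]]
    chosen = rng.choices(bodies, weights=weights, k=1)[0]
    words = []
    for element in chosen:
        words.extend(generate(element, rng))
    return words


rng = random.Random(42)  # seeded so repeated runs print the same sentences
for _ in range(3):
    print(" ".join(generate("<sentence>", rng)))
```

With these assumed weights the recursive alternative is chosen a quarter of the time and spawns two sub-sentences, so each `<sentence>` produces 0.5 further sentences on average and expansions stay short. If the recursive branch were weighted much more heavily, a random generator of this kind could still produce very long strings or exceed Python's recursion limit.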
Run the test suite:
pytest test_jsgf_tools.py -v
Run specific test categories:
pytest test_jsgf_tools.py::TestJSGFParser -v # Parser tests
pytest test_jsgf_tools.py::TestIntegration -v # Integration tests
For detailed API documentation, build the Sphinx docs:
cd docs
make html
Then open docs/_build/html/index.html in your browser.
Example grammars:
- Ideas.gram: Recursive grammar example (use with ProbabilisticGenerator)
- IdeasNonRecursive.gram: Non-recursive grammar example (use with DeterministicGenerator)
To contribute:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Run the test suite: pytest
- Submit a pull request
MIT License. See LICENSE file for details.
- 2.1.1: Fixed argparse support in DeterministicGenerator CLI (--help now works)
- 2.1.0: Added comprehensive Unicode support (10+ language scripts), published to PyPI
- 2.0.0: Complete Python 3 modernization, added test suite, improved packaging
- 1.x: Original Python 2.7 version