An extension of pyparsing that makes it easy to build your own languages. ✌️
- PEG
- Regular Expressions
- Parser
- Formal Grammar
- Operational Semantics
1. mixedExpression
2. build languages (see example)
pyparsing
pip install pyparsing_ext
- core: basic token classes
- actions: classes for parsing actions
- expressions: complicated expressions
- utils: some useful tools
Classes::
Tokens:
Wordx: a more powerful Word
CharsNot: a more powerful CharsNotIn
PrecededBy: like FollowedBy (moved to pyparsing)
MeanWhile:
LinenStart:
Actions:
BaseAction: Base Class of Actions
BifixAction: action for bifix operators such as <x,y>
...
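As a rough illustration of how an action gives meaning to parsed tokens, here is a minimal sketch that uses only the standard library, not pyparsing_ext itself. The classes Number, Neg and Bifix and the dict-based calculator are hypothetical stand-ins invented for this sketch; the real BifixAction may work differently. It models the bifix expressions |-1| (absolute value) and [3.3_] (floor) used in the examples further down.

```python
from decimal import Decimal
import math


class Number:
    """Hypothetical leaf action: wraps a numeric literal."""
    def __init__(self, text):
        self.value = Decimal(text)

    def eval(self, calculator):
        return self.value


class Neg:
    """Hypothetical action for unary minus."""
    def __init__(self, operand):
        self.operand = operand

    def eval(self, calculator):
        return -self.operand.eval(calculator)


class Bifix:
    """Hypothetical action for an operand enclosed by a left and a right symbol."""
    def __init__(self, left, operand, right):
        self.key = (left, right)
        self.operand = operand

    def eval(self, calculator):
        # Look up the meaning of the symbol pair, then apply it to the operand.
        func = calculator[self.key]
        return func(self.operand.eval(calculator))


# Assumption: the "calculator" is just a dict from symbol pairs to callables.
calculator = {
    ('|', '|'): abs,
    ('[', '_]'): math.floor,
}

abs_expr = Bifix('|', Neg(Number('1')), '|')   # models |-1|
floor_expr = Bifix('[', Number('3.3'), '_]')   # models [3.3_]
print(abs_expr.eval(calculator))    # 1
print(floor_expr.eval(calculator))  # 3
```

The point of the design is that parsing only builds the objects; the semantics live in eval and in whatever the calculator maps each symbol pair to.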
How to define an 'Action' class, which is a wrapper around ParseResults:
# inherit BaseAction directly
class VarOpAction(BaseAction):
    # for operators with variables
    pass
# inherit a subclass of BaseAction
class IndexOpAction(VarOpAction):
    # x[start:stop]
    names = ('slice', 'index')   # register the names of tokens
    def __init__(self, instring='', loc=0, tokens=[]):
        # add names or do further processing of the tokens
        super(IndexOpAction, self).__init__(instring, loc, tokens)
        if 'slice' in self:
            slc = tokens.slice
            self.start = slc.start if 'start' in slc else None
            self.stop = slc.stop if 'stop' in slc else None
            self.step = slc.step if 'step' in slc else None
        else:
            self.index = tokens.index
    def eval(self, calculator):
        # eval defines the semantics of the token
        if 'slice' in self:
            # omitted bounds are stored as None and must not be evaluated
            start, stop, step = (
                part.eval(calculator) if part is not None else None
                for part in (self.start, self.stop, self.step))
            return slice(start, stop, step)
        else:
            return self.index.eval(calculator)

Functions::
keyRange(s)
ordRange(s)
chrRange(s)
CJK # for matching Chinese, Japanese, and Korean characters
enumeratedItems
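delimitedMatrix # delimitedList with two separators

import pyparsing as pp
# Wordx, delimitedMatrix and the other names used below are provided by pyparsing_ext
w = Wordx(lambda x: x in {'a', 'b', 'c', 'd'}) # == Word('abcd')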
M = delimitedMatrix(w, ch1=' ', ch2=pp.Regex('\n+').leaveWhitespace())
p = M.parseString('a b\n c d')
print(p.asList())
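The idea of a list with two separators can be shown without pyparsing. The sketch below is not how delimitedMatrix is implemented and makes no claim about its exact output; split_matrix is a hypothetical helper that only shows the intended shape: one separator splits the text into rows, the other splits each row into items.

```python
import re


def split_matrix(text, item_sep=' ', row_sep=r'\n+'):
    """Split text into rows on row_sep (a regex), then each row into items on item_sep."""
    rows = re.split(row_sep, text.strip())
    # Drop empty strings caused by leading or repeated item separators.
    return [[item for item in row.strip().split(item_sep) if item] for row in rows]


matrix = split_matrix('a b\n c d')
print(matrix)  # [['a', 'b'], ['c', 'd']]
```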
s = '''
[1]hello, world
[2]hello, kitty
'''
print(enumeratedItems().parseString(s))
cjk = ordRange(0x4E00, 0x9FD5)
cjk.parseString('我爱你, I love you') # => ['我爱你']
cjk = ordRanges((0x4E00, 0x9FD5, 0, 256))
cjk.parseString('我爱你 I love you') # => ['我爱你 I love you']
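To make the code-point range idea concrete, here is a small sketch using the standard re module instead of pyparsing_ext. The helper ord_ranges_pattern is hypothetical, and the second range (printable ASCII, 0x20 to 0x7E) is chosen for this sketch rather than taken from the example above.

```python
import re


def ord_ranges_pattern(*ranges):
    """Compile a regex matching one or more characters whose code points
    fall in any of the inclusive (low, high) ranges."""
    body = ''.join('\\U%08x-\\U%08x' % (low, high) for low, high in ranges)
    return re.compile('[' + body + ']+')


# One range: CJK unified ideographs only, so matching stops at the comma.
cjk = ord_ranges_pattern((0x4E00, 0x9FD5))
print(cjk.match('我爱你, I love you').group())  # 我爱你

# Two ranges: CJK plus printable ASCII, so the whole string matches.
cjk_or_ascii = ord_ranges_pattern((0x4E00, 0x9FD5), (0x20, 0x7E))
print(cjk_or_ascii.match('我爱你 I love you').group())  # 我爱你 I love you
```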
import pyparsing as pp
integer = pp.pyparsing_common.signed_integer
varname = pp.pyparsing_common.identifier
arithOplist = [('-', 1, pp.opAssoc.RIGHT),
    (pp.oneOf('* /'), 2, pp.opAssoc.LEFT),
    (pp.oneOf('+ -'), 2, pp.opAssoc.LEFT)]
def func(EXP):
    # varname (defined above) is used for function names such as f(...)
    return (pp.Group('<' + EXP + pp.Suppress(',') + EXP + '>')
            | pp.Group('||' + EXP + '||')
            | pp.Group('|' + EXP + '|')
            | pp.Group(varname + '(' + pp.delimitedList(EXP) + ')'))
baseExpr = integer | varname
EXP = mixedExpression(baseExpr, func=func, opList=arithOplist)
a = EXP.parseString('5*g(|-3|)+<4,5> + f(6)')
print(a)
# [[[5, '*', ['g', '(', ['|', ['-', 3], '|'], ')']], '+', ['<', 4, 5, '>'], '+', ['f', '(', 6, ')']]]

Run example1.py for a simple example.
output:
Example 1:
|-1| -> ('|', '|')(-(1))
Example 2:
parse source code:
 
x=|-1|;  # absolute value
y=x*2+1;
if x == 1
{z=[3.3_]; # the floor value
}
print "z =", z;
 
result:
z = 3 
see the dictionary of variables:
{'x': Decimal('1'), 'y': Decimal('3'), 'z': 3}

In example2.py, we create a programming language, "Small Python".
Run example2.py for a more complex example that parses the text file test.spy.
example2.smallpy.cmdline()  # in mode of command line
The following method in the base class of actions may lead to errors! Just delete it in the latest version.
    # def __getitem__(self, key):
    #     if isinstance(key, int):
    #         return self.tokens[key]
    #     else:
    #         return getattr(self.tokens, key)
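The note above does not say which error this method causes. One possible source of surprises, shown here with hypothetical stand-in classes (Tokens and Action are invented for this sketch and do not use pyparsing): a class that defines __getitem__ without __iter__ or __contains__ is treated by Python as a sequence, so iteration and the in operator go through the integer branch, and a missing name raises AttributeError instead of KeyError.

```python
class Tokens:
    """Hypothetical stand-in for parsed tokens: positional items plus named fields."""
    def __init__(self, items, **named):
        self.items = list(items)
        self.__dict__.update(named)

    def __getitem__(self, position):
        return self.items[position]


class Action:
    """Hypothetical action class carrying the __getitem__ shown above."""
    def __init__(self, tokens):
        self.tokens = tokens

    def __getitem__(self, key):
        if isinstance(key, int):
            return self.tokens[key]
        else:
            return getattr(self.tokens, key)


action = Action(Tokens([1, 2], index=7))
print(action['index'])    # 7
print(list(action))       # [1, 2]  (iteration falls back to __getitem__(0), (1), ...)
print('index' in action)  # False   (membership scans the items, not the names)
try:
    action['missing']
except AttributeError as err:
    error_type = type(err).__name__
print(error_type)         # AttributeError
```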