A tiny React + UIkit website written in TypeScript. It uses parsimmon to parse a regular expression, refa to build a finite automaton from that expression, and viz.js to display the automaton.
The site is hosted on GitHub Pages, and built from source using a GitHub Workflow.
This project was created as part of a master's course in computer science at UIBK.
An entered regular expression can be tested against a string. The following syntax is supported (in order of precedence):
<regex> ::= <union> | <intersection> | <concatenation> | <repetition> | <negation> | <group> | <text>
<union> ::= <regex> "|" <regex>
<intersection> ::= <regex> "&" <regex>
<concatenation> ::= <regex> <regex>
<repetition> ::= <regex> ("*" | "+" | "?" | "{" (NUMBER | [NUMBER] "," [NUMBER]) "}")
<negation> ::= "!" <regex>
<group> ::= "(" <regex> ")"
<text> ::= (<range> | <class> | <char> | ".")+
<range> ::= "[" ["^"] <char> "-" <char> "]"
<class> ::= "\" ("d" | "D" | "w" | "W" | "s" | "S")
<char> ::= NON_META | "\" META
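To illustrate the <repetition> rule, here is a minimal sketch of how its suffix could be turned into numeric bounds. It is not the site's parsimmon-based parser: the names Bounds and parseQuantifier are hypothetical, and it assumes that an omitted lower bound means 0, that an omitted upper bound means "unbounded", and that a lower bound greater than the upper bound is rejected.

```typescript
// Hypothetical result type: the bounds a <repetition> suffix stands for.
interface Bounds {
  min: number;
  max: number; // Infinity means "no upper bound" (assumption)
}

// Parses one repetition suffix such as "*", "{3}" or "{2,5}".
// Returns null if the suffix does not match the grammar.
function parseQuantifier(suffix: string): Bounds | null {
  switch (suffix) {
    case "*":
      return { min: 0, max: Infinity };
    case "+":
      return { min: 1, max: Infinity };
    case "?":
      return { min: 0, max: 1 };
  }
  // "{" NUMBER "}"
  const exact = /^\{(\d+)\}$/.exec(suffix);
  if (exact) {
    const n = Number(exact[1]);
    return { min: n, max: n };
  }
  // "{" [NUMBER] "," [NUMBER] "}"
  const range = /^\{(\d*),(\d*)\}$/.exec(suffix);
  if (range) {
    // Assumption: a missing lower bound is 0, a missing upper bound is unbounded.
    const min = range[1] === "" ? 0 : Number(range[1]);
    const max = range[2] === "" ? Infinity : Number(range[2]);
    // Assumption: bounds in the wrong order are treated as invalid.
    return min <= max ? { min, max } : null;
  }
  return null;
}
```

Under these assumptions, parseQuantifier("{2,}") yields a minimum of 2 with no upper bound, and parseQuantifier("{2") yields null.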
NON_META and META characters depend on the context:
- If used in a <range>, the following META characters need to be escaped: \, ^, - and ]
- If used anywhere else, the following META characters need to be escaped: \, (, ), {, }, [, ], |, &, *, +, ?, ! and .
All other characters are considered NON_META.
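The escaping rules above can be sketched as a small helper that makes arbitrary text safe to use as a literal. This is only an illustration: escapeLiteral and the two constants are hypothetical names, not part of the site's code, and the character sets are copied from the two lists above.

```typescript
// META characters per context, as listed above.
const RANGE_META = "\\^-]";
const DEFAULT_META = "\\(){}[]|&*+?!.";

// Hypothetical helper: prefixes every META character in `text` with a
// backslash so that the whole string is read literally.
// Pass inRange = true when the text is placed inside a <range>.
function escapeLiteral(text: string, inRange: boolean = false): string {
  const meta = inRange ? RANGE_META : DEFAULT_META;
  let result = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    result += meta.indexOf(ch) !== -1 ? "\\" + ch : ch;
  }
  return result;
}
```

For example, the hyphen in a-z is left alone outside a range but escaped when inRange is true, while | is escaped only outside a range.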
There are some small but important differences in the way regular expressions are validated:
- The regular expression a|b| is not considered valid by this parser, instead use (a|b)?. Same holds for a||b or a~b~.
- Regular expressions like [ab^c] or ab[c, where ^ and [ are treated like ordinary characters, are considered invalid. Instead use the escaped form, i.e. [ab\^c] or ab\[c.
The following characters can always be entered in escaped form:
- horizontal tab: \t
- carriage return: \r
- linefeed: \n
- vertical tab: \v
- form-feed: \f
- backspace: \b
- NUL character: \0
- arbitrary Unicode character: \uXXXX, where XXXX is a char code from 0000 to FFFF
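As a rough sketch of how the escape sequences listed above map to characters, the hypothetical helper below decodes a single sequence. It is not taken from the site's parser. Accepting lowercase hex digits in \uXXXX is an assumption, and sequences outside the list yield null.

```typescript
// Hypothetical helper: decodes one escape sequence from the list above,
// e.g. a backslash followed by "n", or "\u" followed by four hex digits.
// Returns null for anything else.
function decodeEscape(sequence: string): string | null {
  // \uXXXX: four hex digits (lowercase digits accepted as an assumption).
  if (/^\\u[0-9a-fA-F]{4}$/.test(sequence)) {
    return String.fromCharCode(parseInt(sequence.slice(2), 16));
  }
  // All remaining sequences are a backslash plus exactly one character.
  if (sequence.length !== 2 || sequence.charAt(0) !== "\\") {
    return null;
  }
  switch (sequence.charAt(1)) {
    case "t":
      return "\t"; // horizontal tab
    case "r":
      return "\r"; // carriage return
    case "n":
      return "\n"; // linefeed
    case "v":
      return "\v"; // vertical tab
    case "f":
      return "\f"; // form-feed
    case "b":
      return "\b"; // backspace
    case "0":
      return "\0"; // NUL character
    default:
      return null;
  }
}
```

Here the sequence \u0041 decodes to the letter A, since 0x41 is its char code.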
For a description of the available character classes, have a look at the MDN documentation.