Abnormal expressions (abnex) is an alternative to regular expressions (regex). This is a Python library but the abnex syntax could be ported to other languages.
- Regex:
  ([\w\._-]+)@([\w\.]+)
- Abnex:
  {[w"._-"]1++}"@"{[w"."]1++}
- Abnex (spaced):
  {[w "._-"]1++} "@" {[w "."]1++}
- Abnex (expanded):
  {
    [w "._-"]1++
  }
  "@"
  {
    [w "."]1++
  }

A more advanced pattern:
{{{[a-z '_']1++} {[a-z 0-9 '_-.']0++}} '@' {{[a-z 0-9]1++} '.' {[a-z 0-9]1++} {[a-z 0-9 '-_.']0++}} {[a-z 0-9]1++}}

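As a reference point for the email example above, here is how the regex form behaves with Python's standard `re` module. This is a minimal sketch that uses only `re`; the sample address is made up for illustration, and no abnex call is shown because the source does not give abnex function signatures.

```python
import re

# The regex from the email example: two capturing groups around "@".
EMAIL = re.compile(r"([\w\._-]+)@([\w\.]+)")

# Hypothetical sample input, not taken from the abnex documentation.
m = EMAIL.match("jane.doe@example.com")
local_part, domain = m.groups()

print(local_part)  # jane.doe
print(domain)      # example.com
```

The two `(...)` groups here correspond to the two `{...}` groups in the abnex versions.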
- It's easier to read, write and understand.
- You can use spaces inside the expression. You can also "expand" it, i.e. write it over multiple lines and use indentation.
- You don't have to use backslashes all the time.
- More logical/common symbols, like ! for not, {} for groups, and 1++, 0++, 0+ for one or more, zero or more, zero or one.
- It's easier to see whether a symbol is a literal character you are searching for or a regex character, e.g.:
  - Regex:
    [\w-]+@[\w-_]+
  - Abnex:
    [w "-"]1++ "@" [w "-"]1++

Regex syntax and its abnex equivalent, written as regex -> abnex:
- Start of string, or start of line in multi-line pattern
  ^ -> ->
- End of string, or end of line in multi-line pattern
  $ -> <-
- Start of string
  \A -> s>
- End of string
  \Z -> <s
- Word boundary
  \b -> :
- Not word boundary
  \B -> !:
- Start of word
  \< -> w>
- End of word
  \> -> <w
- Control character
  \c -> c
- White space
  \s -> _
- Not white space
  \S -> !_
- Digit
  \d -> d
- Not digit
  \D -> !d
- Word
  \w -> w
- Not word
  \W -> !w
- Hexadecimal digit
  \x -> x
- Octal digit
  \o -> o
- 0 or more
  * -> 0++
- 1 or more
  + -> 1++
- 0 or 1
  ? -> 0+
- Any character except new line (\n)
  . -> *
- a or b
  a|b -> "a"|"b"
- Group
  (...) -> {...}
- Passive (non-capturing) group
  (?:...) -> {#...}
- Set (a or b or c)
  [abc] -> ['abc'] or ["a" "b" "c"]
- Not in set
  [^...] -> [!...]
- Lower case letter from a to q
  [a-q] -> [a-q]
- Upper case letter from A to Q
  [A-Q] -> [A-Q]
- Digit from 0 to 7
  [0-7] -> [0-7]
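
To make the table concrete, here is a toy translator for a small subset of it. It is a sketch under stated assumptions, not abnex's actual parser: the `translate` helper and its two lookup tables are hypothetical names, tokens must be separated by spaces, quoted literals may not contain spaces, and groups and character sets are not handled.

```python
import re

# A few rows from the table above: abnex token -> regex equivalent.
CLASSES = {
    "w": r"\w", "!w": r"\W",
    "d": r"\d", "!d": r"\D",
    "_": r"\s", "!_": r"\S",
    "*": ".",
}
QUANTIFIERS = {"1++": "+", "0++": "*", "0+": "?"}


def translate(abnex: str) -> str:
    """Translate a space-separated subset of abnex into a regex string.

    Hypothetical helper for illustration only; not part of the abnex library.
    """
    parts = []
    for token in abnex.split():
        # Peel off a trailing quantifier, if the token has one.
        quantifier = ""
        for suffix, regex_quantifier in QUANTIFIERS.items():
            if token.endswith(suffix):
                token = token[: -len(suffix)]
                quantifier = regex_quantifier
                break
        if len(token) >= 2 and token[0] == token[-1] == '"':
            # Quoted text is literal: escape it, and group it so that a
            # quantifier applies to the whole literal.
            body = "(?:" + re.escape(token[1:-1]) + ")"
        elif token in CLASSES:
            body = CLASSES[token]
        else:
            raise ValueError(f"unsupported token: {token!r}")
        parts.append(body + quantifier)
    return "".join(parts)


pattern = translate('w1++ "@" w1++ "." w1++')
print(bool(re.fullmatch(pattern, "jane@example.com")))  # True
```

Quoting literals is what removes the need for backslashes: anything inside quotes is escaped for you, and anything outside quotes is syntax.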
What is the recommended way to write abnexes?
- Use spaces between characters in character sets:
  - Correct:
    [w "_-"]
  - Incorrect:
    [w"_-"]
- Put multiple exact characters between the same quotes in character sets:
  - Correct:
    ["abc"]
  - Incorrect:
    ["a" "b" "c"], especially incorrect: ["a""b""c"]
- Put spaces between groups:
  - Correct:
    {w} "." {w}
  - Incorrect:
    {w}"."{w}
Match for an email address:
- Regex:
[\w\-\._]+@[\w\-\.]+
- Abnex (following standards):
{[w "-._"]1++} "@" {[w "-."]1++}
- Abnex (not following standards):
{[w"-._"]1++}"@"{[w"-."]1++}
Abnex has most functions from the re library, but it also has some extra functionality, such as last() and contains().
- match() -> match()
- findall() -> all()
- split() -> split()
- sub() -> replace()
- subn() -> replace_count()
- search() -> first()

- holds(): whether or not a string matches an expression (bool).
- contains(): whether or not a string contains a match (bool).
- last(): the last match in a string.
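
The extra functions can be approximated with Python's `re` module. The sketch below is one possible reading of the descriptions above, not abnex's implementation: the functions take regex patterns rather than abnex, the argument order is an assumption, `holds()` is assumed to behave like `re.match`, and `last()` is assumed to return the matched text.

```python
import re
from typing import Optional


def holds(pattern: str, string: str) -> bool:
    # Assumption: "matches" means a match at the start of the string (re.match).
    return re.match(pattern, string) is not None


def contains(pattern: str, string: str) -> bool:
    # True if the pattern matches anywhere in the string.
    return re.search(pattern, string) is not None


def last(pattern: str, string: str) -> Optional[str]:
    # Walk all non-overlapping matches and keep the final one.
    found = None
    for found in re.finditer(pattern, string):
        pass
    return found.group(0) if found is not None else None


print(holds(r"\d+", "42 apples"))     # True
print(contains(r"\d+", "apples 42"))  # True
print(last(r"\d+", "1 22 333"))       # 333
```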