Regular Expressions
What is a Regular Expression
- recommended site: https://regexr.com
Searching with Regular Expressions
- 4 primary components
- character classes
  - character set | [ABC]
  - negated set | [^ABC]
  - range | [A-Z] or [a-z]
  - word | \w
  - digit | \d
  - single character | . (any character except new line)
  - whitespace | \s
- quantifiers and alternation
  - + | one or more of the previous token
  - * | zero or more of the previous token
  - {min,max} | range
    - like {0,3}
  - ? | optional
  - | | alternation | one token or another
- groups
- anchors
  - specify where the match starts and where it ends
  - ^ | beginning of the string/line (line -- if the multiline option is enabled)
  - $ | end of the string/line (line -- if the multiline option is enabled)
  - in the date example
    - the entire line must match the date
    - if there is other text on the line with the date --> the date will not match
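The components above can be tried out in a few lines of Python. This is a minimal sketch using the standard `re` module, whose flavor may differ slightly from the one selected on regexr; the sample strings and variable names are made up for illustration.

```python
import re

text = "Room 7, Floor 12"

# Character classes: a range, the digit class, and a negated set.
capitals = re.findall(r"[A-Z]", text)        # ['R', 'F']
digits = re.findall(r"\d", text)             # ['7', '1', '2']
others = re.findall(r"[^a-z\s]", text)       # ['R', '7', ',', 'F', '1', '2']

# Quantifiers and alternation.
numbers = re.findall(r"\d+", text)           # ['7', '12']   one or more digits
two_plus = re.findall(r"\d{2,3}", text)      # ['12']        between 2 and 3 digits
spellings = re.findall(r"colou?r", "color colour")  # ['color', 'colour']  optional u
places = re.findall(r"Room|Floor", text)     # ['Room', 'Floor']

# Anchors: ^ and $ only match per line when the multiline option is on.
lines = "42\nabc 42\n42 abc"
whole_string = re.findall(r"^\d+$", lines)                 # []
per_line = re.findall(r"^\d+$", lines, flags=re.MULTILINE)  # ['42']

print(capitals, digits, others)
print(numbers, two_plus, spellings, places)
print(whole_string, per_line)
```

Only the first line of `lines` consists of a number and nothing else, which is why the anchored pattern skips the other two lines.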
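- examples
  - date
    ^[A-Z][a-z]{2,}\s+[0-3]?[1-9],\s+[12]?[0-9]{0,3}$
    - the day prefix is the range [0-3], not [0,3] -- a comma inside a set is a literal comma, so [0,3] matches 0, a comma, or 3
  - number greater-or-equal 42
    - case 1 -- two digits that start with 4 (42-49)
    - case 2 -- double digits (50-99)
    - case 3 -- 3 or more digits
    ^(4[2-9]|[5-9]\d|[1-9]\d{2,})$
    - the group keeps ^ and $ applied to all three cases -- without it, ^ binds only to case 1 and $ only to case 3
  - monetary value -- faulty
    - escape dollar sign with \
    - ,? makes every comma optional, so mixed grouping like 1,000000 still matches
    ^\$?\s*[1-9][0-9]{0,2}(,?[0-9]{3})*(\.[0-9]{2})?$
  - monetary value -- fixed
    - every thousands group now needs its comma
    ^\$?\s*[1-9][0-9]{0,2}((,[0-9]{3})*|)(\.[0-9]{2})?$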
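A quick way to check patterns like these is to run them against valid and invalid inputs. The sketch below is one way to do that in Python. The helper name `at_least_42` and the test strings are hypothetical, and the number pattern is written with its three cases wrapped in a group so that `^` and `$` apply to all of them. The two monetary patterns are copied from the notes as written.

```python
import re

# Number >= 42: the group keeps both anchors attached to every alternative.
AT_LEAST_42 = re.compile(r"^(4[2-9]|[5-9]\d|[1-9]\d{2,})$")

# Monetary patterns, copied verbatim from the notes.
MONEY_FAULTY = re.compile(r"^\$?\s*[1-9][0-9]{0,2}(,?[0-9]{3})*(\.[0-9]{2})?$")
MONEY_FIXED = re.compile(r"^\$?\s*[1-9][0-9]{0,2}((,[0-9]{3})*|)(\.[0-9]{2})?$")


def at_least_42(s: str) -> bool:
    """Return True if s is a whole number of at least 42 (hypothetical helper)."""
    return AT_LEAST_42.match(s) is not None


for candidate in ["41", "42", "99", "100", "7", "041"]:
    print(candidate, at_least_42(candidate))

for amount in ["$1,000.00", "$ 12.50", "1,000000"]:
    print(amount,
          bool(MONEY_FAULTY.match(amount)),
          bool(MONEY_FIXED.match(amount)))
```

The last amount, `1,000000`, is the interesting one: the faulty pattern accepts it because each comma is optional on its own, while the fixed pattern rejects it.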
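- examples | using with grep
# won't work properly -- \d is not recognized by grep -E (POSIX ERE)
grep -E '^(4[2-9]|[5-9]\d|[1-9]\d{2,})$' numbers.txt
# replace with its class (the parentheses keep ^ and $ applied to every alternative)
grep -E '^(4[2-9]|[5-9][0-9]|[1-9][0-9]{2,})$' numbers.txt
# or -- a POSIX class goes inside a bracket expression, hence the double brackets
grep -E '^(4[2-9]|[5-9][[:digit:]]|[1-9][[:digit:]]{2,})$' numbers.txt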
Replacing with Regular Expressions | on regexr
- for the whole match | $&
- for partial matches | capturing groups --> add () around the parts you want to capture and refer to them later
  - var 1 --> $1
  - var 2 --> $2
  - reordering month, day, year into year-month-day --> $3-$1-$2
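The same replacement can be sketched in Python, with one difference in syntax: `re.sub` refers to groups as `\1`, `\2`, `\3` and to the whole match as `\g<0>`, where regexr uses `$1`, `$2`, `$3` and `$&`. The date pattern here uses `[0-3]` for the optional leading day digit and adds three capturing groups; both are choices made for this sketch, and the sample date is made up.

```python
import re

# Groups: 1 = month name, 2 = day, 3 = year.
DATE = re.compile(r"^([A-Z][a-z]{2,})\s+([0-3]?[1-9]),\s+([12]?[0-9]{0,3})$")

sample = "January 15, 2020"

# Partial matches: reorder the captured groups (regexr: $3-$1-$2).
reordered = DATE.sub(r"\3-\1-\2", sample)

# Whole match: wrap everything that matched in brackets (regexr: [$&]).
wrapped = DATE.sub(r"[\g<0>]", sample)

print(reordered)  # 2020-January-15
print(wrapped)    # [January 15, 2020]
```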
Tips on Building Regular Expressions
- regular expressions are greedy
  - add a ? after * or + to make it lazy
- don't build an expression all at once
- build a piece, then test it --> repeat
- use multiple, simpler expressions
- test with valid and invalid data
- add comments using the x modifier (extended mode -- whitespace in the pattern is ignored and # starts a comment)
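Two of these tips are easy to see in code: greedy versus lazy matching, and commenting a pattern. The sketch below uses Python, where the x modifier is spelled `re.VERBOSE`. The sample string is made up, and the commented pattern is a grouped version of the "42 or more" expression, which is an assumption of this example.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as it can, so the match runs to the last ">".
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can, giving one match per tag.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# x modifier (re.VERBOSE): whitespace is ignored and # starts a comment.
AT_LEAST_42 = re.compile(r"""
    ^(
        4[2-9]         # case 1: 42-49
      | [5-9]\d        # case 2: 50-99
      | [1-9]\d{2,}    # case 3: 3 or more digits
    )$
""", re.VERBOSE)

for candidate in ["41", "42", "150"]:
    print(candidate, bool(AT_LEAST_42.match(candidate)))
```

Splitting the pattern over several commented lines also fits the "build a piece, then test it" tip, since each alternative can be added and checked on its own.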