RegExp regexp, memos
Syntax
Regexp | Meaning |
---|---|
[abc] | A single character of: a, b or c |
[^abc] | Any single character except: a, b, or c |
[a-z] | Any single character in the range a-z |
[a-zA-Z] | Any single character in the range a-z or A-Z |
^ | Start of line |
$ | End of line |
\A | Start of string |
\z | End of string |
. | Any single character |
\s | Any whitespace character |
\S | Any non-whitespace character |
\d | Any digit |
\D | Any non-digit |
\w | Any word character (letter, number, underscore) |
\W | Any non-word character |
\b | Any word boundary |
(...) | Capture everything enclosed |
(a\|b) | a or b |
a? | Zero or one of a |
a* | Zero or more of a |
a+ | One or more of a |
a{3} | Exactly 3 of a |
a{3,} | 3 or more of a |
a{3,6} | Between 3 and 6 of a |
aeiou* | Match “aeio” followed by any number of “u” |
[aeiou]* | Match any number of vowels |
(aeiou)* | Match any number of sequences of “aeiou” |
(dog\|cat) | Match either “dog” or “cat” |
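A few rows of the table are easy to mix up, so here is a small sketch to paste into irb. The sample strings are made up for illustration.

```ruby
text = "first line\nsecond line"

# ^ and $ anchor at each line; \A and \z anchor at the whole string.
line_start   = text =~ /^second/    # => 11, start of the second line
string_start = text =~ /\Asecond/   # => nil, the string starts with "first"
string_end   = text =~ /line\z/     # => 18, only the final "line"

# Three look-alike patterns from the table (String#[] returns the matched text).
literal_run = "aeiouuu!"[/aeiou*/]         # => "aeiouuu"
char_class  = "ooaei-x"[/[aeiou]*/]        # => "ooaei"
group_run   = "aeiouaeiou-x"[/(aeiou)*/]   # => "aeiouaeiou"
```

Note that `*` also accepts zero repetitions, so `[aeiou]*` matches the empty string at position 0 when the text starts with a consonant.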
Usage
$1, $2, … are the results of the captures made by ( ).
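reg = %r{my regexp}      # %r lets you choose the delimiters
reg = /my regexp/        # same pattern as a plain literal
"string to test" =~ reg  # => index of the match, or nil if there is none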
Quick ways and API
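s = " 12 5 "
r = / ([0-9]+)/
p s[r, 1]       # => "12" (String#[] with a regexp and a capture index)
x = s.match(r)  # => MatchData, or nil if nothing matches
x.pre_match     # => "" (text before the match)
x.post_match    # => " 5 " (text after the match)
x.captures      # => ["12"]; x.to_a also includes the whole match: [" 12", "12"]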
Back references
pattern = /aa(\d+)-\1/
pattern =~ 'aa1234-1234' # => 0
pattern =~ 'aa1234-1233' # => nil
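Two common uses of \1, sketched under the assumption that plain ASCII text is enough: spotting a doubled word and matching a quoted string whose closing quote must equal its opening quote. The helper name doubled_word is hypothetical.

```ruby
# Returns the first word that is immediately repeated, or nil.
def doubled_word(text)
  text[/\b(\w+) \1\b/, 1]   # \1 must repeat exactly what group 1 captured
end

first_hit = doubled_word("this is is a typo")   # => "is"
no_hit    = doubled_word("this is fine")        # => nil

# The closing quote has to be the same character as the opening one.
quoted = /(["']).*?\1/
phrase = %q{say "it's" now}[quoted]             # => "\"it's\""
```

The \b anchors keep the "is" inside "this" from counting as the first half of a doubled word.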
Greedy
s = "Here another string"
greedy = /[a-z]* [a-z]*/
non_greedy = /[a-z]*? [a-z]*?/
p greedy.match(s)[0] # => "ere another"
p non_greedy.match(s)[0] # => "ere "
r = /<.*>/ # Greedy repetition: matches "<ruby>perl>"
r = /<.*?>/ # Nongreedy: matches "<ruby>" in "<ruby>perl>"
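The tag patterns above can be checked directly. This sketch also shows a negated character class as an alternative to the lazy quantifier, and the lazy form of a counted quantifier; the variable names are only illustrative.

```ruby
html = "<ruby>perl>"

greedy_tag  = html[/<.*>/]      # => "<ruby>perl>"
lazy_tag    = html[/<.*?>/]     # => "<ruby>"
negated_tag = html[/<[^>]*>/]   # => "<ruby>", same result without backtracking over ">"

# Counted quantifiers have lazy forms too.
greedy_run = "aaaaaaaa"[/a{3,6}/]    # => "aaaaaa"
lazy_run   = "aaaaaaaa"[/a{3,6}?/]   # => "aaa"
```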
Options
/i case insensitive
/m multiline mode - '.' will match newline
/x extended mode - whitespace and # comments in the pattern are ignored
/o only interpolate #{} blocks once
/[neus] encoding: none, EUC, UTF-8, SJIS, respectively
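A minimal sketch of the first four options in action. The strings and variable names are made up, and Regexp#match? assumes Ruby 2.4 or later.

```ruby
ignore_case = /ruby/i.match?("RUBY")   # => true

without_m = "a\nb" =~ /a.b/    # => nil, '.' stops at the newline
with_m    = "a\nb" =~ /a.b/m   # => 0

# With /x the pattern can be laid out and commented.
version = /
  (\d+) \.   # major number, then a literal dot
  (\d+)      # minor number
/x
found = "ruby 1.9"[version]    # => "1.9"

# /o keeps the first interpolation; later values of w are ignored.
once  = %w[cat dog].map { |w| "cat dog" =~ /#{w}/o }   # => [0, 0]
fresh = %w[cat dog].map { |w| "cat dog" =~ /#{w}/ }    # => [0, 4]
```

The /o result is the surprising one: the second iteration still searches for "cat", so use it only when the interpolated value never changes.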
Links and references
- http://rubular.com/
- http://www.ruby-doc.org/docs/ProgrammingRuby/html/language.html#UJ
- http://strugglingwithruby.blogspot.com/2009/05/regular-expressions-in-ruby.html