Ruby RegExp memos

Syntax

    Regexp      Meaning
    [abc]       A single character of: a, b, or c
    [^abc]      Any single character except: a, b, or c
    [a-z]       Any single character in the range a-z
    [a-zA-Z]    Any single character in the range a-z or A-Z
    ^           Start of line
    $           End of line
    \A          Start of string
    \z          End of string
    .           Any single character except newline (unless /m is set)
    \s          Any whitespace character
    \S          Any non-whitespace character
    \d          Any digit
    \D          Any non-digit
    \w          Any word character (letter, digit, underscore)
    \W          Any non-word character
    \b          A word boundary
    (...)       Capture everything enclosed
    (a|b)       a or b
    a?          Zero or one of a
    a*          Zero or more of a
    a+          One or more of a
    a{3}        Exactly 3 of a
    a{3,}       3 or more of a
    a{3,6}      Between 3 and 6 of a
    aeiou*      Match “aeio” followed by any number of “u”
    [aeiou]*    Match any number of vowels
    (aeiou)*    Match any number of sequences of “aeiou”
    (dog|cat)   Match either “dog” or “cat”
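
A few rows of the table can be checked directly in Ruby. This is only a sketch: the sample strings, the `SYNTAX_SAMPLES` constant and the `first_match` helper are made up for illustration.

```ruby
# Each entry: a pattern from the table, a sample string, and the text
# the pattern is expected to match first (leftmost match).
SYNTAX_SAMPLES = [
  [/[abc]/,     "xbz",        "b"],
  [/[^abc]/,    "abcd",       "d"],
  [/\d{3}/,     "ab12345",    "123"],
  [/\bcat\b/,   "a cat nap",  "cat"],
  [/aeiou*/,    "aeiouuu!",   "aeiouuu"],
  [/[aeiou]*/,  "aeiaeix",    "aeiaei"],
  [/(aeiou)*/,  "aeiouaeiou", "aeiouaeiou"],
  [/(dog|cat)/, "hotdog",     "dog"]
].freeze

# Hypothetical helper: String#[] with a regexp returns the matched text or nil.
def first_match(pattern, text)
  text[pattern]
end

SYNTAX_SAMPLES.each do |pattern, text, _expected|
  puts "#{pattern.inspect} on #{text.inspect} -> #{first_match(pattern, text).inspect}"
end

# ^ anchors at any line start; \A only at the very start of the string.
p "first\nsecond" =~ /^second/    # => 6
p "first\nsecond" =~ /\Asecond/   # => nil
```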
   

Usage

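After a successful match, $1, $2, … hold the text captured by the first, second, … ( ) group.

    reg = %r{my regexp}   # %r literal; any non-alphanumeric delimiter works, e.g. %r=my regexp=
    reg = /my regexp/     # the same pattern as a slash literal

    "string to test" =~ reg   # => index of the first match, or nil if there is none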
    
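As a small illustration of `=~` together with `$1` and `$2`, here is a sketch that splits a `key=value` string. The helper names `parse_pair` and `bin_index` and the sample inputs are assumptions made for this example.

```ruby
# Hypothetical helper: return [key, value] for "key=value", or nil.
def parse_pair(text)
  if text =~ /(\w+)=(\w+)/
    [$1, $2]   # $1 and $2 are set by the successful =~ above
  end
end

# Hypothetical helper: %r{} avoids escaping the slash inside the pattern.
def bin_index(path)
  path =~ %r{/bin}
end

p parse_pair("name=ruby")   # => ["name", "ruby"]
p parse_pair("no pair")     # => nil
p bin_index("/usr/bin")     # => 4
```

Because `=~` returns an index or `nil`, it can be used directly as a condition.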

Quick ways and API

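    s = " 12 5 "
    r = / ([0-9]+)/
    p s[/ ([0-9]+)/, 1]   # => "12" (String#[] with a regexp and a group index)
    x = s.match(r)        # MatchData, or nil if there is no match
    x.pre_match           # => "" (text before the match)
    x.post_match          # => " 5 " (text after the match)
    x.captures            # => ["12"] (groups only)
    x.to_a                # => [" 12", "12"] (whole match first, then groups)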
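
Since `match` returns `nil` when nothing matches, calling `pre_match` or `captures` on the result can raise. A minimal sketch of a guarded version follows; the helper name `describe_first_number` is hypothetical.

```ruby
# Hypothetical helper: returns [text before, captured number, text after],
# or nil when the string contains no space followed by digits.
def describe_first_number(text)
  m = text.match(/ ([0-9]+)/)
  return nil unless m   # match returns nil when nothing matches

  [m.pre_match, m[1], m.post_match]
end

p describe_first_number(" 12 5 ")   # => ["", "12", " 5 "]
p describe_first_number("none")     # => nil
```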
    

Back references

    pattern = /aa(\d+)-\1/
    pattern =~ 'aa1234-1234' # => 0
    pattern =~ 'aa1234-1233' # => nil
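
One common use of a back reference is spotting an accidentally doubled word. The sketch below assumes words are separated by a single space; `DOUBLED_WORD` and `doubled_word` are names invented for the example.

```ruby
# \1 must repeat exactly what group 1 captured; \b keeps "this is"
# from matching on the "is" inside "this".
DOUBLED_WORD = /\b(\w+) \1\b/

# Hypothetical helper: returns the repeated word, or nil if there is none.
def doubled_word(text)
  text[DOUBLED_WORD, 1]
end

p doubled_word("this is is a test")   # => "is"
p doubled_word("this is a test")      # => nil
```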
    

Greedy

    s = "Here another string"
    greedy = /[a-z]* [a-z]*/
    non_greedy = /[a-z]*? [a-z]*?/
    p greedy.match(s)[0]     # => "ere another"
    p non_greedy.match(s)[0] # => "ere "

    r = /<.*>/    # Greedy: matches all of "<ruby>perl>"
    r = /<.*?>/   # Non-greedy: matches only "<ruby>" in "<ruby>perl>"
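
The same difference shows up when pulling the first tag out of a string. This is a sketch only; `first_tag` and the sample markup are made up for illustration.

```ruby
GREEDY_TAG     = /<.*>/     # .* grabs as much as possible
NON_GREEDY_TAG = /<.*?>/    # .*? stops at the first ">"
NEGATED_TAG    = /<[^>]*>/  # alternative: forbid ">" inside the tag

# Hypothetical helper: returns the first text matched by the given pattern.
def first_tag(html, pattern)
  html[pattern]
end

p first_tag("<b>bold</b>", GREEDY_TAG)       # => "<b>bold</b>"
p first_tag("<b>bold</b>", NON_GREEDY_TAG)   # => "<b>"
p first_tag("<b>bold</b>", NEGATED_TAG)      # => "<b>"
```

The negated character class gives the same result as the non-greedy quantifier here and is often easier to read.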
    

Options

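    /i         case insensitive
    /m         multiline mode - '.' will also match newline (^ and $ match at line boundaries with or without /m)
    /x         extended mode - whitespace and # comments inside the pattern are ignored
    /o         only interpolate #{} blocks once, the first time the literal is evaluated
    /[neus]    encoding: none, EUC, UTF-8, SJIS, respectively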
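
The effect of /i, /m and /x can be seen side by side in a short sketch. The names `YEAR_MONTH` and `option_results` and the sample strings are assumptions for this example; /o and the encoding flags are left out.

```ruby
# /x lets the pattern be spread over several lines with comments.
YEAR_MONTH = /
  (\d{4})   # year
  -
  (\d{2})   # month
/x

# Hypothetical helper collecting one result per option.
def option_results
  [
    "RUBY" =~ /ruby/,             # nil: case matters by default
    "RUBY" =~ /ruby/i,            # 0:   /i ignores case
    "a\nb" =~ /a.b/,              # nil: '.' does not match newline by default
    "a\nb" =~ /a.b/m,             # 0:   /m lets '.' match newline
    "on 2024-01"[YEAR_MONTH, 2]   # "01": second group of the /x pattern
  ]
end

p option_results   # => [nil, 0, nil, 0, "01"]
```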
    
