Ruby regular expression
A regular expression is a special sequence of characters that matches or finds a collection of strings by using patterns with special syntax.
Regular expressions combine ordinary characters with a set of predefined special characters to form a “rule string” that expresses filtering logic for strings.
Syntax
A regular expression literal is a pattern written between two slashes, or between arbitrary delimiters following %r, as follows:
/pattern/
/pattern/im      # Options can be specified
%r!/usr/local!   # Regular expression using other delimiters
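As a minimal sketch of the literal forms above, the following builds the same case-insensitive pattern three ways. `Regexp.new` is Ruby's constructor form, which the text above does not mention, and the constant names are illustrative.

```ruby
# Three spellings of the same case-insensitive pattern (constant names are illustrative).
SLASH_FORM   = /ruby/i
PERCENT_FORM = %r{ruby}i
BUILT_FORM   = Regexp.new("ruby", Regexp::IGNORECASE)

[SLASH_FORM, PERCENT_FORM, BUILT_FORM].each do |re|
  puts "#{re.inspect} matches: #{re.match?("I like Ruby")}"
end
```

All three objects behave the same when matching; the choice between them is mostly about readability.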
Example
#!/usr/bin/ruby

line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"

# =~ returns the index of the match, or nil if there is none
if line1 =~ /Cats(.*)/
  puts "Line1 contains Cats"
end
if line2 =~ /Cats(.*)/
  puts "Line2 contains Cats"
end
The output of the above example is as follows:
Line1 contains Cats
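The `(.*)` group in the example above also captures text. A small sketch of how the match position and the captured group might be read back, assuming a hypothetical helper named `text_after_cats`:

```ruby
# Returns the text captured by (.*) after "Cats", or nil when there is no match.
# The method name is hypothetical, chosen only for this illustration.
def text_after_cats(line)
  if line =~ /Cats(.*)/
    $1   # contents of the first group from the last successful match
  end
end

puts("Dogs and Cats" =~ /Cats(.*)/)   # index where the match starts
p text_after_cats("Cats are smarter than dogs")
p text_after_cats("Dogs also like meat")
```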
Regular expression modifiers
A regular expression literal may include optional modifiers that control various aspects of matching. Modifiers are specified after the second slash, as shown in the example above. The following table lists the possible modifiers:
Modifier | Description |
---|---|
`i` | Ignore case when matching text. |
`o` | Perform #{} interpolation only once, the first time the regular expression literal is evaluated. |
`x` | Ignore whitespace in the pattern and allow whitespace and comments to be placed throughout the expression. |
`m` | Match across multiple lines, treating newline characters as normal characters. |
`u`, `e`, `s`, `n` | Interpret the regular expression as Unicode (UTF-8), EUC, SJIS, or ASCII. If none of these modifiers is specified, the regular expression is assumed to use the source encoding. |
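A brief sketch of the `i`, `x`, and `m` modifiers in action. The constant name `PHONE` and the sample strings are illustrative.

```ruby
# i: ignore case
puts(/ruby/i.match?("RUBY"))

# x: whitespace and comments inside the pattern are ignored
PHONE = /
  \d{3}   # first group of digits
  -       # separator
  \d{4}   # second group of digits
/x
puts PHONE.match?("138-3453")

# m: the dot also matches a newline
puts(/a.b/.match?("a\nb"))    # false
puts(/a.b/m.match?("a\nb"))   # true
```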
Just as string literals can be delimited with %Q, Ruby lets you begin a regular expression with %r followed by a delimiter of your choice. This is useful when the pattern contains many slash characters that you do not want to escape.
# The following matches a single slash character without escaping
%r|/|

# Flag characters can also be used with this syntax
%r[</(.*)>]i
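To illustrate why %r helps with slash-heavy patterns, here is a short sketch that matches a path both ways. The constant names and the sample path are assumptions made for the example.

```ruby
# Both patterns match a directory under /usr/local; only the first needs escaping.
ESCAPED   = /\/usr\/local\/(\w+)/
UNESCAPED = %r{/usr/local/(\w+)}

path = "/usr/local/bin/ruby"
puts path[ESCAPED, 1]     # bin
puts path[UNESCAPED, 1]   # bin

# Flags follow the closing delimiter, just as they follow the second slash.
puts "</DIV>"[%r[</(.*)>]i, 1]   # DIV
```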
Regular expression patterns
Except for the control characters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can escape a control character by placing a backslash before it.
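A minimal sketch of escaping, using the dot as the control character. `Regexp.escape` is a standard Ruby method that is not mentioned in the text above; it is shown here as one way to escape text that comes from a variable.

```ruby
# An unescaped dot matches any character; an escaped dot matches only ".".
p "3x14".match?(/3.14/)    # true: the dot matched "x"
p "3x14".match?(/3\.14/)   # false
p "3.14".match?(/3\.14/)   # true

# Regexp.escape adds the backslashes when the text comes from a variable.
literal = "1+1=2?"
p Regexp.escape(literal)                                 # "1\\+1=2\\?"
p "is 1+1=2? yes".match?(/#{Regexp.escape(literal)}/)    # true
```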
The following table lists the regular expression syntax available in Ruby.
Pattern | Description |
---|---|
`^` | Matches the beginning of a line. |
`$` | Matches the end of a line. |
`.` | Matches any single character except a newline. With the m option it also matches a newline. |
`[...]` | Matches any single character listed in the square brackets. |
`[^...]` | Matches any single character not listed in the square brackets. |
`re*` | Matches the preceding expression zero or more times. |
`re+` | Matches the preceding expression one or more times. |
`re?` | Matches the preceding expression zero times or once. |
`re{n}` | Matches the preceding expression exactly n times. |
`re{n,}` | Matches the preceding expression n or more times. |
`re{n,m}` | Matches the preceding expression at least n and at most m times. |
`a\|b` | Matches either a or b. |
`(re)` | Groups regular expressions and remembers the matched text. |
`(?imx)` | Temporarily turns on the i, m, or x option within the regular expression. If used inside parentheses, only the part within the parentheses is affected. |
`(?-imx)` | Temporarily turns off the i, m, or x option within the regular expression. If used inside parentheses, only the part within the parentheses is affected. |
`(?:re)` | Groups regular expressions without remembering the matched text. |
`(?imx:re)` | Temporarily turns on the i, m, or x option within the parentheses. |
`(?-imx:re)` | Temporarily turns off the i, m, or x option within the parentheses. |
`(?#...)` | Comment. |
`(?=re)` | Positive lookahead: requires the pattern to match at this position without consuming any characters. |
`(?!re)` | Negative lookahead: requires the pattern not to match at this position, without consuming any characters. |
`(?>re)` | Matches an independent pattern without backtracking. |
`\w` | Matches a word character. |
`\W` | Matches a non-word character. |
`\s` | Matches a whitespace character. Equivalent to `[ \t\r\n\f\v]`. |
`\S` | Matches a non-whitespace character. |
`\d` | Matches a digit. Equivalent to `[0-9]`. |
`\D` | Matches a non-digit. |
`\A` | Matches the beginning of the string. |
`\Z` | Matches the end of the string. If the string ends with a newline, it matches just before that newline. |
`\z` | Matches the end of the string. |
`\G` | Matches the point where the last match finished. |
`\b` | Matches a word boundary when outside brackets, and a backspace (0x08) when inside brackets. |
`\B` | Matches a non-word boundary. |
`\n`, `\t`, etc. | Match newlines, carriage returns, tabs, and so on. |
`\1`...`\9` | Matches the nth grouped subexpression. |
`\10` | Matches the nth grouped subexpression if it has already matched. Otherwise refers to the octal representation of a character code. |
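The difference between the line anchors and the string anchors in the table is easiest to see on a multi-line string. A short sketch, with a sample string chosen for illustration:

```ruby
text = "first line\nsecond line\n"

p text.scan(/^\w+/)      # ["first", "second"]: ^ matches at the start of every line
p text.scan(/\A\w+/)     # ["first"]: \A matches only at the start of the string
p text.match?(/line\Z/)  # true: \Z may match just before a trailing newline
p text.match?(/line\z/)  # false: \z matches only at the very end
```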
Regular expression examples
Literal characters
Example | Description |
---|---|
`/ruby/` | Matches “ruby”. |
`¥` | Matches the Yen symbol. Ruby 1.9 and Ruby 1.8 support multibyte characters. |
Character classes
Example | Description |
---|---|
`/[Rr]uby/` | Matches “Ruby” or “ruby”. |
`/rub[ye]/` | Matches “ruby” or “rube”. |
`/[aeiou]/` | Matches any lowercase vowel. |
`/[0-9]/` | Matches any digit; same as `/[0123456789]/`. |
`/[a-z]/` | Matches any lowercase ASCII letter. |
`/[A-Z]/` | Matches any uppercase ASCII letter. |
`/[a-zA-Z0-9]/` | Matches any character in the brackets, that is, any ASCII letter or digit. |
`/[^aeiou]/` | Matches any character that is not a lowercase vowel. |
`/[^0-9]/` | Matches any non-digit character. |
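The character classes above can be tried out with `String#scan`, which returns every match as an array. The sample strings below are illustrative.

```ruby
p "Ruby and ruby".scan(/[Rr]uby/)     # ["Ruby", "ruby"]
p "ruby rube rubx".scan(/rub[ye]/)    # ["ruby", "rube"]
p "regex".scan(/[aeiou]/)             # ["e", "e"]
p "regex".scan(/[^aeiou]/)            # ["r", "g", "x"]
p "room 101".scan(/[0-9]/)            # ["1", "0", "1"]
```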
Special character classes
Example | Description |
---|---|
`/./` | Matches any character except a newline. |
`/./m` | In multiline mode, also matches a newline. |
`/\d/` | Matches a digit; equivalent to `/[0-9]/`. |
`/\D/` | Matches a non-digit; equivalent to `/[^0-9]/`. |
`/\s/` | Matches a whitespace character; equivalent to `/[ \t\r\n\f\v]/`. |
`/\S/` | Matches a non-whitespace character; equivalent to `/[^ \t\r\n\f\v]/`. |
`/\w/` | Matches a word character; equivalent to `/[A-Za-z0-9_]/`. |
`/\W/` | Matches a non-word character; equivalent to `/[^A-Za-z0-9_]/`. |
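A quick sketch of the shorthand classes applied to one sample string (the string itself is made up for the example):

```ruby
sample = "Ruby 3, est. 1995\n"

p sample.scan(/\d/)    # ["3", "1", "9", "9", "5"]
p sample.scan(/\w+/)   # ["Ruby", "3", "est", "1995"]
p sample.scan(/\s/)    # [" ", " ", " ", "\n"]
p sample.scan(/\W/)    # [" ", ",", " ", ".", " ", "\n"]
```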
Repetition
Example | Description |
---|---|
`/ruby?/` | Matches “rub” or “ruby”: the y is optional. |
`/ruby*/` | Matches “rub” followed by zero or more y. |
`/ruby+/` | Matches “rub” followed by one or more y. |
`/\d{3}/` | Matches exactly 3 digits. |
`/\d{3,}/` | Matches 3 or more digits. |
`/\d{3,5}/` | Matches 3, 4, or 5 digits. |
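The quantifiers above can be compared side by side. In this sketch the anchors `\A`, `\z`, and `\b` are added so that each pattern must cover a whole word; the word list and number string are illustrative.

```ruby
words = %w[rub ruby rubyyy]
p words.map { |w| w[/\Aruby?\z/] }   # ["rub", "ruby", nil]
p words.map { |w| w[/\Aruby*\z/] }   # ["rub", "ruby", "rubyyy"]
p words.map { |w| w[/\Aruby+\z/] }   # [nil, "ruby", "rubyyy"]

numbers = "12 345 6789 123456"
p numbers.scan(/\b\d{3}\b/)     # ["345"]
p numbers.scan(/\b\d{3,}\b/)    # ["345", "6789", "123456"]
p numbers.scan(/\b\d{3,5}\b/)   # ["345", "6789"]
```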
Non-greedy repetition
This matches the minimum number of repetitions.
Example | Description |
---|---|
`/<.*>/` | Greedy repetition: matches `<ruby>perl>`. |
`/<.*?>/` | Non-greedy repetition: matches `<ruby>` in `<ruby>perl>`. |
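The two rows above can be checked directly with `String#[]`, which returns the first match of a pattern:

```ruby
html = "<ruby>perl>"

p html[/<.*>/]    # "<ruby>perl>": .* takes as much as it can
p html[/<.*?>/]   # "<ruby>": .*? stops at the first ">"
```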
Grouping with parentheses
Example | Description |
---|---|
`/\D\d+/` | No grouping: + repeats only `\d`. |
`/(\D\d)+/` | Grouping: + repeats the `\D\d` pair. |
`/([Rr]uby(, )?)+/` | Matches “Ruby”, “Ruby, ruby, ruby”, etc. |
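A small sketch of how grouping changes what `+` repeats, using a made-up sample string:

```ruby
p "a1b22"[/\D\d+/]     # "a1": + applies only to \d
p "a1b22"[/(\D\d)+/]   # "a1b2": + repeats the whole \D\d pair
p "Ruby, ruby, ruby"[/([Rr]uby(, )?)+/]   # "Ruby, ruby, ruby"
```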
Backreferences
A backreference matches the text of a previously matched group again.
Example | Description |
---|---|
`/([Rr])uby&\1ails/` | Matches “ruby&rails” or “Ruby&Rails”. |
`/(['"])(?:(?!\1).)*\1/` | Matches a single-quoted or double-quoted string. `\1` matches whatever the first group matched, `\2` matches whatever the second group matched, and so on. |
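As a sketch of a backreference at work, the pattern below finds a quoted string whose closing quote must be the same character as its opening quote. The pattern and the constant name `QUOTED` are assumptions made for this illustration.

```ruby
# \1 must repeat whichever quote character group 1 captured.
QUOTED = /(['"])(.*?)\1/

p %q{say "it's fine" now}[QUOTED, 2]   # "it's fine"
p %q{say 'hello' now}[QUOTED, 2]       # "hello"
p %q{mismatched "quote'}[QUOTED]       # nil: the quotes do not pair up
```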
Alternatives
Example | Description |
---|---|
`/ruby\|rube/` | Matches “ruby” or “rube”. |
`/rub(y\|le)/` | Matches “ruby” or “ruble”. |
`/ruby(!+\|\?)/` | Matches “ruby” followed by one or more ! or by a single ?. |
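The alternatives above can be exercised with `scan`. This sketch uses non-capturing groups `(?:...)` instead of plain parentheses, because `scan` returns the captured groups rather than the whole match when a pattern contains capturing groups.

```ruby
p "ruby rube ruble".scan(/ruby|rube/)     # ["ruby", "rube"]
p "ruby rube ruble".scan(/rub(?:y|le)/)   # ["ruby", "ruble"]
p "ruby! ruby!!! ruby? ruby".scan(/ruby(?:!+|\?)/)   # ["ruby!", "ruby!!!", "ruby?"]
```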
Anchors
Anchors specify the position at which a match must occur.
Example | Description |
---|---|
`/^Ruby/` | Matches “Ruby” at the start of a string or line. |
`/Ruby$/` | Matches “Ruby” at the end of a string or line. |
`/\ARuby/` | Matches “Ruby” at the start of a string. |
`/Ruby\Z/` | Matches “Ruby” at the end of a string. |
`/\bRuby\b/` | Matches “Ruby” at word boundaries. |
`/\brub\B/` | `\B` is a non-word boundary: matches “rub” in “rube” and “ruby”, but does not match “rub” on its own. |
`/Ruby(?=!)/` | Matches “Ruby” if it is followed by an exclamation point. |
`/Ruby(?!!)/` | Matches “Ruby” if it is not followed by an exclamation point. |
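A brief sketch of the word-boundary and lookahead anchors. `gsub` is used for the lookaheads so that it is visible which occurrence matched; the replacement word is arbitrary.

```ruby
p "Ruby RubyGems".scan(/\bRuby\b/)   # ["Ruby"]: the "Ruby" inside "RubyGems" is skipped
p "rub rube ruby".scan(/\brub\B/)    # ["rub", "rub"]: taken from "rube" and "ruby"

# Lookaheads test what follows without including it in the match.
p "Ruby! Ruby?".gsub(/Ruby(?=!)/, "Perl")   # "Perl! Ruby?"
p "Ruby! Ruby?".gsub(/Ruby(?!!)/, "Perl")   # "Ruby! Perl?"
```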
Special syntax with parentheses
Example | Description |
---|---|
`/R(?#comment)/` | Matches “R”. Everything else is a comment. |
`/R(?i)uby/` | Case-insensitive while matching “uby”. |
`/R(?i:uby)/` | Same as above. |
`/rub(?:y\|le)/` | Grouping only, without creating a `\1` backreference. |
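The special forms above might be checked as follows. This is a sketch with made-up sample strings; `MatchData#captures` is used to show which groups were remembered.

```ruby
p "R"[/R(?#this text is ignored)/]   # "R"

p "RUBY".match?(/R(?i)uby/)    # true: case is ignored from (?i) onward
p "rUBY".match?(/R(?i)uby/)    # false: the leading "R" is still case-sensitive
p "RUBY".match?(/R(?i:uby)/)   # true

p "ruble".match(/rub(?:y|le)/).captures   # []: (?:...) remembers nothing
p "ruble".match(/rub(y|le)/).captures     # ["le"]
```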
Search and replace
`sub` and `gsub`, together with their in-place variants `sub!` and `gsub!`, are among the most important String methods that work with regular expressions.
All of these methods perform a search-and-replace operation using a regular expression pattern. `sub` and `sub!` replace the first occurrence of the pattern, while `gsub` and `gsub!` replace all occurrences.
`sub` and `gsub` return a new string and leave the original unmodified, whereas `sub!` and `gsub!` modify the string on which they are called.
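The difference between the four methods can be sketched as follows. The sample sentence is illustrative. One detail worth knowing is that the bang variants return nil when nothing was replaced.

```ruby
original = "ruby is fun, ruby is fast"

p original.sub(/ruby/, "Ruby")    # "Ruby is fun, ruby is fast": first occurrence only
p original.gsub(/ruby/, "Ruby")   # "Ruby is fun, Ruby is fast": every occurrence
p original                        # unchanged by sub and gsub

copy = original.dup
copy.gsub!(/ruby/, "Ruby")        # modifies copy in place
p copy                            # "Ruby is fun, Ruby is fast"
p copy.gsub!(/ruby/, "Ruby")      # nil: nothing left to replace
```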
Example
#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

phone = "138-3453-1111 #This is a phone number"

# Remove the Ruby-style comment
# (sub is used instead of sub!, which returns nil when nothing is replaced)
phone = phone.sub(/#.*$/, "")
puts "Phone number : #{phone}"

# Remove all characters other than digits
phone = phone.gsub(/\D/, "")
puts "Phone number : #{phone}"
The output of the above example is as follows:
Phone number : 138-3453-1111
Phone number : 13834531111
Example
#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

text = "rails is rails, Ruby on Rails a very good Ruby framework"

# Change every "rails" to "Rails"
text.gsub!("rails", "Rails")

# Capitalize the first letter of every whole word "rails"
text.gsub!(/\brails\b/, "Rails")

puts "#{text}"
The output of the above example is as follows:
Rails is Rails, Ruby on Rails a very good Ruby framework
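The two `gsub!` calls in the example differ in one respect that the sample text does not reveal: a plain string pattern replaces the substring wherever it occurs, while `\b` restricts the replacement to whole words. A short sketch with a made-up input:

```ruby
text = "rails derails"

p text.gsub("rails", "Rails")        # "Rails deRails": substring replaced everywhere
p text.gsub(/\brails\b/, "Rails")    # "Rails derails": whole words only
```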