Perl regular expression


Release date:2023-10-20 Update date:2023-10-21 Editor:admin View counts:351


A regular expression describes a pattern for matching strings. It can be used to check whether a string contains a given substring, to replace a matching substring, or to extract the substrings that meet some condition.

Perl's regular expression support is very powerful, among the strongest of the commonly used languages, and many other languages have borrowed from it when designing their own regular expressions.

Perl regular expressions come in three forms: matching, substitution, and transliteration.

  • Match: m// (can be abbreviated to //, omitting the m)

  • Substitute: s///

  • Transliterate: tr///

These three forms are generally used together with =~ or !~. The =~ operator tests that a string matches; !~ tests that it does not.
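As a small illustration of the two binding operators, the sketch below tests one string with both =~ and !~. The variable names and the sample text are assumptions chosen for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text; the variable name and contents are illustrative only.
my $text = "welcome to runoob site.";

# =~ is true when the pattern is found in the string.
my $has_run = ($text =~ /run/) ? 1 : 0;

# !~ is true when the pattern is NOT found in the string.
my $lacks_perl = ($text !~ /perl/) ? 1 : 0;

print "contains 'run': $has_run\n";
print "does not contain 'perl': $lacks_perl\n";
```

Both flags come out as 1 here, because "run" occurs inside "runoob" and "perl" does not occur at all.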

Matching operator

The matching operator m// tests a string against a regular expression. For example, to check whether the scalar $bar contains “run”, the code is as follows:

Example

#!/usr/bin/perl

$bar = "I am runoob site. welcome to runoob site.";
if ($bar =~ /run/) {
    print "First match\n";
} else {
    print "First mismatch\n";
}

$bar = "run";
if ($bar =~ /run/) {
    print "Second match\n";
} else {
    print "Second mismatch\n";
}

Execute the above program, and the output is as follows:

First match
Second match

Pattern matching modifier

Pattern matching has some common modifiers, as shown in the following table:

Modifier   Description
i          Ignore case in the pattern
m          Multiline mode
o          Compile the pattern only once
s          Single-line mode: "." also matches "\n" (by default it does not)
x          Ignore whitespace in the pattern
g          Global matching
cg         After a global match fails, keep the current search position so that matching can resume from it
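The modifiers in the table can be tried out with a short sketch like the one below. The sample text and variable names are assumptions made for illustration; only the i, g, m, s, and x modifiers are exercised.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative two-line sample text.
my $text = "Run, run, RUN!\nsecond line";

# /i: case-insensitive match, so the pattern RUN, matches "Run,".
my $found_ci = ($text =~ /RUN,/i) ? 1 : 0;

# /g in list context returns every match; combined with /i it finds all three.
my @all = ($text =~ /run/gi);

# /m: ^ matches at the start of each line, not only at the start of the string.
my $line_start = ($text =~ /^second/m) ? 1 : 0;

# /s: "." also matches the newline, so the pattern can span two lines.
my $spans = ($text =~ /RUN!.second/s) ? 1 : 0;

# /x: unescaped whitespace inside the pattern is ignored.
my $verbose = ($text =~ / second \s+ line /x) ? 1 : 0;

print scalar(@all), " matches with /gi\n";
print "i=$found_ci m=$line_start s=$spans x=$verbose\n";
```

Without /s the pattern /RUN!.second/ would not match, because "." stops at the newline by default.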

Regular expression variable

After a successful match, Perl sets three special variables:

  • $`: the part of the string before the match

  • $&: the matched string

  • $': the part of the string after the match

Concatenating these three variables gives back the original string.

Examples are as follows:

Example

#!/usr/bin/perl

$string = "welcome to runoob site.";
$string =~ m/run/;
print "String before match: $`\n";
print "Matched string: $&\n";
print "String after match: $'\n";

The output result of executing the above program is:

String before match: welcome to 
Matched string: run
String after match: oob site.
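To check that the three special variables really add up to the original string, a minimal sketch along these lines can be used. The helper variables $before, $matched, $after, and $rebuilt are names introduced here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = "welcome to runoob site.";

my ($before, $matched, $after) = ('', '', '');
if ($string =~ /run/) {
    # Copy the special variables right away: the next successful
    # match would overwrite them.
    ($before, $matched, $after) = ($`, $&, $');
}

# Joining the three parts should reproduce the original string.
my $rebuilt = $before . $matched . $after;
print "rebuilt string equals the original\n" if $rebuilt eq $string;
```

Copying the values into ordinary variables inside the if block is a design choice: it keeps them safe from later matches and makes it explicit that they are only meaningful after a successful match.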

Replace operator

The substitution operator s/// extends the matching operator: it replaces the text matched by a pattern with a new string. The basic format is as follows:

s/PATTERN/REPLACEMENT/;

PATTERN is the pattern to match, and REPLACEMENT is the replacement string.

For example, to replace “google” in the following string with “runoob”:

Example

#!/usr/bin/perl

$string = "welcome to google site.";
$string =~ s/google/runoob/;

print "$string\n";

The output result of executing the above program is:

welcome to runoob site.

Replace operation modifier

The replacement operation modifier is shown in the following table:

Modifier   Description
i          Makes the match case-insensitive, so that “a” and “A” are treated as the same.
m          By default “^” and “$” match only at the start and end of the whole string. With “m”, they match at the start and end of each line in the string.
o          The pattern is compiled only once.
s          By default “.” matches any character except a line break. With “s”, it matches any character, including a line break.
x          Whitespace in the pattern is ignored unless it is escaped.
g          Replaces all matches, not just the first.
e          Evaluates the replacement string as an expression.
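The effect of the g, i, and e modifiers on s/// can be sketched as follows. The sample strings and variable names are assumptions made for this example; each substitution works on a copy so that the original text stays available for comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative sample text.
my $text = "Google, google, GOOGLE";

# No modifiers: case-sensitive, and only the first match is replaced.
(my $first_only = $text) =~ s/google/runoob/;

# /g replaces every match; /i ignores case.
(my $all = $text) =~ s/google/runoob/gi;

# /e evaluates the replacement as Perl code: here each number is doubled.
(my $doubled = "3 apples and 5 pears") =~ s/(\d+)/$1 * 2/ge;

# s/// returns the number of substitutions it made.
my $copy  = $text;
my $count = ($copy =~ s/google/runoob/gi);

print "$first_only\n";   # Google, runoob, GOOGLE
print "$all\n";          # runoob, runoob, runoob
print "$doubled\n";      # 6 apples and 10 pears
print "$count\n";        # 3
```

The (my $copy = $original) =~ s/.../.../ idiom copies the string first and then modifies only the copy.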

Conversion operator

The following are the modifiers related to the conversion operator:

Modifier   Description
c          Complement the search list: transliterate all characters that are not specified
d          Delete all specified characters that have no replacement
s          Squeeze runs of identical output characters into one

The following example converts all lowercase letters in the variable $string to uppercase:

#!/usr/bin/perl

$string = 'welcome to runoob site.';
$string =~ tr/a-z/A-Z/;

print "$string\n";

The output result of executing the above program is:

WELCOME TO RUNOOB SITE.

The following example uses the /s modifier to squeeze runs of repeated characters in the variable $string:

Example

#!/usr/bin/perl

$string = 'runoob';
$string =~ tr/a-z/a-z/s;

print "$string\n";

The output result of executing the above program is:

runob

More examples:

$string =~ tr/0-9/ /c;    # Replace every non-digit character with a space (tr does not understand \d)
$string =~ tr/\t //d;     # Remove tabs and spaces
$string =~ tr/0-9/ /cs;   # Replace each run of non-digit characters with a single space
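The one-liners above can be checked with a small sketch such as this one. The sample string and variable names are assumptions; the sketch also relies on the fact that tr returns the number of characters it matched, which makes it usable as a counter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative sample: letters, digits, two spaces, and a tab.
my $string = "a1  b22\tc333";

# With an empty replacement list and no /d, tr leaves the string
# unchanged and simply returns how many characters matched.
my $digits = ($string =~ tr/0-9//);

# /d deletes the listed characters (here: space and tab).
(my $no_space = $string) =~ tr/ \t//d;

# /c complements the list (all non-digits), /s squeezes each run
# of replaced characters into a single space.
(my $squeezed = $string) =~ tr/0-9/ /cs;

print "$digits\n";        # 6
print "$no_space\n";      # a1b22c333
print "[$squeezed]\n";    # [ 1 22 333]
```

Note that tr works on literal characters and ranges only; it is not a regular expression, so classes such as \d or \s have no special meaning inside it.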

More regular expression rules

Expression   Description
.            Matches any character except a newline
x?           Matches x zero or one time
x*           Matches x zero or more times, as many times as possible
x+           Matches x one or more times, as many times as possible
.*           Matches any character zero or more times
.+           Matches any character one or more times
{m}          Matches the preceding item exactly m times
{m,n}        Matches the preceding item at least m and at most n times
{m,}         Matches the preceding item at least m times
[]           Matches any one of the characters inside []
[^]          Matches any one character that is not inside []
[0-9]        Matches any digit
[a-z]        Matches any lowercase letter
[^0-9]       Matches any non-digit character
[^a-z]       Matches any character that is not a lowercase letter
^            Matches at the beginning of the string
$            Matches at the end of the string (or just before a final newline)
\d           Matches a digit; for ASCII text, same as [0-9]
\d+          Matches one or more digits; same as [0-9]+
\D           Matches a non-digit; same as [^0-9]
\D+          Matches one or more non-digits; same as [^0-9]+
\w           Matches a letter, digit, or underscore; same as [a-zA-Z0-9_]
\w+          Same as [a-zA-Z0-9_]+
\W           Matches a character that is not a letter, digit, or underscore; same as [^a-zA-Z0-9_]
\W+          Same as [^a-zA-Z0-9_]+
\s           Matches a whitespace character; same as [ \n\t\r\f]
\s+          Same as [ \n\t\r\f]+
\S           Matches a non-whitespace character; same as [^ \n\t\r\f]
\S+          Same as [^ \n\t\r\f]+
\b           Matches at a word boundary, that is, between a \w character and a \W character or the edge of the string
\B           Matches at any position that is not a word boundary
a|b|c        Matches a, b, or c
abc          Matches the string abc
(pattern)    Parentheses remember the text they match, which is very useful. The text matched by the first () is available as $1 (or \1 inside the pattern), the text matched by the second () as $2 (or \2), and so on.
/pattern/i   The i modifier ignores case, so upper and lower case letters are treated as the same when matching.

To match a special character such as * literally inside a pattern, precede it with a backslash (\) so that it loses its special meaning.
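Several of the rules above (capture groups, the \1 backreference, escaping with a backslash, alternation, and \b) can be combined in a short sketch. The sample strings and variable names below are assumptions chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative date string in YYYY-MM-DD form.
my $date = "2023-10-20";

# Each pair of parentheses captures one part: $1, $2, $3.
my ($year, $month, $day) = ('', '', '');
if ($date =~ /^(\d{4})-(\d{2})-(\d{2})$/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# \1 inside the pattern refers back to the first group,
# so (\w)\1 finds a doubled character.
my $double = ("runoob" =~ /(\w)\1/) ? $1 : '';

# A backslash makes * a literal character instead of a quantifier.
my $has_star = ("2*3" =~ /2\*3/) ? 1 : 0;

# Alternation inside a group, limited to whole words by \b.
my $animal = ("a cat sat" =~ /\b(cat|dog)\b/) ? $1 : '';

print "year=$year month=$month day=$day\n";   # year=2023 month=10 day=20
print "doubled character: $double\n";         # doubled character: o
print "literal star found: $has_star\n";      # literal star found: 1
print "animal: $animal\n";                    # animal: cat
```

The capture variables are copied inside the if block because $1, $2, and so on only hold meaningful values after a successful match.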

More references

Perl Regular expressions: https://perldoc.perl.org/perlre#Regular-Expressions
