| Meta-Character | Description |
|---|---|
| ^ | This meta-character, the caret, matches the beginning of a string or, if the /m option is used, the beginning of a line. It is one of two pattern anchors; the other anchor is the $. |
| . | This meta-character matches any single character except the newline character. If the /s option is specified, the newline is matched as well. |
| $ | This meta-character matches the end of a string or, if the /m option is used, the end of a line. It is one of two pattern anchors; the other anchor is the ^. |
| \| | This meta-character, called alternation, lets you specify two values that can cause the match to succeed. For instance, m/a\|b/ means that the $_ variable must contain the "a" or "b" character for the match to succeed. |
| * | This meta-character indicates that the "thing" immediately to the left should be matched zero or more times in order to be evaluated as true (thus .* matches any number of characters). |
| + | This meta-character indicates that the "thing" immediately to the left should be matched one or more times in order to be evaluated as true. |
| ? | This meta-character indicates that the "thing" immediately to the left should be matched zero or one times to be evaluated as true. When used in conjunction with the *, +, ?, or {n,m} meta-characters, it means that the regular expression should be non-greedy and match the smallest possible string. |
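
The entries above can be tried out with a short script. This is only a minimal sketch: the sample strings and variable names are made up for illustration and do not come from any particular program.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $text = "first line\nsecond line";

# Without /m, ^ anchors only at the start of the whole string.
my @plain = $text =~ /^(\w+)/g;
# With /m, ^ also anchors at the start of each line.
my @multi = $text =~ /^(\w+)/mg;

# Without /s, the dot stops at the newline; with /s it crosses it.
my ($no_s)   = $text =~ /(.+)/;
my ($with_s) = $text =~ /(.+)/s;

# Alternation: the match succeeds if either side is found.
my $has_a_or_b = ("cab" =~ /a|b/) ? 1 : 0;

# Quantifiers: * allows zero repetitions, + needs at least one.
my $star = ("ac" =~ /ab*c/) ? 1 : 0;
my $plus = ("ac" =~ /ab+c/) ? 1 : 0;

# A ? after a quantifier makes it non-greedy.
my $html = "<b>bold</b>";
my ($greedy) = $html =~ /<(.+)>/;    # grabs as much as possible
my ($lazy)   = $html =~ /<(.+?)>/;   # grabs as little as possible

print "plain:  @plain\n";
print "multi:  @multi\n";
print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

With these inputs, the greedy capture runs to the last `>` in the string, while the non-greedy one stops at the first.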

| Meta-Brackets | Description |
|---|---|
| () | The parentheses let you affect the order of pattern evaluation and act as a form of pattern memory. See the "Special Variables" chapter for more details. |
| (?...) | If a question mark immediately follows the left parenthesis, it indicates that an extended mode component is being specified; this is new to Perl 5. |
| (?#comment) | Extension: comment is any text. |
| (?:regx) | Extension: regx is any regular expression but () are not saved as a backreference. |
| (?=regx) | Extension: Allows matching of zero-width positive lookahead characters (that is, the regular expression is matched but not returned as being matched). |
| (?!regx) | Extension: Allows matching of zero-width negative lookahead characters (that is, the negated form of (?=regx)). |
| (?options) | Extension: Applies the specified options to the pattern, bypassing the need for the options to be specified in the normal way. Valid options are: i (case insensitive), m (treat as multiple lines), s (treat as single line), and x (allow whitespace and comments). |
| {n,m} | Braces let you specify how many times the "thing" immediately to the left should be matched. {n} means that it should be matched exactly n times. {n,} means it must be matched at least n times. {n,m} means that it must be matched at least n times but not more than m times. Write the braces without spaces: Perl versions before 5.34 treat a form such as {n, m} as literal text. |
| [] | Square brackets let you create a character class. For instance, m/[abc]/ evaluates to True if any of "a", "b", or "c" is contained in $_. The square brackets are a more readable alternative to the alternation meta-character. |
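
The bracket constructs above can be sketched in a few lines. The strings and variable names here are hypothetical, picked only to make each construct visible.

```perl
use strict;
use warnings;

# Capturing parentheses: each group is remembered in order.
my $date = "2008-01-16";
my ($year, $month, $day) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/;

# (?:...) groups without saving a backreference,
# so the digits are the first (and only) captured group.
my ($num) = "item42" =~ /(?:item|part)(\d+)/;

# (?=...) checks what follows without making it part of the match.
my ($amount) = "100USD" =~ /(\d+)(?=USD)/;

# (?!...) succeeds only if what follows does NOT match.
my $foobaz = ("foobaz" =~ /foo(?!bar)/) ? 1 : 0;
my $foobar = ("foobar" =~ /foo(?!bar)/) ? 1 : 0;

# (?i) switches on case-insensitive matching inside the pattern.
my $ci = ("PERL" =~ /(?i)perl/) ? 1 : 0;

# {2,3} accepts runs of two or three "a" characters only.
my @runs = "a aa aaa aaaa" =~ /\b(a{2,3})\b/g;

# A character class matches any one of the listed characters.
my $has_abc = ("xyzb" =~ /[abc]/) ? 1 : 0;

print "date parts: $year $month $day\n";
print "num: $num, amount: $amount\n";
print "runs: @runs\n";
```

The `\b` anchors in the `{2,3}` example keep the pattern from matching part of a longer run.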

| Meta-Sequences | Description |
|---|---|
| \ | This meta-character "escapes" the character which follows. This means that any special meaning normally attached to that character is ignored. For instance, if you need to include a dollar sign in a pattern, you must use \$ to avoid Perl's variable interpolation. Use \\ to specify the backslash character in your pattern. |
| \nnn | Any octal byte where nnn represents the octal number; this allows any character to be specified by its octal number. |
| \a | The alarm character; this is a special character which, when printed, produces a warning bell sound. |
| \A | This meta-sequence represents the beginning of the string. Its meaning is not affected by the /m option. |
| \b | This meta-sequence represents the backspace character inside a character class; otherwise, it represents a word boundary. A word boundary is the spot between word (\w) and non-word (\W) characters. Perl treats the imaginary characters just off either end of the string as if they matched \W, so a word at the very start or end of the string still has a boundary. |
| \B | Match a non-word boundary. |
| \cn | Any control character where n is the character (for example, \cY for Ctrl+Y). |
| \d | Match a single digit character. |
| \D | Match a single non-digit character. |
| \e | The escape character. |
| \E | Terminate the \L or \U sequence. |
| \f | The form feed character. |
| \G | Match only where the previous m//g left off. |
| \l | Change the next character to lowercase. |
| \L | Change the following characters to lowercase until a \E sequence is encountered. |
| \n | The newline character. |
| \Q | Quote regular expression meta-characters literally until the \E sequence is encountered. |
| \r | The carriage return character. |
| \s | Match a single whitespace character. |
| \S | Match a single non-whitespace character. |
| \t | The tab character. |
| \u | Change the next character to uppercase. |
| \U | Change the following characters to uppercase until a \E sequence is encountered. |
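| \v | In a pattern, matches a single vertical whitespace character (Perl 5.10 and later). Perl has no \v escape for the vertical tab character itself; use \x0B or \cK instead. |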
| \w | Match a single word character. Word characters are the alphanumeric and underscore characters. |
| \W | Match a single non-word character. |
| \xnn | Any hexadecimal byte. |
| \Z | This meta-sequence represents the end of the string, or the position just before a newline at the end of the string. Its meaning is not affected by the /m option. |
| \$ | The dollar character. |
| \@ | The at sign (@) character. |
| \% | The percent character. |
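
A handful of the meta-sequences above, shown in use. This is a sketch under made-up inputs: the sample strings and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# \$ and \@ keep Perl from interpolating variables in the string.
my $line = "Order 66 costs \$5.00 at shop\@example.com";

# \d matches a digit; with /g every run of digits is returned.
my @numbers = $line =~ /(\d+)/g;

# \b keeps "cat" from matching inside "catalog".
my $count = () = "cat catalog cat" =~ /\bcat\b/g;

# \Q...\E quotes meta-characters in an interpolated variable.
my $price = '$5.00';
my $found = ($line =~ /\Q$price\E/) ? 1 : 0;

# \U, \L, \u and \l change case inside double-quoted strings.
my $name  = "perl";
my $upper = "\U$name\E rocks";
my $cap   = "\u$name";
my $lower = "\LHELLO\E World";
my $first = "\lPERL";

# \A ignores /m, while ^ follows it.
my $two   = "one\ntwo";
my $caret = ($two =~ /^two/m)  ? 1 : 0;
my $start = ($two =~ /\Atwo/m) ? 1 : 0;

# \Z still matches when a newline ends the string.
my $at_end = ("end\n" =~ /end\Z/) ? 1 : 0;

# \G resumes exactly where the previous m//g match stopped,
# so the loop ends at the first field that is not all digits.
my $csv = "12,34,x56";
my @lead;
while ($csv =~ /\G(\d+),?/g) {
    push @lead, $1;
}

print "numbers: @numbers\n";
print "cat count: $count\n";
print "$upper / $cap / $lower / $first\n";
print "leading fields: @lead\n";
```

In the `\G` loop, the pattern cannot skip ahead to a later match, which is why the digits in the last field are not collected.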
Posted by Technology Freak on Wednesday, January 16, 2008 at 1:43 AM
Labels: Meta-Brackets, Meta-Characters, Meta-Sequences, Regular Expressions