Regular expression characters and usage options

Relevant for: GUI tests and components and API testing

This section describes some of the more common options that can be used to create regular expressions:

Backslash Character ( \ )

A backslash (\) can serve two purposes. It can be used in conjunction with a special character to indicate that the next character be treated as a literal character. For example, \. would be treated as period (.) instead of a wildcard. Alternatively, if the backslash (\) is used in conjunction with some characters that would otherwise be treated as literal characters, such as the letters n, t, w, or d, the combination indicates a special character. For example, \n stands for the newline character.

If the backslash character is not used for either of these purposes, it is ignored.

For example:

  • w matches the character w

  • \w is a special character that matches any word character including underscore

  • \\ matches the literal character \

  • \( matches the literal character (

  • on\etwo matches the string onetwo

For example, if you were looking for a Web site called www.advantageonlineshopping.com, each period would be interpreted as a wildcard that matches any single character. To indicate that the periods are literal characters, enter the address as www\.advantageonlineshopping\.com
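The effect of escaping the period can be sketched outside UFT One. The snippet below uses Python's re module only as a stand-in engine (an assumption; its behavior may differ in details from the engine UFT One uses), and the second candidate string is a made-up value for illustration.

```python
import re

# Hypothetical candidate strings; only the first is the real host name.
candidates = [
    "www.advantageonlineshopping.com",
    "wwwXadvantageonlineshoppingYcom",
]

# Unescaped periods act as wildcards; escaped periods are literal.
unescaped = re.compile(r"www.advantageonlineshopping.com")
escaped = re.compile(r"www\.advantageonlineshopping\.com")

unescaped_hits = [s for s in candidates if unescaped.fullmatch(s)]
escaped_hits = [s for s in candidates if escaped.fullmatch(s)]

# \( and \\ match a literal parenthesis and a literal backslash.
literal_paren = bool(re.fullmatch(r"\(", "("))
literal_backslash = bool(re.fullmatch(r"\\", "\\"))

print(unescaped_hits)  # both candidates
print(escaped_hits)    # only the real host name
```

The unescaped pattern accepts both strings, which is why the escaped form is the safer choice when the value contains literal periods.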

Matching Any Single Character ( . )

A period (.) instructs UFT One to search for any single character (except for \n).

For example: welcome. matches welcomes, welcomed, or welcome followed by a space or any other single character. A series of periods indicates the same number of unspecified characters.

To match any single character including \n, enter (.|\n)
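As a rough illustration of the period wildcard, the following sketch uses Python's re module as a stand-in engine (an assumption, not UFT One itself). The sample strings are invented for the example.

```python
import re

samples = ["welcomes", "welcomed", "welcome ", "welcome", "welcome\n"]

# The period requires exactly one character, and that character cannot be \n.
dot_pattern = re.compile(r"welcome.")
dot_hits = [s for s in samples if dot_pattern.fullmatch(s)]

# (.|\n) accepts any single character, including the newline.
any_pattern = re.compile(r"welcome(.|\n)")
any_hits = [s for s in samples if any_pattern.fullmatch(s)]

print(dot_hits)
print(any_hits)
```

The string welcome on its own is rejected by both patterns because no character follows it.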

Matching Any Single Character in a List ( [xy] )

Square brackets instruct UFT One to search for any single character within a list of characters.

For example, to search for the year 1967, 1968, or 1969, enter 196[789]

Matching Any Single Character Not in a List ( [^xy] )

When a caret (^) is the first character inside square brackets, it instructs UFT One to match any single character except those specified in the list.

For example, [^ab] matches any character except a or b.

The caret has this special meaning only when it appears first within the brackets.
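The list and negated-list forms can be tried out with a short sketch. Python's re module is assumed here as a stand-in engine, and the input strings are hypothetical.

```python
import re

years = ["1966", "1967", "1968", "1969", "1970"]

# [789] accepts exactly one of the listed characters in the last position.
listed_years = [y for y in years if re.fullmatch(r"196[789]", y)]

# [^ab] accepts any single character other than a or b.
not_a_or_b = re.findall(r"[^ab]", "abcab-d")

print(listed_years)
print(not_a_or_b)
```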

Matching Any Single Character within a Range ( [x-y] )

To match a single character within a range, you can use square brackets ([ ]) with the hyphen (-) character.

For instance, to match any year in the 1960s, enter 196[0-9]

A hyphen does not signify a range if it appears as the first or last character within brackets, or directly after a caret (^).

For example, [-a-z] matches a hyphen or any lowercase letter.

Within brackets, the characters ".", "*", "[" and "\" are literal. For example, [.*] matches . or *. If the right bracket is the first character in the range, it is also literal.
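A minimal sketch of ranges and of literal characters inside brackets, assuming Python's re module as a stand-in engine (details may differ in UFT One). The test strings are made up for illustration.

```python
import re

years = ["1959", "1960", "1965", "1969", "1970"]

# [0-9] accepts any single digit, so this covers 1960 through 1969.
sixties = [y for y in years if re.fullmatch(r"196[0-9]", y)]

# A leading hyphen is literal: the class is "hyphen or lowercase letter".
hyphen_or_lower = re.findall(r"[-a-z]", "A-b_C-d")

# Inside brackets, . and * lose their special meaning.
dot_or_star = re.findall(r"[.*]", "a.b*c")

print(sixties)
print(hyphen_or_lower)
print(dot_or_star)
```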

Matching Zero or More Specific Characters ( * )

An asterisk (*) instructs UFT One to match zero or more occurrences of the preceding character.

For example, ca*r matches car, caaaaaar, and cr.

Matching One or More Specific Characters ( + )

A plus sign (+) instructs UFT One to match one or more occurrences of the preceding character.

For example, ca+r matches car and caaaaaar, but not cr.
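The difference between the asterisk and the plus sign shows up only when the preceding character is absent. The sketch below assumes Python's re module as a stand-in engine; the word list is hypothetical.

```python
import re

words = ["cr", "car", "caaaaaar", "cat"]

# a* allows zero or more a characters between c and r.
star_hits = [w for w in words if re.fullmatch(r"ca*r", w)]

# a+ requires at least one a between c and r.
plus_hits = [w for w in words if re.fullmatch(r"ca+r", w)]

print(star_hits)
print(plus_hits)
```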

Grouping Regular Expressions ( ( ) )

Parentheses (()) instruct UFT One to treat the contained sequence as a unit, just as in mathematics and programming languages.

Using groups is especially useful for delimiting the arguments to an alternation operator ( | ) or a repetition operator: ( * , + , ? , { } )

Matching One of Several Regular Expressions ( | )

A vertical line (|) instructs UFT One to match one of a choice of expressions.

For example, foo|bar causes UFT One to match either foo or bar. By contrast, fo(o|b)ar causes UFT One to match either fooar or fobar
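How grouping changes the scope of the alternation can be checked with a small sketch, assuming Python's re module as a stand-in engine. The extra word fobr is a made-up negative case.

```python
import re

words = ["foo", "bar", "fooar", "fobar", "fobr"]

# Without parentheses, | separates the two whole expressions foo and bar.
ungrouped_hits = [w for w in words if re.fullmatch(r"foo|bar", w)]

# With parentheses, | chooses only between the single characters o and b.
grouped_hits = [w for w in words if re.fullmatch(r"fo(o|b)ar", w)]

print(ungrouped_hits)
print(grouped_hits)
```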

Matching the Beginning of a Line ( ^ )

A caret (^) instructs UFT One to match the expression only at the start of a line, or after a newline character.

For example, book matches book within the lines "book", "my book", and "book list", while ^book matches book only in the lines "book" and "book list".

Matching the End of a Line ( $ )

A dollar sign ($) instructs UFT One to match the expression only at the end of a line.

For example, book matches book within the lines "my book" and "book list", while a string that is followed by (\n), (\r), or ($) matches only lines ending in that string.

For example, book$ matches book only in the line "my book".
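The two anchors can be compared side by side in a short sketch. It assumes Python's re module as a stand-in engine; note that in Python the caret matches after a newline only when the re.MULTILINE flag is set, which may not mirror how UFT One is configured.

```python
import re

lines = ["book", "my book", "book list"]

# No anchor: book may appear anywhere in the line.
anywhere = [line for line in lines if re.search(r"book", line)]

# ^ restricts the match to the start of the line.
at_start = [line for line in lines if re.search(r"^book", line)]

# $ restricts the match to the end of the line.
at_end = [line for line in lines if re.search(r"book$", line)]

# With re.MULTILINE, ^ also matches right after a newline character.
after_newline = re.findall(r"^book", "my book\nbook list", re.MULTILINE)

print(anywhere)
print(at_start)
print(at_end)
print(after_newline)
```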

Matching a Newline or Carriage Return Character ( \n ) or ( \r )

\n or \r instruct UFT One to match the expression only when followed by a newline or carriage return character.

  • \n instructs UFT One to match any newline characters.

  • \r instructs UFT One to match any carriage return characters.

For example, book matches book within the lines "my book" and "book list".

A string that is followed by (\n) or (\r) matches only lines that are followed by a newline or carriage return character.

For example, book\r matches book only when book is followed by a carriage return
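A brief sketch of matching line terminators explicitly, assuming Python's re module as a stand-in engine. The two text values are hypothetical and differ only in their line endings.

```python
import re

# Same content, different line endings.
crlf_text = "my book\r\nbook list"
lf_text = "my book\nbook list"

# book\r matches only where a carriage return follows book.
cr_in_crlf = bool(re.search(r"book\r", crlf_text))
cr_in_lf = bool(re.search(r"book\r", lf_text))

# book\n matches only where a newline directly follows book.
lf_in_lf = bool(re.search(r"book\n", lf_text))
lf_in_crlf = bool(re.search(r"book\n", crlf_text))

print(cr_in_crlf, cr_in_lf, lf_in_lf, lf_in_crlf)
```

In the CRLF text the newline follows the carriage return rather than book itself, so book\n does not match there.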

Matching Any AlphaNumeric Character Including the Underscore ( \w )

\w instructs UFT One to match any alphanumeric character and the underscore (A-Z, a-z, 0-9, _).

For example, \w* causes UFT One to match zero or more occurrences of the alphanumeric characters (A-Z, a-z, 0-9) and the underscore (_). It matches Ab, r9Cj, or 12_uYLgeu_435.

Similarly, \w{3} causes UFT One to match exactly 3 occurrences of the alphanumeric characters (A-Z, a-z, 0-9) and the underscore (_). It matches Ab4, r9_, or z_M.
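The \w examples can be reproduced in a sketch that assumes Python's re module as a stand-in engine. The re.ASCII flag is used so that \w is limited to A-Z, a-z, 0-9, and the underscore, as described here; without it Python also accepts non-ASCII letters. The strings a-b, Ab, and Ab4x are made-up negative cases.

```python
import re

# \w* accepts any run of word characters, including an empty one.
star_samples = ["Ab", "r9Cj", "12_uYLgeu_435", "a-b"]
word_runs = [s for s in star_samples
             if re.fullmatch(r"\w*", s, re.ASCII)]

# \w{3} accepts exactly three word characters.
count_samples = ["Ab4", "r9_", "z_M", "Ab", "Ab4x"]
exactly_three = [s for s in count_samples
                 if re.fullmatch(r"\w{3}", s, re.ASCII)]

print(word_runs)
print(exactly_three)
```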

Matching Any Non-AlphaNumeric Character ( \W )

\W instructs UFT One to match any character other than alphanumeric characters and underscores.

For example, \W matches &, *, ^, %, $, and #

Matching a Decimal Digit ( \d )

\d instructs UFT One to match any decimal digit.

For example, \d matches 1, 2, 4, and 5

Matching a Non-Digit Character ( \D )

\D instructs UFT One to match any character other than a decimal digit.

For example, \D matches a, T, =, and +
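The remaining character classes can be sketched the same way, again assuming Python's re module as a stand-in engine with re.ASCII so that the classes cover only ASCII characters. The input strings are hypothetical.

```python
import re

# \W picks out characters that are neither alphanumeric nor underscore.
non_word = re.findall(r"\W", "a&b*c^d_e", re.ASCII)

# \d picks out decimal digits.
digits = re.findall(r"\d", "a1b2c45", re.ASCII)

# \D picks out everything that is not a decimal digit.
non_digits = re.findall(r"\D", "aT=+12", re.ASCII)

print(non_word)
print(digits)
print(non_digits)
```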

Combining Regular Expression Operators

You can combine regular expression operators in a single expression to achieve the exact search criteria you need.

For example, you can combine the '.' and '*' characters to find zero or more occurrences of any character (except \n).

For instance, start.* matches start, started, starting, and starter.

You can use a combination of brackets and an asterisk to limit the search to a combination of alphabetic characters. For example, [a-zA-Z]* matches any sequence of uppercase and lowercase letters.
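Both combinations can be exercised in a short sketch, assuming Python's re module as a stand-in engine. The strings restart and abc1 are invented negative cases.

```python
import re

# start.* : the literal text start followed by any characters except \n.
words = ["start", "started", "starting", "starter", "restart"]
start_hits = [w for w in words if re.fullmatch(r"start.*", w)]

# [a-zA-Z]* : zero or more letters, so the empty string also qualifies.
tokens = ["abc", "ABCdef", "", "abc1"]
letter_hits = [t for t in tokens if re.fullmatch(r"[a-zA-Z]*", t)]

print(start_hits)
print(letter_hits)
```

Because the asterisk allows zero occurrences, a pattern such as [a-zA-Z]* also accepts an empty value; use + instead if at least one letter is required.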

To match any number between 0 and 1200, you need to match numbers with 1, 2, or 3 digits (0-999), 4-digit numbers between 1000 and 1199, and the number 1200 itself.

This regular expression matches any number between 0 and 1200: ([0-9]?[0-9]?[0-9]|1[01][0-9][0-9]|1200)
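The range expression can be verified exhaustively with a sketch that assumes Python's re module as a stand-in engine. The out-of-range values tested are arbitrary examples.

```python
import re

pattern = re.compile(r"([0-9]?[0-9]?[0-9]|1[01][0-9][0-9]|1200)")

# Every integer from 0 through 1200 should match in full.
all_in_range = all(pattern.fullmatch(str(n)) for n in range(0, 1201))

# Values above 1200 should not match in full.
accepted_out_of_range = [n for n in (1201, 1300, 2000, 12000)
                         if pattern.fullmatch(str(n))]

# A substring search still finds 999 inside 9999.
substring_found = bool(pattern.search("9999"))

print(all_in_range)
print(accepted_out_of_range)
print(substring_found)
```

The last check shows that the expression constrains the value only when the whole string must match; where the engine searches for substrings, anchoring the expression with ^ and $ is one way to get the same effect.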

Note: For more details, see VBScript Reference and the cplusplus.com ECMAScript syntax.