
Searching for Phrases and Words

MQL supports searching for a phrase or a word using the following text matchers:

Matcher                     Example                     Description
Word                        shipment                    Search for the word "shipment" in a transcript
Quoted Term                 "problem with a shipment"   Search for the exact phrase "problem with a shipment" in a transcript
Simple (wildcard) pattern   ship*                       Search for words that begin with "ship", such as "shipment" and "shipping", as well as "ship" itself
Regex pattern               R"ship(ment|ping)"          Search for words matching the regular expression; this example matches "shipping" and "shipment"

Note: with the regular expression R"ship(ment|ping)", "ship" doesn't have to be at the beginning of the word, i.e. it will match the word pre-shipment as well. To enforce a match on word boundaries, add \b (word boundary) to the regular expression, for example, R"\bship(ment|ping)\b".
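The effect of \b can be tried out with Python's re module. This is only a sketch: it assumes MQL regular expressions behave like Python's for this pattern, and the word list is made up for illustration. In Python's re a hyphen is a non-word character, so the sketch uses unhyphenated words such as "reshipment" to show the difference.

```python
import re

# Same pattern as in the table, without and with word boundaries.
# re.IGNORECASE is used only to keep the sketch case-insensitive.
loose = re.compile(r"ship(ment|ping)", re.IGNORECASE)
strict = re.compile(r"\bship(ment|ping)\b", re.IGNORECASE)

# Hypothetical sample words, not taken from a real transcript.
words = ["shipment", "shipping", "reshipment", "shipments", "ship"]

# The unanchored pattern matches anywhere inside a word.
loose_hits = [w for w in words if loose.search(w)]

# The \b version matches only when "shipment" or "shipping" is a whole word.
strict_hits = [w for w in words if strict.search(w)]

print(loose_hits)
print(strict_hits)
```

Under these assumptions the unanchored pattern also hits "reshipment" and "shipments", while the \b version keeps only the two whole words.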

Word and Quoted Term matches

A Quoted Term matcher requires the text to match the search term literally. This is best demonstrated by the following examples.

"cancel order"

This Quoted Term expression matches the phrase "cancel order", but not "cancel my order".

To work around this limitation, you can use the OR operator to list all the variants of a phrase, for example:

"cancel order" OR "cancel my order" OR "cancel this order"

cancel ONEAR:5 order

This expression consists of two Word matchers (the words cancel and order), with a proximity operator ONEAR:5 between them.

The ONEAR:5 operator instructs the search engine to find the word "cancel" followed by the word "order", with a distance of no more than 5 words between them.

Such an expression matches phrases "cancel order", "cancel my order", "cancel my recent order", etc.

Note 1. The operator ONEAR is order-dependent, i.e. the word "cancel" must appear in a transcript before the word "order". There is an alternative operator NEAR that is order-independent.

Note 2. The operator ONEAR:5 can be omitted because it is the default operator in MQL expressions, i.e. the expression cancel order is the same as cancel ONEAR:5 order.

cancel NEAR:5 order

This expression uses the order-independent operator NEAR, which instructs the search engine to find the words "cancel" and "order" when they appear close to each other in a transcript (at a distance of up to 5 words). The order in which the words appear does not matter.

Such an expression matches "cancel my order" as well as "order that I want to cancel", where words appear in reverse order.
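The difference between ONEAR and NEAR can be sketched in Python. This is an illustration, not MiaRec's implementation: the tokenizer, the function names onear and near, and the definition of distance as the difference between word positions are all assumptions made for this example.

```python
import re

def tokenize(text):
    # Assumption: lowercase the text and split it into simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())

def positions(tokens, word):
    # All indexes at which the word occurs.
    return [i for i, token in enumerate(tokens) if token == word.lower()]

def onear(text, first, second, max_distance=5):
    # Ordered proximity: "first" must come before "second",
    # at most max_distance word positions apart (assumed definition).
    tokens = tokenize(text)
    return any(
        0 < j - i <= max_distance
        for i in positions(tokens, first)
        for j in positions(tokens, second)
    )

def near(text, a, b, max_distance=5):
    # Unordered proximity: either order is accepted.
    return onear(text, a, b, max_distance) or onear(text, b, a, max_distance)

print(onear("cancel my recent order", "cancel", "order"))
print(onear("order that I want to cancel", "cancel", "order"))
print(near("order that I want to cancel", "cancel", "order"))
```

With these definitions, the ordered check accepts "cancel my recent order" and rejects the reversed phrase, while the unordered check accepts both.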

Case insensitivity

Both Word and Quoted Term matchers are case-insensitive, e.g. the expression order will match "order", "Order", "ORDER" and "oRDER".

Escaping a quote character

To search for a literal quote character ("), double it inside the Quoted Term, i.e. write "".

Example:

"foo "" bar"

This expression matches the phrase foo " bar.

Note, the double "" is supported in a Quoted Term only. It is a syntax error to use a quote inside a Word matcher, like foo""bar, but it is ok to use it in a Quoted Term matcher, like "foo""bar".
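When MQL expressions are generated by a script, the doubling rule can be applied automatically. The helper below is a hypothetical sketch (quoted_term is not part of MiaRec); it only builds the expression text and assumes the quote character is the only character that needs escaping inside a Quoted Term.

```python
def quoted_term(phrase):
    # Wrap a literal phrase in quotes, doubling any embedded quote characters.
    return '"' + phrase.replace('"', '""') + '"'

print(quoted_term('foo " bar'))
print(quoted_term("cancel my order"))
```

The first call produces the same expression as the example above; phrases without quotes are simply wrapped.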

Simple (wildcard) pattern

A Word matcher supports wildcard pattern matching.

The following table describes wildcard patterns, listing the pattern and its use.

Pattern   Use                                                   Example
*         Match zero or more characters                         bl* matches bl, black, blue, and blob
?         Match exactly one occurrence of any character         h?t matches hot, hat, and hit
[abc]     Match one occurrence of the character a, b, or c      h[oa]t matches hot and hat, but not hit
[!az]     Match one occurrence of any character except a or z   h[!oa]t matches hit, but not hot or hat
[a-c]     Match one occurrence of a character between a and c   c[a-c]t matches cat and cbt, but not cut
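The wildcard syntax in the table coincides with the shell-style patterns supported by Python's fnmatch module, so the examples can be checked with it. This is a sketch under the assumption that MQL wildcards behave the same way; wildcard_match is a hypothetical helper, and lowercasing both sides is only a stand-in for case-insensitive matching.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, word):
    # Compare a single word against a wildcard pattern, ignoring case.
    return fnmatchcase(word.lower(), pattern.lower())

# Examples from the table above.
print([w for w in ["bl", "black", "blue", "blob", "bold"] if wildcard_match("bl*", w)])
print([w for w in ["hot", "hat", "hit"] if wildcard_match("h[oa]t", w)])
print([w for w in ["hot", "hat", "hit"] if wildcard_match("h[!oa]t", w)])
print([w for w in ["cat", "cbt", "cut"] if wildcard_match("c[a-c]t", w)])
```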

Note

The wildcard patterns are supported in a Word matcher only. A Quoted Term interprets those symbols literally. For example bl* is a pattern match, but "bl*" is the exact match.

Such a difference between Word and Quoted Term matchers is useful when you need to search for one of the wildcard symbols literally in a text.

For example, to find an exclamation point in a text, use a Quoted Term expression, like "Great!"

Regular expression (REGEX) pattern

To match complex text patterns, use Regular expressions (REGEX).

The regular expression must be enclosed between R" and ". Examples:

Pattern              Use
R"[0-9]+"            Match one or more digits in a text
R"ship(ment|ping)"   Match the words "shipment" and "shipping"

To match a quote (") character in a regular expression, include it twice, like R"foo""bar".
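The R"..." wrapping and the quote-doubling rule can be illustrated with two small helpers. Both function names are hypothetical and the sketch covers only the textual form of the literal, assuming that a doubled quote is the only escape applied at this level.

```python
import re

def mql_regex_literal(pattern):
    # Wrap a raw regular expression as R"...", doubling embedded quotes.
    return 'R"' + pattern.replace('"', '""') + '"'

def regex_body(literal):
    # Reverse step: strip the R" and " delimiters and undo the doubling.
    if len(literal) < 3 or not literal.startswith('R"') or not literal.endswith('"'):
        raise ValueError("not an R\"...\" literal: " + literal)
    return literal[2:-1].replace('""', '"')

print(mql_regex_literal('foo"bar'))
print(regex_body('R"foo""bar"'))

# The recovered body can be tried out with Python's re module.
print(bool(re.fullmatch(regex_body('R"[0-9]+"'), "2024")))
```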

MiaRec supports standard regular expression patterns.

A regular expression may use any of the following metacharacters:

.

Matches any single character. For example:

a.c

... will match "abc", but not "ac" or "abbc"

[]

A bracket expression. Matches a single character that is contained within the brackets. For example:

[abc]

... will match "a", "b" or "c".

[hc]at

... will match "hat" and "cat".

A - character between two other characters forms a range that matches all characters from the first character to the second. For example:

[0-9]

... will match any decimal digit.

[a-z]

... will match any lowercase letter from "a" to "z".

These forms can be mixed:

[abcx-z]

... will match "a", "b", "c", "x", "y" or "z".

To include a literal - character, it must be written first or last, for example, [abc-], [-abc].

To include a literal ] character, it must immediately follow the opening bracket [, for example, []abc].

[^ ]

Matches a single character that is not contained within the brackets. For example:

[^abc]

... will match any character other than "a", "b", or "c".

[^a-z]

... will match any single character that is not a lowercase letter from "a" to "z".

As above, literal characters and ranges can be mixed, like [^abcx-z]
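The dot and bracket expressions above can be checked with Python's re module, which follows the same conventions for these constructs. This is a sketch: Python is used as a stand-in for the MiaRec engine, and matches_whole is a hypothetical helper that tests whether the entire string matches.

```python
import re

def matches_whole(pattern, text):
    # True when the whole string matches the pattern.
    return re.fullmatch(pattern, text) is not None

# "." matches exactly one character.
print(matches_whole(r"a.c", "abc"), matches_whole(r"a.c", "ac"))

# Bracket expressions, ranges, and mixed forms.
print([c for c in "abcdxyz" if matches_whole(r"[abcx-z]", c)])

# Literal "-" written last, literal "]" written first.
print(matches_whole(r"[abc-]", "-"), matches_whole(r"[]abc]", "]"))

# Negated bracket expression.
print(matches_whole(r"[^a-z]", "7"), matches_whole(r"[^a-z]", "q"))
```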

*

Matches the preceding element zero or more times. For example:

ab*c

... will match "ac", "abc", "abbbc" etc.

[0-9]*

... will match "" (empty string), "0", "1", "2", "14", "502", "98541654", and so on (any combination of digits).

( )*

Matches zero or more instances of the character sequence specified inside the parentheses. For example:

(ab)*

... will match "", "ab", "abab", "ababab", and so on.

(1234)*

... will match "", "1234", "12341234", "123412341234", and so on.

+

Matches the preceding element one or more times. For example:

ba+

... will match "ba", "baa", "baaa", and so on.

0[0-9]+

... will match "00", "01", "02", "001", "01234", "09876543210", or any other sequence of digits with a leading 0 and a minimum length of two characters.

?

Matches the preceding element zero or one time. For example:

ba?

... will match "b", or "ba", but not "baa"

0[0-9]?

... will match "0", "01", "02", "03", and so on.
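The quantifiers *, + and ? can be compared side by side in Python's re module. This is a sketch that assumes the MiaRec engine treats them the same way; full_matches is a hypothetical helper that keeps only the candidates matched in full.

```python
import re

def full_matches(pattern, candidates):
    # Keep only the candidates that the pattern matches from start to end.
    return [c for c in candidates if re.fullmatch(pattern, c)]

# "*": zero or more of the preceding element.
print(full_matches(r"ab*c", ["ac", "abc", "abbbc", "abd"]))

# "( )*": zero or more repetitions of a whole group.
print(full_matches(r"(ab)*", ["", "ab", "abab", "aba"]))

# "+": one or more of the preceding element.
print(full_matches(r"ba+", ["b", "ba", "baaa"]))
print(full_matches(r"0[0-9]+", ["0", "00", "01234", "12"]))

# "?": zero or one of the preceding element.
print(full_matches(r"ba?", ["b", "ba", "baa"]))
```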

|

The choice (aka alternation or set union) operator matches either the expression before or the expression after the operator. For example:

abc|def

... will match "abc" or "def".

(0|011)[1-9]+

... will match a phone number that starts with either 0 or 011, followed by one or more digits from 1 to 9.

{n}

Matches the preceding element exactly n times. For example:

a{3}

... will match "aaa", but not "a", "aa" or "aaaa"

[0-9]{5}

... will match "01234", "56789", or any other sequence of digits that is exactly 5 characters long.

{m,n}

Matches the preceding element at least m and not more than n times. For example:

a{3,5}

... will match "aaa", "aaaa", "aaaaa", but not "aa" or "aaaaaaaa".

{m,}

Matches the preceding element at least m times. For example:

a{2,}

... will match "aa", "aaa", "aaaa", and so on.
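Alternation and the counted repetitions can be verified the same way. The sketch below uses Python's re module as a stand-in for the MiaRec engine, and the sample phone-like strings are invented for illustration.

```python
import re

def is_full_match(pattern, text):
    # True when the whole string matches the pattern.
    return re.fullmatch(pattern, text) is not None

# Alternation: either side of "|".
print(is_full_match(r"abc|def", "def"))
print(is_full_match(r"(0|011)[1-9]+", "0115551234"))

# Exact count, bounded range, and open-ended minimum.
print([n for n in range(1, 9) if is_full_match(r"a{3}", "a" * n)])
print([n for n in range(1, 9) if is_full_match(r"a{3,5}", "a" * n)])
print([n for n in range(1, 9) if is_full_match(r"a{2,}", "a" * n)])
```

Each list shows the lengths of "a" runs accepted by the corresponding pattern, which makes the three count forms easy to compare.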

^

Matches the beginning of a string. For example:

^[hc]at

... will match "hat" and "cat", but only at the beginning of the string

$

Matches the end of a string. For example:

[hc]at$

... will match "hat" and "cat", but only at the end of the string

^[hc]at$

... will match "hat" and "cat", but only when the string contains no other characters
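The anchors can be demonstrated with re.search in Python, which looks for the pattern anywhere in the string unless an anchor restricts it. This is a sketch: whether MQL applies ^ and $ to a single word or to a longer piece of text is not specified here, so the sample strings are assumptions.

```python
import re

def found(pattern, text):
    # True when the pattern occurs somewhere in the text.
    return re.search(pattern, text) is not None

# "^" pins the match to the start of the string.
print(found(r"^[hc]at", "hat trick"), found(r"^[hc]at", "that cat"))

# "$" pins the match to the end of the string.
print(found(r"[hc]at$", "black cat"), found(r"[hc]at$", "cat food"))

# Both anchors together allow no other characters.
print(found(r"^[hc]at$", "hat"), found(r"^[hc]at$", "hats"))
```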

\

The backslash (\) character is used to escape metacharacters. For example:

1+2

... will match "12", "112", "11112", but not "1+2", because the plus character (+) has a special meaning (see above).

1\+2

... will match exactly "1+2". In this example, "plus" character is escaped with backslash character (\+).
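The effect of escaping can be reproduced in Python, where re.escape also adds the backslashes automatically. This is a sketch that assumes the MiaRec engine escapes metacharacters the same way; re.escape is a Python function, not an MQL feature.

```python
import re

unescaped = r"1+2"   # "+" is a quantifier here
escaped = r"1\+2"    # "\+" is a literal plus sign

# Without escaping, the pattern matches runs of "1" followed by "2".
print([s for s in ["12", "112", "1+2"] if re.fullmatch(unescaped, s)])

# With escaping, only the literal text "1+2" matches.
print([s for s in ["12", "112", "1+2"] if re.fullmatch(escaped, s)])

# re.escape builds the escaped form from literal text.
auto_escaped = re.escape("1+2")
print(auto_escaped)
```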