Quickly master several USES of regex in MongoDB in 3 minutes

  • 2020-10-31 22:02:29
  • OfStack

background

Part1: Write it first

Those of you who use MySQL or other relational databases know that the use of fuzzy queries is similar to:


SELECT * FROM products WHERE sku like "%789"; 

regex, described in this article, does just that, enabling you to use regular expressions in your queries. This article takes you through the use of regex in MongoDB with a simple example ~

The usage Part2:

When using $regex, there are several USES:


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 

Meaning of the option parameter:

选项 含义 使用要求
i 大小写不敏感
m

查询匹配中使用了锚,例如^(代表开头)和$(代表结尾),以及匹配\n后的字符串


x

忽视所有空白字符

要求$regex与$option合用
s 允许点字符(.)匹配所有的字符,包括换行符。 要求$regex与$option合用

In actual combat

Part1: The use of $in

To include regular expressions in the $in query, you can only use the JavaScript regular expression object (that is, / pattern /). Such as:


{ name: { $in: [ /^acme/i, /^ack/ ] } } 

Warning: Warns against using the $regex operator expression in $in.

Part2: Implicit and usage

To include a regular expression in a comma-separated query condition, use the $regex operator. Such as:


{ name: { $regex: /acme.*corp/i, $nin: [ 'acmeblahcorp' ] } } 
{ name: { $regex: /acme.*corp/, $options: 'i', $nin: [ 'acmeblahcorp' ] } } 
{ name: { $regex: 'acme.*corp', $options: 'i', $nin: [ 'acmeblahcorp' ] } } 

Part3:x and s options

To use the x option or the s option, you require $regex to be used with $option. For example, to specify the i and s options, you must use $options to do the following:


{ name: { $regex: /acme.*corp/, $options: "si" } } 
{ name: { $regex: 'acme.*corp', $options: "si" } } 

Part4: Use of indexes

For case-sensitive regular expression queries, MongoDB matches the regular expression with the value in the index if the field has an index, which is faster than a full table scan. If the regular expression is a "prefix expression," you can optimize the query speed, and the query results will all start with the same string.

Regular expressions also conform to the left-most prefix principle, for example, regular expressions /^abc.*/ will be optimized by matching only index values beginning with abc.

Warning: warning

1. Although /^a/, /^a.*/ and /^a.*$/ match equivalent strings, their performance is not the same. If there is a corresponding index, all of these expressions use an index; However, /^a.*/ and /^a.*$/ are slower. This is because /^a/ can stop scanning after matching prefixes.

2. Case-insensitive regular expression queries generally cannot use indexes, and $regex cannot use case-insensitive indexes.

Examples of Part5:

In the collection of 1 commodity, the following contents are stored


{ "_id" : 100, "sku" : "abc123", "description" : "Single line description." } 
{ "_id" : 101, "sku" : "abc789", "description" : "First line\nSecond line" } 
{ "_id" : 102, "sku" : "xyz456", "description" : "Many spaces before line" } 
{ "_id" : 103, "sku" : "xyz789", "description" : "Multiple\nline description" } 

If you want to perform a query on the products collection for this commodity, the scope is that the contents of the sku column end at 789:


db.products.find( { sku: { $regex: /789$/ } } ) 

Combined with the understanding of MySQL, the above query in MySQL looks like this:


SELECT * FROM products WHERE sku like "%789"; 

If you want to query that sku starts with abc and ABC, ignoring case when matching, you can use the i option:


db.products.find( { sku: { $regex: /^ABC/i } } ) ,  

The query results are:


{ "_id" : 100, "sku" : "abc123", "description" : "Single line description." } 
{ "_id" : 101, "sku" : "abc789", "description" : "First line\nSecond line" } 

The use of Part6: m

To query descriptions that include S and match S after /n, add the m option


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 
0

The results returned are:


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 
1

Without the m option, the result would look like this:


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 
2

If you do not use an anchor like ^, the full result will be returned:


db.products.find( { description: { $regex: /S/ } } ) 
{ "_id" : 100, "sku" : "abc123", "description" : "Single line description." } 
{ "_id" : 101, "sku" : "abc789", "description" : "First line\nSecond line" } 

The use of Part7: s

Using the s option to execute the query will result in a comma. Match all characters, including line breaks. The m column in description begins and is followed by the line string:


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 
4

If s is not included, it returns:


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 
5

The use of Part8: x

The following example USES the x option to ignore whitespace and comments, representing comments with #, and ending with \ n in a matching pattern:


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 
6

The results of the query are:


{ <field>: { $regex: /pattern/, $options: '<options>' } } 
{ <field>: { $regex: 'pattern', $options: '<options>' } } 
{ <field>: { $regex: /pattern/<options> } } 
7

As you can see, it ignores the whitespaces of abc and #category, as well as the whitespaces of #category and code, and actually executes a query where sku is the result of abc123.

conclusion

Through these cases, we learned about the regex usage in MongoDB, as well as the meaning and usage of each option of its optional parameter $option. Due to the limited level of the author, the writing time is also very hasty, the article will inevitably appear 1 mistakes or inaccurate, inappropriate places ask readers to criticize and correct.


Related articles: