Grouping in JavaScript regular expressions, in detail

  • 2020-12-20 03:27:21
  • OfStack

Grouping

The following regular expression matches kidkidkid:


/kidkidkid/

A more elegant way to write it is:


/(kid){3}/

The unit enclosed in parentheses here is called a group.
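To see why the parentheses matter, here is a minimal sketch comparing the grouped pattern with an ungrouped one. The ^ and $ anchors and the variable names are additions for illustration; without the group, the quantifier applies only to the preceding character d.

```javascript
// With a group, {3} repeats the whole unit "kid".
const grouped = /^(kid){3}$/;
// Without a group, {3} repeats only the letter "d".
const ungrouped = /^kid{3}$/;

console.log(grouped.test('kidkidkid'));   // true
console.log(ungrouped.test('kidkidkid')); // false
console.log(ungrouped.test('kiddd'));     // true
```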

Alternatives

Within a group, multiple alternative expressions can be separated by |:


var reg = /I love (him|her|it)/;

reg.test('I love him')  // true 
reg.test('I love her')  // true
reg.test('I love it')  // true
reg.test('I love them') // false

Here, | means "or".
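The group also limits how far | reaches. As a rough sketch (the anchors and variable names are assumptions added for illustration), dropping the parentheses makes | split the entire pattern into three independent alternatives:

```javascript
// Grouped: "I love " must be followed by one of the three words.
const withGroup = /^I love (him|her|it)$/;
// Ungrouped: the alternatives are "^I love him", "her", and "it$".
const withoutGroup = /^I love him|her|it$/;

console.log(withGroup.test('her'));         // false
console.log(withoutGroup.test('her'));      // true
console.log(withGroup.test('I love her'));  // true
```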

Capture and reference

Strings matched by the regular expression are stored temporarily. The strings captured by groups are numbered starting from 1, so we can refer to them:


var reg = /(\d{4})-(\d{2})-(\d{2})/
var date = '2010-04-12'
reg.test(date)

RegExp.$1 // 2010
RegExp.$2 // 04
RegExp.$3 // 12

$1 refers to the first captured string, $2 to the second, and so on.
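The static RegExp.$1 properties are a legacy feature, so it may be safer to read the captures from the array returned by exec. A minimal sketch, where parseDate is a hypothetical helper name:

```javascript
// Hypothetical helper: split a yyyy-mm-dd string into its parts.
function parseDate(text) {
  const match = /(\d{4})-(\d{2})-(\d{2})/.exec(text);
  if (match === null) return null;
  // match[0] is the whole match; match[1] to match[3] are the groups.
  const [, year, month, day] = match;
  return { year, month, day };
}

console.log(parseDate('2010-04-12')); // { year: '2010', month: '04', day: '12' }
console.log(parseDate('no date'));    // null
```

Unlike RegExp.$1, the returned array belongs to this one call, so a later match elsewhere in the program cannot overwrite it.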

Using captures with replace

Captured strings can be referenced directly in the replacement argument of String.prototype.replace. For example, to convert the date 12.21/2012 to 2012-12-21:


var reg = /(\d{2})\.(\d{2})\/(\d{4})/
var date = '12.21/2012'

date = date.replace(reg, '$3-$1-$2') // date = 2012-12-21

By the way, passing a callback function to replace can sometimes solve a problem elegantly.

Replacing prohibited words with the same number of asterisks is a common feature. For example, if the text is kid is a doubi, where kid and doubi are prohibited words, the result after conversion should be *** is a *****. We could write it this way:


var reg = /(kid|doubi)/g
var str = 'kid is a doubi'

str = str.replace(reg, function(word){
  return word.replace(/./g, '*')
})
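The callback receives the full match as its first argument, followed by one argument per capturing group. The sketch below shows both uses; reformatDate and maskWords are hypothetical helper names, and maskWords assumes the listed words contain no regex metacharacters.

```javascript
// Hypothetical helper: the callback gets the full match first,
// then one argument per capturing group.
function reformatDate(text) {
  return text.replace(/(\d{2})\.(\d{2})\/(\d{4})/g,
    (whole, month, day, year) => `${year}-${month}-${day}`);
}

// Hypothetical helper: mask every listed word with asterisks.
// Assumes the words contain no regex metacharacters.
function maskWords(text, words) {
  const pattern = new RegExp('(' + words.join('|') + ')', 'g');
  return text.replace(pattern, (word) => '*'.repeat(word.length));
}

console.log(reformatDate('12.21/2012'));                     // 2012-12-21
console.log(maskWords('kid is a doubi', ['kid', 'doubi']));  // *** is a *****
```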

Capture of nested groups

If we encounter nested groups such as /((kid) is (a (doubi)))/, in what order are they captured? Let's try:


var reg = /((kid) is (a (doubi)))/
var str = "kid is a doubi"

reg.test( str ) // true

RegExp.$1 // kid is a doubi
RegExp.$2 // kid
RegExp.$3 // a doubi
RegExp.$4 // doubi

The rule: groups are numbered in the order in which their left parentheses appear.
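The same numbering can be checked without the legacy RegExp.$n properties. In this small sketch (variable names are illustrative), exec returns the captures in left-parenthesis order:

```javascript
const nestedMatch = /((kid) is (a (doubi)))/.exec('kid is a doubi');
// slice(1) drops the full match and keeps groups 1 to 4.
const captured = nestedMatch.slice(1);

console.log(captured); // [ 'kid is a doubi', 'kid', 'a doubi', 'doubi' ]
```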

Backreferences

Captured strings can also be referenced inside the regular expression itself; these are called backreferences:


var reg = /(\w{3}) is \1/

reg.test('kid is kid') // true
reg.test('dik is dik') // true
reg.test('kid is dik') // false
reg.test('dik is kid') // false

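\1 refers to the string captured by the first group; in other words, that part of the expression is determined dynamically.

Note that if the number is out of range (there is no group with that number), it is treated as an ordinary escape sequence rather than a backreference, at least in a pattern without the u flag: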
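A common use of backreferences is spotting accidentally repeated words. The following is a minimal sketch; findDoubledWords is a hypothetical helper, and it only handles repeats separated by a single space.

```javascript
// Hypothetical helper: list words that appear twice in a row.
function findDoubledWords(text) {
  // \1 must match exactly the text that (\w+) just captured.
  const pattern = /\b(\w+) \1\b/g;
  return Array.from(text.matchAll(pattern), (match) => match[1]);
}

console.log(findDoubledWords('this is is a test test case')); // [ 'is', 'test' ]
```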


var reg = /(\w{3}) is \6/;

reg.test( 'kid is kid' ); // false
reg.test( 'kid is \6' );  // true

Types of groups

There are four types of groups:

Capturing - ()
Non-capturing - (?:)
Positive lookahead - (?=)
Negative lookahead - (?!)

Everything discussed so far uses capturing groups; only this type stores the string it matches.

Non-capturing groups

Sometimes we only need grouping, not capturing. In that case we can use a non-capturing group, written by placing ?: immediately after the opening parenthesis:


var reg = /(?:\d{4})-(\d{2})-(\d{2})/
var date = '2012-12-21'
reg.test(date)

RegExp.$1 // 12
RegExp.$2 // 21

In this example, the (?:\d{4}) group does not capture anything, so $1 is the string captured by the first (\d{2}).
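As a quick comparison (a sketch with illustrative variable names), the non-capturing group still takes part in the match; it just does not occupy a group number:

```javascript
const capturing = /(\d{4})-(\d{2})-(\d{2})/.exec('2012-12-21');
const nonCapturing = /(?:\d{4})-(\d{2})-(\d{2})/.exec('2012-12-21');

console.log(capturing.slice(1));    // [ '2012', '12', '21' ]
console.log(nonCapturing.slice(1)); // [ '12', '21' ]
// The year is still part of the overall match.
console.log(nonCapturing[0]);       // 2012-12-21
```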

Positive and negative lookahead groups

It's like standing in place and looking ahead:

Positive lookahead - is a certain thing ahead of you?
Negative lookahead - is a certain thing not ahead of you?

The literal names are a mouthful, so I like to call them positive and negative expressions. Here is an example of a positive lookahead:


var reg = /kid is a (?=doubi)/

reg.test('kid is a doubi') // true
reg.test('kid is a genius') // false

What follows "kid is a "? The match succeeds only if it is doubi.

A negative lookahead is just the opposite:


var reg = /kid is a (?!doubi)/

reg.test('kid is a doubi') // false
reg.test('kid is a genius') // true

Lookahead groups do not capture a value either. So how do they differ from non-capturing groups? See the example:


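var reg, str = 'kid is a doubi'

reg = /(kid is a (?:doubi))/
reg.test(str)
RegExp.$1 // kid is a doubi

reg = /(kid is a (?=doubi))/
reg.test(str)
RegExp.$1 // kid is a (followed by a trailing space)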

As you can see, the string matched by a non-capturing group is still captured by the outer capturing group, but the string matched by a lookahead group is not. Lookahead groups come in handy when you need to check what follows but don't want it captured along with the rest.
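A well-known practical use of lookahead is inserting thousands separators, because the pattern has to inspect the digits ahead without consuming them. This is a minimal sketch; addThousandsSeparators is a hypothetical helper that assumes its input is a plain string of digits.

```javascript
// Hypothetical helper: assumes the input is a string of digits only.
function addThousandsSeparators(digits) {
  // \B              : a position between two digits, never the start
  // (?=(\d{3})+ ...) : followed by one or more complete groups of three digits
  // (?!\d)          : and no further digit after those groups
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

console.log(addThousandsSeparators('1234567')); // 1,234,567
console.log(addThousandsSeparators('123'));     // 123
```

The pattern matches only positions, not characters, so replace inserts commas without removing any digits.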

Finally, older JavaScript engines do not support lookbehind groups; ES2018 added them as (?<=) and (?<!).
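In engines that implement ES2018, a lookbehind checks what comes before the current position instead of what comes after it. A rough sketch, assuming such an engine; extractPrices is a hypothetical helper name:

```javascript
// Hypothetical helper: requires an engine with ES2018 lookbehind support.
function extractPrices(text) {
  // (?<=\$) requires a dollar sign before the digits without including it.
  return text.match(/(?<=\$)\d+(?:\.\d+)?/g) || [];
}

console.log(extractPrices('apples $3, pears $4.50, 12 plums')); // [ '3', '4.50' ]
```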

