jQuery selector source code interpretation (V): the tokenize parsing process

  • 2020-05-19 04:17:35
  • OfStack

The following analysis is based on the jQuery-1.10.2.js version.

Below is an example of how the tokenize and preFilter code work together during parsing, using $("div:not(.class:contain('span')):eq(3)"). For a detailed line-by-line explanation of the tokenize method and the preFilter class, see the following two articles:

//www.ofstack.com/article/63155.htm
//www.ofstack.com/article/63163.htm

Here is the source code for the tokenize method. For simplicity, I removed all the code for caching, comma matching, and relational matching, leaving only the core code relevant to the current example. The removed code is very simple; if you need it, refer to the articles above.

Also note that each code snippet below is placed above the text that explains it.


function tokenize(selector, parseOnly) {
    var matched, match, tokens, type, soFar, groups, preFilters;

    soFar = selector;
    groups = [];
    preFilters = Expr.preFilter;

    while (soFar) {
        // Start a new token list on the first pass
        if (!matched) {
            groups.push(tokens = []);
        }

        matched = false;

        // Try each filter type against the start of the remaining selector
        for (type in Expr.filter) {
            if ((match = matchExpr[type].exec(soFar))
                    && (!preFilters[type] || (match = preFilters[type](match)))) {
                matched = match.shift();
                tokens.push({
                    value : matched,
                    type : type,
                    matches : match
                });
                soFar = soFar.slice(matched.length);
            }
        }

        // Nothing matched: stop and let the return statement decide
        if (!matched) {
            break;
        }
    }

    // parseOnly: return the length of the unparsed remainder;
    // otherwise raise an error on leftovers, or cache groups and return a copy
    return parseOnly ? soFar.length :
        soFar ? Sizzle.error(selector) :
        tokenCache(selector, groups).slice(0);
}

First, during jQuery's execution, tokenize is called for the first time by the select method, and "div:not(.class:contain('span')):eq(3)" is passed in as the selector parameter.

 soFar = selector;

soFar = "div:not(.class:contain('span')):eq(3)"

On first entering the while loop, matched has not yet been assigned a value, so the body of the following if statement runs. It initializes the tokens variable and, at the same time, pushes tokens into the groups array.


groups.push(tokens = []); 

Next, execution enters the for statement.

First for iteration: the first key of Expr.filter, "TAG", is assigned to the type variable, and the loop body is executed.


   if ((match = matchExpr[type].exec(soFar))
     && (!preFilters[type] || (match = preFilters[type]
       (match)))) {

The result of match = matchExpr[type].exec(soFar) is as follows:

match = ["div", "div"]

The first selector in the example is div, which matches the regular expression matchExpr["TAG"], and preFilters["TAG"] does not exist, so the body of the if statement is executed.


matched = match.shift(); 

Remove the first element, "div", from match and assign it to the matched variable: matched = "div", match = ["div"].


    tokens.push({
     value : matched,
     type : type,
     matches : match
    });

Create a new object {value: "div", type: "TAG", matches: ["div"]} and push it into the tokens array.


    soFar = soFar.slice(matched.length);

Remove "div" from soFar; at this point, soFar = ":not(.class:contain('span')):eq(3)".
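
The single step just described (exec, shift, push, slice) can be reproduced in isolation. This is only a minimal sketch: rtag is a simplified stand-in for matchExpr["TAG"] (an assumption, not Sizzle's actual pattern), used to show how match, the token, and soFar change.

```javascript
// Simplified stand-in for matchExpr["TAG"] (assumption: the real pattern
// also accepts escape sequences and non-ASCII characters).
var rtag = /^([\w*-]+)/;

var soFar = "div:not(.class:contain('span')):eq(3)";
var tokens = [];

var match = rtag.exec(soFar);   // ["div", "div"]
var matched = match.shift();    // "div"; match is now ["div"]

tokens.push({ value: matched, type: "TAG", matches: match });
soFar = soFar.slice(matched.length);

console.log(JSON.stringify(tokens[0]));
// {"value":"div","type":"TAG","matches":["div"]}
console.log(soFar);
// :not(.class:contain('span')):eq(3)
```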
Second for iteration: the second key of Expr.filter, "CLASS", is assigned to the type variable, and the loop body is executed.


   if ((match = matchExpr[type].exec(soFar))
     && (!preFilters[type] || (match = preFilters[type]
       (match)))) {

Because the current soFar = ":not(.class:contain('span')):eq(3)" does not match the CLASS regular expression, this iteration ends.

Third for iteration: the third key of Expr.filter, "ATTR", is assigned to the type variable, and the loop body is executed.
Likewise, this iteration ends because the remaining selector is not an attribute selector.

Fourth for iteration: the fourth key of Expr.filter, "CHILD", is assigned to the type variable, and the loop body is executed.
Likewise, this iteration ends because the remaining selector is not a CHILD selector.

Fifth for iteration: the fifth key of Expr.filter, "PSEUDO", is assigned to the type variable, and the loop body is executed.


   if ((match = matchExpr[type].exec(soFar))
     && (!preFilters[type] || (match = preFilters[type]
       (match)))) {

The result of match = matchExpr[type].exec(soFar) is as follows:

[":not(.class:contain('span')):eq(3)", "not", ".class:contain('span')):eq(3", undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined]

Since preFilters["PSEUDO"] exists, the following code is executed:


match = preFilters[type](match) 

The code for preFilters["PSEUDO"] is as follows:


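"PSEUDO" : function(match) { 
    var excess, unquoted = !match[5] && match[2]; 
 
    if (matchExpr["CHILD"].test(match[0])) { 
        return null; 
    } 
 
    if (match[3] && match[4] !== undefined) { 
        match[2] = match[4]; 
    } else if (unquoted 
            && rpseudo.test(unquoted) 
            && (excess = tokenize(unquoted, true)) 
            && (excess = unquoted.indexOf(")", unquoted.length 
                    - excess) 
                    - unquoted.length)) { 
 
        match[0] = match[0].slice(0, excess); 
        match[2] = unquoted.slice(0, excess); 
    } 
 
    return match.slice(0, 3); 
}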

The incoming match parameter is equal to:


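[":not(.class:contain('span')):eq(3)", "not", ".class:contain('span')):eq(3", undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined]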


unquoted = !match[5] && match[2] 

unquoted = ".class:contain('span')):eq(3"
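
The value of unquoted runs all the way to ":eq(3" because the last alternative of the PSEUDO pattern is a greedy .*, which extends to the final closing parenthesis in the string. The sketch below shows this with rpseudoSimple, a simplified stand-in for matchExpr["PSEUDO"] (an assumption: it omits the attribute-selector alternative, so it produces 6 array elements instead of 11).

```javascript
// Simplified stand-in for matchExpr["PSEUDO"] (assumption, not Sizzle's pattern).
// Groups: 1 name, 2 whole argument, 3 quote, 4 quoted text, 5 simple unquoted text.
var rpseudoSimple = /^:([\w-]+)(?:\(((['"])((?:\\.|[^\\])*?)\3|((?:\\.|[^\\()[\]])*)|.*)\)|)/;

// Unquoted argument containing parentheses: only the greedy ".*" can match,
// so the argument swallows everything up to the last ")".
var m1 = rpseudoSimple.exec(":not(.class:contain('span')):eq(3)");
console.log(m1[2]); // .class:contain('span')):eq(3

// Quoted argument: the first alternative matches and stops at the closing quote.
var m2 = rpseudoSimple.exec(":contain('span')):eq(3");
console.log(m2[0]); // :contain('span')
console.log(m2[4]); // span
```

This over-long argument is what the rest of preFilters["PSEUDO"] has to trim.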


if (matchExpr["CHILD"].test(match[0])) { 
    return null; 
}

match[0] = ":not(.class:contain('span')):eq(3)" does not match the matchExpr["CHILD"] regular expression, so the return null statement is not executed.


if (match[3] && match[4] !== undefined) { 
    match[2] = match[4]; 
}

Since match[3] and match[4] are both undefined, the else if branch is evaluated next.


    } else if (unquoted 
            && rpseudo.test(unquoted) 

At this point, unquoted = ".class:contain('span')):eq(3" is truthy, and since unquoted contains :contain('span'), which matches the regular expression rpseudo, rpseudo.test(unquoted) is true. tokenize is therefore called again to parse unquoted, as follows:


            && (excess = tokenize(unquoted, true)) 

When tokenize is called this time, the selector parameter passed in is ".class:contain('span')):eq(3" and parseOnly = true. The function executes internally as follows:


 soFar = selector;

soFar = ".class:contain('span')):eq(3"

On first entering the while loop, matched has not yet been assigned, so the body of the following if statement runs. It initializes the tokens variable and, at the same time, pushes tokens into the groups array.


groups.push(tokens = []); 

Next, execution enters the for statement.

First for iteration: the first key of Expr.filter, "TAG", is assigned to the type variable, and the loop body is executed.


   if ((match = matchExpr[type].exec(soFar))
     && (!preFilters[type] || (match = preFilters[type]
       (match)))) {

This iteration ends because the remaining selector is not a TAG selector.

Second for iteration: the second key of Expr.filter, "CLASS", is assigned to the type variable, and the loop body is executed.

The result of match = matchExpr[type].exec(soFar) is as follows:

match = [".class", "class"]

Since preFilters["CLASS"] does not exist, the body of the if statement is executed.


matched = match.shift(); 

Remove the first element of match, ".class", and assign it to the matched variable: matched = ".class", match = ["class"].


tokens.push({ 
    value : matched, 
    type : type, 
    matches : match 
});

Create a new object {value: ".class", type: "CLASS", matches: ["class"]} and push it into the tokens array.


soFar = soFar.slice(matched.length); 

Remove ".class" from soFar; at this point, soFar = ":contain('span')):eq(3".

Third for iteration: the third key of Expr.filter, "ATTR", is assigned to the type variable, and the loop body is executed.
Likewise, this iteration ends because the remaining selector is not an attribute selector.

Fourth for iteration: the fourth key of Expr.filter, "CHILD", is assigned to the type variable, and the loop body is executed.
Likewise, this iteration ends because the remaining selector is not a CHILD selector.

Fifth for iteration: the fifth key of Expr.filter, "PSEUDO", is assigned to the type variable, and the loop body is executed.


   if ((match = matchExpr[type].exec(soFar))
     && (!preFilters[type] || (match = preFilters[type]
       (match)))) {

The result of match = matchExpr[type].exec(soFar) is as follows:

[":contain('span')", "contain", "'span'", "'", "span", undefined, undefined, undefined, undefined, undefined, undefined]

Because preFilters["PSEUDO"] exists, the following code is executed:


match = preFilters[type](match)

The preFilters["PSEUDO"] code is the same as before; it is repeated here for reference:


"PSEUDO" : function(match) { 
    var excess, unquoted = !match[5] && match[2]; 
 
    if (matchExpr["CHILD"].test(match[0])) { 
        return null; 
    } 
 
    if (match[3] && match[4] !== undefined) { 
        match[2] = match[4]; 
    } else if (unquoted 
            && rpseudo.test(unquoted) 
            && (excess = tokenize(unquoted, true)) 
            && (excess = unquoted.indexOf(")", unquoted.length 
                    - excess) 
                    - unquoted.length)) { 
 
        match[0] = match[0].slice(0, excess); 
        match[2] = unquoted.slice(0, excess); 
    } 
 
    return match.slice(0, 3); 
}

The incoming match parameter is equal to:
[":contain('span')", "contain", "'span'", "'", "span", undefined, undefined, undefined, undefined, undefined, undefined]


unquoted = !match[5] && match[2]; 

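unquoted = "'span'" (match[5] is undefined, so unquoted takes the value of match[2], which still includes the quotation marks)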


if (matchExpr["CHILD"].test(match[0])) { 
    return null; 
}

Since ":contain('span')" does not match the regular expression matchExpr["CHILD"], the body of the if statement is not executed.


if (match[3] && match[4] !== undefined) { 
    match[2] = match[4]; 
}

Since match[3] = "'" and match[4] = "span", the body of the if statement is executed, assigning "span" to match[2].


return match.slice(0, 3); 

Returns a copy of the first three elements of match.

Execution now returns to the for loop of the tokenize method. At this point, the variables have the following values:

match = [":contain('span')", "contain", "span"]

soFar = ":contain('span')):eq(3"
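
The quoted-argument path through preFilters["PSEUDO"] can be traced with plain array operations. The sketch below starts from the match array given above and applies the same steps by hand; it does not call Sizzle, and the variable u is only shorthand introduced here.

```javascript
var u; // shorthand for undefined
var match = [":contain('span')", "contain", "'span'", "'", "span", u, u, u, u, u, u];

// match[5] is undefined, so unquoted takes match[2], quotes included.
var unquoted = !match[5] && match[2];
console.log(unquoted); // 'span'

// Quoted argument: use the text between the quotes as the argument.
if (match[3] && match[4] !== undefined) {
    match[2] = match[4];
}

// Only the full match, the pseudo name, and the argument are kept.
var result = match.slice(0, 3);
console.log(result); // [ ":contain('span')", 'contain', 'span' ]
```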


matched = match.shift(); 

Remove ":contain('span')" from the match array and assign it to the matched variable.


tokens.push({ 
    value : matched, 
    type : type, 
    matches : match 
});

Create a new object {value: ":contain('span')", type: "PSEUDO", matches: ["contain", "span"]} and push it into the tokens array.


soFar = soFar.slice(matched.length); 

Remove ":contain('span')" from soFar; at this point, soFar = "):eq(3". The for loop then finishes. On the next pass of the while loop, nothing at the start of the remaining string is a valid selector, so the while loop exits.


 return parseOnly ? soFar.length : soFar ? Sizzle.error(selector) :
  tokenCache(selector, groups).slice(0);

Since parseOnly = true at this point, the length of the remaining soFar, 6, is returned, and execution continues in the preFilters["PSEUDO"] code.


            && (excess = tokenize(unquoted, true)) 

6 is assigned to the excess variable, and then the next condition is evaluated:


            && (excess = unquoted.indexOf(")", unquoted.length 
                    - excess) 
                    - unquoted.length)) { 

This locates the end of the :not selector, that is, the position of its closing parenthesis, which is 22. Subtracting unquoted.length (28) turns excess into a negative offset, -6.


        match[0] = match[0].slice(0, excess); 
        match[2] = unquoted.slice(0, excess); 

The complete :not selector string (match[0]) and the string inside its parentheses (match[2]) are computed as:

match[0] = ":not(.class:contain('span'))"

match[2] = ".class:contain('span')"
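
The arithmetic on excess uses only string operations, so it can be checked on its own. This sketch hard-codes the values from the walkthrough (the remainder length 6 returned by the parseOnly call); the names match0 and closeAt are introduced here for illustration.

```javascript
var unquoted = ".class:contain('span')):eq(3";
var match0 = ":not(" + unquoted + ")";

// Length of the unparsed remainder "):eq(3" reported by tokenize(unquoted, true).
var excess = 6;

// Find the first ")" at or after the point where parsing stopped.
var closeAt = unquoted.indexOf(")", unquoted.length - excess);

// Convert it to a negative offset from the end of the string.
excess = closeAt - unquoted.length;

console.log(closeAt, excess);           // 22 -6
console.log(match0.slice(0, excess));   // :not(.class:contain('span'))
console.log(unquoted.slice(0, excess)); // .class:contain('span')
```

Because excess is negative, the same value trims both strings from the right, even though match0 is longer than unquoted.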


    return match.slice(0, 3); 

Returns a copy of the first three elements of match.

Back in the tokenize function, match = [":not(.class:contain('span'))", "not", ".class:contain('span')"]


matched = match.shift(); 

Remove the first element of match, ":not(.class:contain('span'))", and assign it to the matched variable: matched = ":not(.class:contain('span'))",
match = ["not", ".class:contain('span')"]


tokens.push({ 
    value : matched, 
    type : type, 
    matches : match 
});

Create a new object {value: ":not(.class:contain('span'))", type: "PSEUDO", matches: ["not", ".class:contain('span')"]} and push it into the tokens array. At this point, tokens has two elements: the div selector and the :not selector.


soFar = soFar.slice(matched.length); 

Remove ":not(.class:contain('span'))" from soFar; at this point, soFar = ":eq(3)". When this for loop finishes, execution returns to the while loop, and the eq selector is obtained in the same way as the third element of tokens. The final groups result is as follows:

groups[0][0] = {value: "div", type: "TAG", matches: ["div"] }

groups[0][1] = {value: ":not(.class:contain('span'))", type: "PSEUDO", matches: ["not", ".class:contain('span')"] }

groups[0][2] = {value: ":eq(3)", type: "PSEUDO", matches: ["eq", "3"] }
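
The whole walkthrough can be reproduced with a small, self-contained sketch. It is not Sizzle's code: every name ending in Sketch is hypothetical, the regular expressions are simplified stand-ins for matchExpr and rpseudo, and caching, commas, combinators, and the ATTR and CHILD token types are left out because this example never needs them.

```javascript
// Simplified stand-ins for Sizzle's patterns (assumptions, not the originals).
var matchExprSketch = {
    TAG: /^([\w*-]+)/,
    CLASS: /^\.([\w-]+)/,
    PSEUDO: /^:([\w-]+)(?:\(((['"])((?:\\.|[^\\])*?)\3|((?:\\.|[^\\()[\]])*)|.*)\)|)/
};
var rchildSketch = /^:(only|first|last|nth|nth-last)-(child|of-type)/;
var rpseudoSketch = /:[\w-]+/; // "does the argument contain a pseudo?"

var preFiltersSketch = {
    PSEUDO: function (match) {
        var excess, unquoted = !match[5] && match[2];
        if (rchildSketch.test(match[0])) {
            return null;
        }
        if (match[3] && match[4] !== undefined) {
            // Quoted argument: accept the quoted text as-is
            match[2] = match[4];
        } else if (unquoted && rpseudoSketch.test(unquoted) &&
                (excess = tokenizeSketch(unquoted, true)) &&
                (excess = unquoted.indexOf(")", unquoted.length - excess) - unquoted.length)) {
            // excess is a negative offset: trim the over-long greedy match
            match[0] = match[0].slice(0, excess);
            match[2] = unquoted.slice(0, excess);
        }
        return match.slice(0, 3);
    }
};

function tokenizeSketch(selector, parseOnly) {
    var matched, match, type, tokens, i;
    var soFar = selector, groups = [];
    var types = ["TAG", "CLASS", "PSEUDO"];

    while (soFar) {
        if (!matched) {
            groups.push(tokens = []);
        }
        matched = false;
        for (i = 0; i < types.length; i++) {
            type = types[i];
            match = matchExprSketch[type].exec(soFar);
            if (match && preFiltersSketch[type]) {
                match = preFiltersSketch[type](match);
            }
            if (match) {
                matched = match.shift();
                tokens.push({ value: matched, type: type, matches: match });
                soFar = soFar.slice(matched.length);
            }
        }
        if (!matched) {
            break;
        }
    }
    if (parseOnly) {
        return soFar.length;
    }
    if (soFar) {
        throw new Error("Unrecognized expression: " + selector);
    }
    return groups;
}

var groupsSketch = tokenizeSketch("div:not(.class:contain('span')):eq(3)");
groupsSketch[0].forEach(function (token) {
    console.log(token.type, token.value, JSON.stringify(token.matches));
});
```

Under these assumptions the sketch yields the same three tokens as listed above, and the recursive parseOnly call on ".class:contain('span')):eq(3" returns 6.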


 return parseOnly ? soFar.length : soFar ? Sizzle.error(selector) :
  tokenCache(selector, groups).slice(0);

Since parseOnly = undefined and soFar is now empty, tokenCache(selector, groups).slice(0) is executed, which stores groups in the cache and returns a copy of it.

At this point you might object that the argument of the second token, ".class:contain('span')", has itself not been broken into tokens. That is correct: in practice it has to be tokenized again when the selector is actually executed. Keeping the result from the earlier pass over ".class:contain('span')):eq(3" would avoid that, but it would only speed up the current run, because when ".class:contain('span')" is submitted for parsing again during execution, its result is stored in the cache.
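
To illustrate why the return statement calls .slice(0) on the cached value, here is a small sketch of a key/value cache. createCacheSketch is a hypothetical, simplified stand-in for the helper behind tokenCache (assumption: no size limit or eviction is modeled). It shows that slice(0) gives the caller a shallow copy, so changes to the returned array do not alter the cached groups.

```javascript
function createCacheSketch() {
    function cache(key, value) {
        // Store under key + " " so the key cannot collide with
        // built-in property names such as "toString".
        return (cache[key + " "] = value);
    }
    return cache;
}

var tokenCacheSketch = createCacheSketch();
var groups = [[{ value: "div", type: "TAG", matches: ["div"] }]];

// Store groups and hand back a shallow copy, as the return statement does.
var returned = tokenCacheSketch("div", groups).slice(0);
returned.push(["extra group"]); // changes the caller's copy only

var cached = tokenCacheSketch["div "];
console.log(cached.length);             // 1
console.log(returned.length);           // 2
console.log(returned[0] === cached[0]); // true: the token lists are shared
```

The copy is shallow, so the token lists inside it are still the cached ones.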

At this point, the entire execution process is complete.

