The optimization techniques for loop statements in JavaScript are discussed in depth

  • 2020-03-30 03:13:09
  • OfStack

Loops are one of the most important mechanisms in all programming languages, and almost any meaningful computer program (sort, query, etc.) does not have loops. Loop is also a very troublesome part in program optimization. We often need to constantly optimize the complexity of the program, but we are entangled in the choice between time complexity and space complexity due to the loop.

In javascript, there are three native loops, for () {}, while () {}, and do {} while (), the most common of which is for () {}.

However, for is one of the loops that javascript engineers are most likely to overlook when optimizing their programs.

Let's review the basics of for.
Javascript's for syntax is inherited from the c language, and the basic syntax of the for loop is used in two ways.

1. Loop arrays

Basic syntax for a for loop


for ( 2 2  ) {
  //. Logic code
}

We illustrate this in detail with an example piece of code.


var array = [1, 2, 3, 4, 5];
var sum   = 0;
for (var i = 0, len = array.length; i < len; ++i) {
  sum += array[i];
}
console.log('The sum of the array's items is %d.', sum);
//=> The sum of the array's items is 15.

In this code, we first define and initialize an array to store the items to be added and a sum integer variable. Next, we start the loop. In the initialization code of the for loop, we also define and initialize two variables: I (counter) and len(alias of the loop array length). After each logical execution, I increments by 1.

In the logical code of the loop, we add the array items of the current loop to the summation variable.
This cycle is represented by a flow chart as follows:

< img SRC = "border = 0 / / files.jb51.net/file_images/article/201406/201466103914652.jpg? 201456103932 ">

From this flowchart, we can easily find that the real loop body in the program not only has our logic code, but also contains the execution judgment and loop processing to realize the loop itself.
In this way, our optimization ideas are clear, we can optimize from four aspects.

1. The initialization code before the loop body
2. Execution judgment conditions in the loop body
3. Logical code
4. Processing code after logical code

Ps: there is an important relationship between the first and second points.


1.1 optimize the initialization code and execution judgment conditions

Let's start with a piece of code that you're all familiar with.


// wrong!
for (var i = 02 i < list.length2 ++i) {
  //. Logic code
}

I'm sure most engineers who write javascript still use this seemingly perfectly normal loop method, but why am I saying it's wrong here?
Let's take everything apart from the loop and see:

1. Initialization code - this loop defines and initializes only one counter variable.
2. Execute the judgment condition - this is true if the counter is less than the length of the list.
3. Handling code - counter increment 1.

Let's review the flowchart above again. What do we see?
The real body of the loop contains not only our logic code, but also the execution judgment and processing code that implements the loop itself. In other words, I < The list. Length criterion is executed before each loop. In javascript, a query is required to read an object's properties or methods.
You seem to understand something? There are two operations for this judgment: 1. Query the length attribute from the list array; 2. 2. Compare the sizes of I and list.length.
Assuming that the list array contains n elements, the program needs to perform 2n operations in the execution judgment of this loop.

If we change the code to look like this:


// Well
for (var i = 0, len = list.length; i < len; ++i) {
  //...
}

In this improved code, we've added a len variable to the initialization code before the loop body executes to store the values of list.length (more on variables, expressions, Pointers, and values in the second article). In this way, we do not need to query the properties of the list array again, with half the operands, in the execution judgment in the loop body.

Above steps we have improved the time complexity of the algorithm, but if we want to continue to optimize the space complexity, how to do? If your logic is not constrained by the order of the loop, you can try the following optimizations.


for (var i = list.length - 1; i >= 0; --i) {
  //...
}

This code loops forward by reversing the loop order and starting the I counter at the last element index (list.leng-1). In order to achieve the loop to reduce the number of variables to 1, and in the execution of the judgment, reduce the number of variable queries, reduce the time before the execution of CPU instructions.

1.2 optimize the logic code

In a loop, we get the current array element of the loop in order to do something with it or with it, which inevitably leads to several calls to the element.


var array = [
  { name: 'Will Wen Gunn', type: 'hentai' },
  { name: 'Vill Lin', type: 'moegril' }
];
for (var i = array.length - 1; i >= 0; --i) {
  console.log('Name: %s', array[i].name);
  console.log('He/She is a(n) %s', array[i].type);
  console.log('rn');
}
/*=>
  Name: Vill Lin
  He/She is a(n) moegril
  Name: Will Wen Gunn
  He/She is a(n) hentai
 */

In this code, the program needs to query the name and type attributes of each array element. If the array has n elements, the program makes 4n object queries.


1. array[i]
2. array[i].name
3. array[i]
4. array[i].type

I'm sure you've come up with a solution: assign the value of the current array element to a variable and use it in your logical code.


var array = [
  { name: 'Will Wen Gunn', type: 'hentai' },
  { name: 'Vill Lin', type: 'moegril' }
];
var person = null;
for (var i = array.length - 1; i >= 0 && (person = array[i]); --i) {
  console.log('Name: %s', person.name);
  console.log('He/She is a(n) %s', person.type);
  console.log('rn');
}
person = null;

It certainly looks better this way.


 1. array[i] => var person
 2. person.name
 3. person.type

It's a bit like foreach in emcascript5, but there's a big difference between the two, so I won't explain it here.

Ps: thank you for your correction. According to the experiment, if the elements in the array are defined by passing values directly, the values obtained in the loop must be values, not Pointers. So whether you define an expression or a variable, there are additional memory space requests.

1.3 optimize the processing code

In fact, there isn't much to optimize in the processing code in the loop body, so the I counter is just increment by 1.
Ps: if you have any good Suggestions or methods, you are welcome to provide them. :)

2. Loop object

In javascript, for can also iterate over the properties and methods of an object. Note that the for loop cannot traverse the wrapper type of the object or the prototype properties and methods in the constructor.

The syntax is simpler than looping arrays.


for ( var key in object) {
  //. Logic code
}

We often use this method to manipulate objects.


var person = {
  'name'  : 'Will Wen Gunn',
  'type'  : 'hentai',
  'skill' : ['Programming', 'Photography', 'Speaking', 'etc']
};
for (var key in person) {
  value = person[key];
  // if the value is array, convert it to a string
  if (value instanceof Array) {
    value = value.join(', ');
  }
  console.log('%s: %s', key, value);
}

  If you've ever used mongodb, you're no stranger to its query mechanism. Since mongodb's query mechanism is like the soul of its API, the flexible curd operation mode has won a lot of popularity and development momentum for mongodb.

In the implementation of nanodb's mongo API, query is implemented in a way that makes extensive use of circular objects.


var myDB   = nano.db('myDB');
var myColl = myDB.collection('myColl');
var _cursor = myColl.find({
  type     : 'repo',
  language : 'JavaScript'
});
_cursor
  .sort({
    star: 1
  })
  .toArray(function(err, rows) {
    if (err)
      return console.error(err);
    console.log(rows);
  });

What we need to optimize is not the loop itself, but the objects that you need to iterate over.
For example, the nanocollection class in nanodb looks like an array, holding all elements, or an object, with the id of the element as the key, and then storing the elements.

But that is not the case, and students who have used underscores should know the _. Invert method. This is a rather interesting way of inverting the keys and values of the objects passed in.


var person = {
  'name' : 'Will Wen Gunn',
  'type' : 'hentai'
};
var _inverted = _.invert(person);
console.log(_inverted);

If you need to use a loop object to query the values of certain properties of an object, you can try the following method.


var person = {
  'name' : 'Will Wen Gunn',
  'type' : 'hentai'
};
var name = 'Will Wen Gunn';
var _inverted = _.invert(person);
if (_inverted[name] === 'name') {
  console.log('Catched!');
}
//=> Catched!

However, there is not much room for optimization in using for to query objects, and everything still needs to start from the actual requirements. : p

Let's look at the other two loops, while () {} and do {} while (). Anyone who has taken a computer science course will be familiar with both cycles. The only difference between them is the logical order in which the loop body is executed.

While () {} executes in a similar order to for () {}, executing the judgment before the logical code, but without initializing and processing the code.
When a condition is given, the logic code is executed until the condition is no longer valid.


var sum = 0;
while (sum < 10) {
  sum += sum + 1;
}
console.log(sum);
//=> 15

Do {} while () puts the execution behind the logical code.


var sum = 0;
do {
  sum += sum + 1;
} while (sum < 10);
console.log(sum);
//=> 15

While () {} and do {} while () also do not require a counter, but instead rely on certain conditions to determine whether to execute or to continue executing the logical code.

3. While () {} and do {} while ()

While () {} and do {} while () are primarily used in business logic to perform a continuous series of operations, such as a task queue, to achieve a goal.

But these two loops are dangerous because by default they are only controlled by the execution condition, and if there is no impact on the execution judgment in the logic code, there will be a dead loop.


var sum = 02
// warning!
while (sum < 10) {
  sum = 1 + 12
}

Such code is no different from while (true) {}, so you must specify the execution condition and the method that affects the execution condition before using it.

4. Use circular control statements

I believe all javascript engineers have used the break statement, but the continue statement is relatively rare. In fact, continue can be found in many excellent javascript open source projects.

To understand what the continue statement does, let's look at an example piece of code


// Node.js Broadcast Server
var net  = require('net');
var util = require('util');
var broadcastServer = net.createServer();
// Client Store
broadcastServer.clients = [];
// Clients Broadcast Method
net.Socket.prototype.broadcast = function(msg) {
  var clients = broadcastServer.clients;
  //Gets the subscript of the client that is broadcasting the pool
  var index   = clients.indexOf(this);
  for (var i = clients.length - 1; i >= 0; --i) {
    if (i === index) {
      //If it is the client that broadcasts the wok, the current body of the loop ends
      continue;
    }
    currClient = clients[i];
    if (!currClient.destroyed) {
      currClient.write(
        util.format(
          'r[Echo Client %s:%d] %snInput: ',
          currClient.remoteAddress, currClient.remotePort, msg)
      );
    }
  }
};
// A new client connected
broadcastServer.on('connection', function(client) {
  broadcastServer.clients.push(client);
  // Welcome
  client.write('[Broadcast Server] Welcome!nInput:');
  client.broadcast(client, 'Joined!');
  // Message handle
  client.on('data', function(msg) {
    client.broadcast(msg);
    client.write('rInput:');
  });
  // Disconnect handle
  client.on('end', function() {
    client.broadcast('Left!');
  })
});
// Bind
broadcastServer.listen(8080, function() {
  console.log('Broadcast Server bound.');
});

This code implements a broadcast server based on the node.js net module. In the broadcast method, we use the continue statement to send information to all connected clients except the one that broadcasts the contents.

Content of the code is fairly simple, when a client needs to � to other clients, the radio, then call the client for the client object method of broadcast, the broadcast method, the program will first gets the current client in the cache the client socket location of a set of subscript, and then to send � cycle of all client socket, when before the loop counter to obtain the location of the subscript, skip the current logic code in the loop body, continue to the next cycle.

Engineers who have studied the c/c++ language are likely to get this advice from a variety of sources: "don't use goto statements."

The infamous goto statement is actually a code flow controller, and the details of the goto statement will not be explained here. However, javascript has no obvious goto statement, but from the break statement and the continue statement, it is not difficult to find the shadow of goto in javascript.

This is because the break and continue statements allow you to accept a label name defined by a code jump.

Let's look at the instance code provided by the MDN.


var i, j;
loop1:
for (i = 0; i < 3; i++) {      //The first for statement is labeled "loop1"
   loop2:
   for (j = 0; j < 3; j++) {   //The second for statement is labeled "loop2"
      if (i == 1 && j == 1) {
         continue loop1;
      } else {
         console.log("i = " + i + ", j = " + j);
      }
   }
}
// Output is:
//   "i = 0, j = 0"
//   "i = 0, j = 1"
//   "i = 0, j = 2"
//   "i = 1, j = 0"
//   "i = 2, j = 0"
//   "i = 2, j = 1"
//   "i = 2, j = 2"
// Notice how it skips both "i = 1, j = 1" and "i = 1, j = 2"

In this instance code, two layers of loops are implemented, and a label is defined outside each loop for subsequent calls to the continue statement.

The first layer loops through the loop1 label, which means that later in the program, if the loop1 label is selected in the continue statement or break statement, the outermost loop is skipped.
The second loop is in the loop2 label in the top loop. If the loop2 label is selected in the continue statement or break statement, it returns to the loop body of the top loop.

By using loop control statements, we can interfere with the original loop execution decisions so that we can build very complex logic systems. As an aside, there are a lot of goto statements in the Linux kernel, and as to why you still hear a lot about not using goto, Google it.

5. Advanced loops

5.1 expand the loop

Let's take a look at two pieces of code first, and guess which one performs better.


// Setup
var array = [
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"],
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"],
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"],
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"],
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"],
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"],
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"],
  ["DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA", "DATA"]
];
function process(item) {
  // Do something with item
}
// Case 1
for (var i = array.length - 1; i >= 0; i--) {
  for (var j = array[i].length - 1; j >= 0; i--) {
    process(array[i][j]);
  }
}
// Case 2
for (var i = array.length - 1; i >= 0; i = i - 4) {
  for (var j = array[i].length - 1; j >= 0; j = j - 6) {
    process(array[i][j]);
    process(array[i][j - 1]);
    process(array[i][j - 2]);
    process(array[i][j - 3]);
    process(array[i][j - 4]);
    process(array[i][j - 5]);
  }
  for (var j = array[i - 1].length - 1; j >= 0; j = j - 6) {
    process(array[i][j]);
    process(array[i][j - 1]);
    process(array[i][j - 2]);
    process(array[i][j - 3]);
    process(array[i][j - 4]);
    process(array[i][j - 5]);
  }
  for (var j = array[i - 2].length - 1; j >= 0; j = j - 6) {
    process(array[i][j]);
    process(array[i][j - 1]);
    process(array[i][j - 2]);
    process(array[i][j - 3]);
    process(array[i][j - 4]);
    process(array[i][j - 5]);
  }
  for (var j = array[i - 3].length - 1; j >= 0; j = j - 6) {
    process(array[i][j]);
    process(array[i][j - 1]);
    process(array[i][j - 2]);
    process(array[i][j - 3]);
    process(array[i][j - 4]);
    process(array[i][j - 5]);
  }
}

I need to go through all the elements of the subarray in the array, either the way we normally do it, or by expanding the loop. The answer is that case 2 performs better because the execution judgment between every six elements is removed, which is naturally faster than usual.

Here's a more powerful solution. If a large data set needs to be iterated over in a business process, and the data set is iterated from the beginning and the data volume does not change, consider using a technique called a duff device. Named after its creator, Tom duff, the technology was first implemented in c. It was later ported to javascript by Jeff greenberg and modified by Andrew b. king to come up with a more efficient version.


//credit: Speed Up Up Your Site (New Riders, 2003)
var iterations = Math.floor(values.length / 8);
var leftover = values.length % 8;
var i = 0;
if (leftover > 0) {
  do {
    process(values[i++]);
  } while (--leftover > 0);
}
do {
  process(values[i++]);
  process(values[i++]);
  process(values[i++]);
  process(values[i++]);
  process(values[i++]);
  process(values[i++]);
  process(values[i++]);
  process(values[i++]);
} while (--iterations > 0);

This technique works by calculating the length of values divided by 8 to get the number of iterations needed, and using the math.floor() function to ensure that the result is an integer, and then calculating the number of rows that are not divisible by 8, and working with the elements separately, iterating 8 times with the number of single expansions.

I repackaged the device to get an API with an asynchronous flavor.


function duff(array, mapper) {
  var n = Math.floor(array.length / 8);
  var l = array.length % 8;
  var i = 0;
  if (l > 0) {
    do {
      mapper(array[i++]);
    } while (--i > 0);
  }
  do {
    mapper(array[i++]);
    mapper(array[i++]);
    mapper(array[i++]);
    mapper(array[i++]);
    mapper(array[i++]);
    mapper(array[i++]);
    mapper(array[i++]);
    mapper(array[i++]);
  } while (--n > 0);
}
duff([...], function(item) {
  //...
});

Here is a set of performance tests and their results for each of the three iterative solutions. http://jsperf.com/spreaded-loop

5.2 non-native cycles

In any programming language, loops can be implemented indirectly, not just in native loop statements provided by the language.

Let's start with a little bit of high school math -- the general term formula for a sequence.


bacause
  a[1] = 1
  a[n] = 2 * a[n - 1] + 1
so
                   a[n] + 1 = 2 * a[n - 1] + 2
                            = 2 * (a[n - 1] + 1)
(a[n] + 1) / (a[n - 1] + 1) = 2
then
  a[n] + 1 = (a[n] + 1) / (a[n - 1] + 1) * (a[n - 1] + 1) / (a[n - 2] + 1) * ... * (a[2] + 1) / (a[1] + 1) * (a[i] + 1)
  a[n] + 1 = 2 * 2 * ... * 2 * 2
  a[n] + 1 = 2^n
      a[n] = 2^n - 1
final
  a[n] = 2^n - 1

You can probably guess what we're going to be talking about from the simple calculus above. Yes, we can also use recursion to implement loops.

Recursion is a very important application method in mathematics and computer science. It refers to a function calling itself when it is used.

In the node.js community, recursion is used to implement a very important technology: middleware technology. This is a new version of the middleware implementation code in webjs that is not yet publicly available.



server.runMiddlewares = function(url, req, res, out) {
  var index = -1;
  var middlewares = this._usingMiddlewares;
  // run the next middleware if it is exists
  function next(err) {
    index++;
    // current middleware
    var curr = middlewares[index];
    if (curr) {
      var check = new RegExp(curr.route);
      // Check the route
      if (check.test(url)) {
        try {
          function later() {
            debug('A middleware says it need to be later on %s', url);
            // The dependencies do not right now
            if (middlewares.indexOf(curr) !== middlewares.length - 1) {
              _later(curr);
              index--;
              next();
            } else {
              debug('A middleware dependencies wrong');
              // This middleware can not run
              out();
            }
          }
          // Run the middleware
          if (utils.isFunc(curr.handler)) {
            // Normal middleware function
            curr.handler(req, res, next, later);
          } else if (utils.isObject(curr.handler) && utils.isFunc(curr.handler.emit)) {
            // Server object
            curr.handler.emit('request', req, res, next, later);
          } else {
            // There are something wrong about the middleware
            next();
          }
        } catch(err) {
          next();
        }
      } else {
        next();
      }
    } else {
      // Out to next step of the pipeline
      out();
    }
  }
  // if the middleware depend on other middlewares,
  // it can let it later to run
  function _later(curr) {
    var i = middlewares.indexOf(curr);
    var _tmp1 = middlewares.slice(0, i);
    _tmp1.push(middlewares[i + 1], curr);
    var _tmp2 = middlewares.slice(i + 2);
    [].push.apply(_tmp1, _tmp2);
    middlewares = _tmp1;
  }
  // first middleware
  next();
  return this;
};

While this code may seem extremely complex, it will be much clearer if we simplify it.


server.runMiddlewares = function(url, req, res, out) {
  var index = -1;
  var middlewares = this._usingMiddlewares;
  // run the next middleware if it is exists
  function next(err) {
    index++;
    // current middleware
    var curr = middlewares[index];
    if (curr) {
      var check = new RegExp(curr.route);
      // Check the route
      if (check.test(url)) {
          // run the current middleware
          curr.handler(req, res, next);
      } else {
        next();
      }
    } else {
      // Out to next step of the pipeline
      out();
    }
  }
  // first middleware
  next();
  return this;
};

Recursion can be used in the implementation of middleware systems, because recursion is the most suitable for asynchronous I /o process response in node.js.

In this middleware implementation code, this._usingmiddlewares is a loop array, function next() is the body of the loop, where check.test(url) is the judgment condition, and the loop handling code is the first index counter in the body of the loop self-increment 1 and the recursive call of the next function itself.


Related articles: