Node.js file copying and directory traversal for local file operations

  • 2020-12-21 17:57:41
  • OfStack

File copy
NodeJS provides a basic file manipulation API, but historically it offered no higher-level helper for copying files (fs.copyFile was only added in v8.5.0), so let's start by writing a file copy program ourselves. Like the copy command, our program needs to accept both a source and a target file path.

Small file copy
Using the fs module built into NodeJS, a simple implementation looks like this.


var fs = require('fs');

function copy(src, dst) {
  fs.writeFileSync(dst, fs.readFileSync(src));
}

function main(argv) {
  copy(argv[0], argv[1]);
}

main(process.argv.slice(2));

The above program uses fs.readFileSync to read the file contents from the source path and fs.writeFileSync to write the file contents to the target path.

Tip: process is a global variable, and command-line arguments are available through process.argv. Because argv[0] is always the absolute path of the NodeJS executable and argv[1] is always the absolute path of the main module, the first command-line argument is found at argv[2].
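To make the argv layout described above concrete, here is a minimal sketch. The helper name parseArgs, the usage message, and the sample paths are assumptions made for illustration; they are not part of the original program.

```javascript
// Hypothetical helper: extract the source and target paths from an argv array.
function parseArgs(argv) {
  // argv[0] is the node executable and argv[1] is the main module,
  // so user-supplied arguments begin at index 2.
  var args = argv.slice(2);

  if (args.length < 2) {
    throw new Error('Usage: node copy.js <src> <dst>');
  }

  return { src: args[0], dst: args[1] };
}

// A hand-written array stands in for the real process.argv here.
var parsed = parseArgs(['/usr/local/bin/node', '/home/user/copy.js', 'a.txt', 'b.txt']);
console.log(parsed.src, parsed.dst); // prints "a.txt b.txt"
```

In a real script the call would be parseArgs(process.argv).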

Large file copy
The program above works fine for small files, but reading the entire file into memory and then writing it back to disk is not suitable for large files, because memory can be exhausted. For large files, we have to read a little and write a little until the copy is complete. So the program needs to be modified as follows.


var fs = require('fs');

function copy(src, dst) {
  fs.createReadStream(src).pipe(fs.createWriteStream(dst));
}

function main(argv) {
  copy(argv[0], argv[1]);
}

main(process.argv.slice(2));

The above program uses fs.createReadStream to create a read-only data stream for the source file and fs.createWriteStream to create a write-only data stream for the target file, and connects the two streams with the pipe method. Put more abstractly, once the two are connected, water flows through the pipe from one bucket into the other.
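The pipe call above does not report errors or signal completion. One possible way to add both is stream.pipeline (available since NodeJS v10), sketched below. The helper name copyLarge and its done callback are assumptions for illustration, not part of the original program.

```javascript
var fs = require('fs');
var stream = require('stream');

// Hypothetical helper: stream src into dst chunk by chunk.
// done(err) is called once, with an error if either stream failed,
// or with no error after all data has been written.
function copyLarge(src, dst, done) {
  stream.pipeline(
    fs.createReadStream(src),   // read-only stream for the source file
    fs.createWriteStream(dst),  // write-only stream for the target file
    done
  );
}
```

It could be called as copyLarge(argv[0], argv[1], function (err) { if (err) { console.error(err); } }), so that a missing source file produces an error message instead of an unhandled stream error.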

Directory traversal

Traversal of directories is a common requirement when manipulating files. For example, if you write a program that needs to find and process all JS files in a specified directory, you need to traverse the entire directory.

Recursive algorithm
Use recursion when traversing directories; otherwise it is hard to write clean code. Recursive algorithms resemble mathematical induction: they solve a problem by repeatedly reducing its size. The following example illustrates this approach.


function factorial(n) {
  if (n === 1) {
    return 1;
  } else {
    return n * factorial(n - 1);
  }
}

The above function calculates the factorial of N (N!), assuming N is a positive integer. As you can see, when N is greater than 1, the problem is reduced to calculating N times the factorial of N-1. When N is equal to 1, the problem has reached its minimum size and needs no further simplification, so 1 is returned directly.

Pitfall: Code written with recursion is concise, but every recursive step is a function call. When performance is a priority, the recursive algorithm should be converted into an iterative (loop-based) one to reduce the number of function calls.
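As an illustration of that conversion, the factorial above can be written with a loop. This is a minimal sketch; the name factorialIterative is an assumption and does not appear in the original.

```javascript
// Loop-based factorial: one function call in total, regardless of n.
// Like the recursive version, it assumes n is a positive integer.
function factorialIterative(n) {
  var result = 1;

  for (var i = 2; i <= n; i++) {
    result *= i;
  }

  return result;
}

console.log(factorialIterative(5)); // prints 120
```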

Traversal algorithm
A directory is a tree structure, and a depth-first, pre-order traversal is generally used to walk it. Depth-first means that after reaching a node, its child nodes are traversed before its sibling nodes. Pre-order means that a node counts as visited the first time it is reached, rather than the last time the traversal returns to it. With this approach, the nodes of the following tree are visited in the order A > B > D > E > C > F.


      A
     / \
    B   C
   / \   \
  D   E   F
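The visiting order can be checked with a small in-memory sketch. The object layout (name and children fields) and the function name preorder are assumptions chosen for illustration.

```javascript
// The example tree, with each node listing its children from left to right.
var tree = {
  name: 'A',
  children: [
    { name: 'B', children: [
      { name: 'D', children: [] },
      { name: 'E', children: [] }
    ] },
    { name: 'C', children: [
      { name: 'F', children: [] }
    ] }
  ]
};

// Depth-first, pre-order: visit the node itself first, then descend
// into each child before moving on to the next sibling.
function preorder(node, visit) {
  visit(node.name);
  node.children.forEach(function (child) {
    preorder(child, visit);
  });
}

var order = [];
preorder(tree, function (name) {
  order.push(name);
});
console.log(order.join(' > ')); // prints "A > B > D > E > C > F"
```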

Synchronous traversal
With the algorithm in hand, a simple directory traversal function can be implemented as follows.


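var fs = require('fs');
var path = require('path');

function travel(dir, callback) {
  // Read all entry names in dir, then handle each one in turn.
  fs.readdirSync(dir).forEach(function (file) {
    var pathname = path.join(dir, file);

    if (fs.statSync(pathname).isDirectory()) {
      // Subdirectory: descend into it before continuing (depth-first).
      travel(pathname, callback);
    } else {
      // File: hand its path to the caller.
      callback(pathname);
    }
  });
}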

As you can see, the function takes a directory as the starting point of the traversal. When it encounters a subdirectory, it traverses that subdirectory first. When it encounters a file, it passes the file's path (the starting directory joined with the relative path, so absolute only if the starting directory is absolute) to the callback function. Once the callback has the file path, it can apply whatever checks and processing it needs. For example, assume the following directory structure:


- /home/user/
  - foo/
    x.js
  - bar/
    y.js
  z.css

Traversing this directory with the following code produces output like the following.


travel('/home/user', function (pathname) {
  console.log(pathname);
});


/home/user/foo/x.js
/home/user/bar/y.js
/home/user/z.css
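The callback is where filtering happens. As a sketch of the "find all JS files" use case mentioned earlier, the helper below collects every path ending in .js. The name findJsFiles is an assumption; travel is repeated so that the block runs on its own.

```javascript
var fs = require('fs');
var path = require('path');

// Same synchronous traversal as in the listing above.
function travel(dir, callback) {
  fs.readdirSync(dir).forEach(function (file) {
    var pathname = path.join(dir, file);

    if (fs.statSync(pathname).isDirectory()) {
      travel(pathname, callback);
    } else {
      callback(pathname);
    }
  });
}

// Hypothetical helper: return the paths of all .js files under dir.
function findJsFiles(dir) {
  var result = [];

  travel(dir, function (pathname) {
    if (path.extname(pathname) === '.js') {
      result.push(pathname);
    }
  });

  return result;
}
```

For the example directory above, findJsFiles('/home/user') would return the paths of x.js and y.js and skip z.css.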

Asynchronous traversal
If directories and file states are read with the asynchronous API, the directory traversal function is somewhat more complicated to implement, but the principle is the same. The asynchronous version of the travel function is shown below.


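var fs = require('fs');
var path = require('path');

// Note: the err arguments are ignored here for brevity.
function travel(dir, callback, finish) {
  fs.readdir(dir, function (err, files) {
    // next(i) handles entry i and moves on to i + 1 only after
    // that entry has been fully processed.
    (function next(i) {
      if (i < files.length) {
        var pathname = path.join(dir, files[i]);

        fs.stat(pathname, function (err, stats) {
          if (stats.isDirectory()) {
            // Finish the whole subdirectory before the next entry.
            travel(pathname, callback, function () {
              next(i + 1);
            });
          } else {
            // The callback must invoke its second argument to continue.
            callback(pathname, function () {
              next(i + 1);
            });
          }
        });
      } else {
        // All entries handled: notify the caller, if it asked to be.
        finish && finish();
      }
    }(0));
  });
}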

The techniques for writing asynchronous traversal functions are not covered here; later sections discuss them in detail. For now, it is enough to see that asynchronous programming is considerably more complicated.
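One detail of the asynchronous version that is easy to miss is its calling convention: the callback receives a second argument and must invoke it, otherwise the traversal stops at the first file. The sketch below shows one way to call it. The helper name listFiles and its done callback are assumptions, and travel is reproduced without error handling so that the block runs on its own.

```javascript
var fs = require('fs');
var path = require('path');

// Asynchronous traversal, reproduced from the listing above.
function travel(dir, callback, finish) {
  fs.readdir(dir, function (err, files) {
    (function next(i) {
      if (i < files.length) {
        var pathname = path.join(dir, files[i]);

        fs.stat(pathname, function (err, stats) {
          if (stats.isDirectory()) {
            travel(pathname, callback, function () {
              next(i + 1);
            });
          } else {
            callback(pathname, function () {
              next(i + 1);
            });
          }
        });
      } else {
        finish && finish();
      }
    }(0));
  });
}

// Hypothetical helper: gather every file path under dir, then call done(paths).
function listFiles(dir, done) {
  var found = [];

  travel(dir, function (pathname, next) {
    found.push(pathname);
    next(); // tell travel to continue with the next entry
  }, function () {
    done(found); // runs once, after the whole tree has been walked
  });
}
```

A call such as listFiles('/home/user', function (paths) { console.log(paths); }) prints the collected paths only after the traversal has finished.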

