An Analysis of the Node.js Buffer Module: Principles and Usage

  • 2021-09-16 06:19:20
  • OfStack

Buffer is an important concept and feature in Node.js that gives developers the ability to work with binary data. This article works through several questions to deepen our understanding and use of Buffer:

  • Understanding Buffer
  • How to request out-of-heap memory
  • How to calculate byte length
  • How to convert character encoding
  • Understanding shared memory and copied memory

Understanding Buffer (buffer)

Buffer is a core Node.js API that lets us process binary data streams. It is used much like the Uint8Array introduced in ES2015 (in fact, Buffer is a subclass of Uint8Array), but because of the needs of Node.js it adds a richer, more specialized API.

Uint8Array literally means an array of 8-bit unsigned integers. One byte is 8 bits, so a byte can be written as two hexadecimal digits (4 bits each).

const buf = Buffer.alloc(1);
console.log(buf); // output: <Buffer 00>
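
As a small illustration of the two points above (a sketch added for clarity, with illustrative variable names), the following checks that a Buffer is also a Uint8Array and prints one byte as two hexadecimal digits:

```javascript
// A Buffer instance is also a Uint8Array, so every element is one byte (0-255).
const oneByte = Buffer.alloc(1);
const isTypedArray = oneByte instanceof Uint8Array;
console.log(isTypedArray); // true

// 255 is the largest value a byte can hold; in hexadecimal it is "ff".
oneByte[0] = 255;
const hexText = oneByte.toString("hex");
console.log(hexText); // ff

// Values outside 0-255 wrap around (modulo 256), as with any Uint8Array.
oneByte[0] = 256;
const wrapped = oneByte[0];
console.log(wrapped); // 0
```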

How to Request Out-of-Heap Memory

Buffer memory is allocated outside the V8 heap, so it is not subject to the Node.js limit on heap memory size. Node.js 12 provides four kinds of API for requesting out-of-heap memory:

  • Buffer.from()
  • Buffer.alloc(size[, fill[, encoding]])
  • Buffer.allocUnsafe(size)
  • Buffer.allocUnsafeSlow(size)

Buffer.alloc vs Buffer.allocUnsafe

When memory is requested, the block may previously have held other data. If the old data is not cleared, there is a security risk of leaking it; if it is cleared, allocation is somewhat slower. Which approach to use depends on the actual situation.

  • Buffer.alloc: requests memory of the specified size and clears the old data, filling it with 0 by default
  • Buffer.allocUnsafe: requests memory of the specified size without clearing the old data, which is faster
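
The difference can be sketched as follows (an illustrative example, not from the original article; the variable names are made up). The key point is that a buffer from Buffer.allocUnsafe should be fully overwritten before it is read:

```javascript
// Buffer.alloc always returns initialized memory (zeros by default).
const zeroed = Buffer.alloc(4);
console.log(zeroed); // <Buffer 00 00 00 00>

// The optional fill argument repeats a value across the buffer.
const filled = Buffer.alloc(4, "ab");
console.log(filled.toString()); // abab

// Buffer.allocUnsafe returns uninitialized memory: its contents are
// unspecified until every byte has been overwritten.
const scratch = Buffer.allocUnsafe(4);
scratch.write("node"); // overwrite all 4 bytes before reading
console.log(scratch.toString()); // node
```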

Using these APIs, Buffer.alloc can be re-implemented by hand:

// Allocate uninitialized memory, then overwrite every byte with the fill value.
function polyfillAlloc(size, fill = 0, encoding = "utf8") {
  const buf = Buffer.allocUnsafe(size);
  buf.fill(fill, 0, size, encoding);
  return buf;
}

Buffer.allocUnsafe vs Buffer.allocUnsafeSlow

The naming already gives the effect away: Buffer.allocUnsafeSlow is slower. The reason is that when Buffer.allocUnsafe creates a new Buffer instance smaller than 4KB (half of the default Buffer.poolSize of 8KB), the memory is sliced from a pre-allocated Buffer. This avoids putting excessive pressure on the garbage collector by creating many small independent Buffer instances.

This approach improves performance and memory usage because those small instances do not each need to be tracked and cleaned up. Buffer.allocUnsafeSlow never uses this pool.
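
The pooling can be observed through the buffer property, which exposes the ArrayBuffer behind a Buffer. The following is a rough sketch that assumes the pooling behavior of current Node.js releases; the variable names are illustrative:

```javascript
// Buffer.poolSize is the size in bytes of the pre-allocated pool (8192 by default).
console.log(Buffer.poolSize);

// A small allocUnsafe buffer is a view onto the pool's ArrayBuffer,
// so its backing store is as large as the whole pool.
const pooled = Buffer.allocUnsafe(10);
const pooledBacking = pooled.buffer.byteLength;
console.log(pooledBacking === Buffer.poolSize); // true

// allocUnsafeSlow never uses the pool: its backing store is exactly its own size.
const unpooled = Buffer.allocUnsafeSlow(10);
const unpooledBacking = unpooled.buffer.byteLength;
console.log(unpooledBacking); // 10

// Large allocUnsafe requests bypass the pool as well.
const large = Buffer.allocUnsafe(Buffer.poolSize * 2);
const largeIsStandalone = large.buffer.byteLength === large.length;
console.log(largeIsStandalone); // true
```

Because a pooled Buffer keeps the whole pool alive for as long as it is referenced, Buffer.allocUnsafeSlow can be the better choice for a small buffer that is retained for a long time.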

How to Calculate Byte Length

With Buffer, you can obtain the actual number of bytes that data occupies. For example, a single Chinese character has a string length of 1, but encoded as UTF-8 it takes up 3 bytes.

Buffer.byteLength() returns the byte length of a string in the specified encoding:

const str = "本文原文地址: xxoo521.com"; // 6 Chinese characters + 13 ASCII characters

console.log(Buffer.byteLength(str, "utf8")); // output: 31 (6 * 3 + 13)
console.log(str.length); // output: 19

You can also read the length property of a Buffer instance directly (not recommended, because it allocates a whole Buffer just to measure it):

console.log(Buffer.from(str, "utf8").length); // output: 31
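
To see how character count and byte count diverge, here is a small sketch comparing a few characters in two encodings. describeLength is a hypothetical helper introduced only for this illustration:

```javascript
// Hypothetical helper: report string length and byte lengths for one piece of text.
function describeLength(text) {
  return {
    chars: text.length, // number of UTF-16 code units
    utf8Bytes: Buffer.byteLength(text, "utf8"),
    utf16Bytes: Buffer.byteLength(text, "utf16le"),
  };
}

console.log(describeLength("a")); // { chars: 1, utf8Bytes: 1, utf16Bytes: 2 }
console.log(describeLength("\u00e9")); // { chars: 1, utf8Bytes: 2, utf16Bytes: 2 }
console.log(describeLength("中")); // { chars: 1, utf8Bytes: 3, utf16Bytes: 2 }

// Characters outside the Basic Multilingual Plane take two code units in a
// JavaScript string, so even str.length is not a true character count.
console.log(describeLength("\u{1F600}")); // { chars: 2, utf8Bytes: 4, utf16Bytes: 4 }
```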

How to convert character encoding

The encodings currently supported by Node.js are: ascii, utf8, utf16le, ucs2 (an alias of utf16le), base64, latin1, binary (an alias of latin1) and hex. Other encodings require a third-party library.
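
Whether a given encoding name is supported can be checked at runtime with Buffer.isEncoding(). A minimal sketch, where the candidate list is only an example:

```javascript
// "gbk" is included as an example of an encoding Buffer does not support.
const candidates = ["utf8", "hex", "base64", "latin1", "gbk"];

// Keep only the names that Buffer recognizes as character encodings.
const supported = candidates.filter((name) => Buffer.isEncoding(name));
console.log(supported); // [ 'utf8', 'hex', 'base64', 'latin1' ]
```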

Here is a transcoding helper for Node.js built on Buffer.from() and buf.toString():

// Decode str from the `from` encoding into bytes, then encode those bytes as `to`.
function trans(str, from = "utf8", to = "utf8") {
  const buf = Buffer.from(str, from);
  return buf.toString(to);
}

// output: 5Y6f5paH5Zyw5Z2AOiB4eG9vNTIxLmNvbQ==
console.log(trans("原文地址: xxoo521.com", "utf8", "base64"));
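
The same idea works in both directions. As a sketch, the following self-contained variant (convertEncoding is a hypothetical name chosen for this example) encodes a string to base64 and hex, then decodes it back:

```javascript
// Hypothetical helper: decode text from one encoding, re-encode it in another.
function convertEncoding(text, from = "utf8", to = "utf8") {
  return Buffer.from(text, from).toString(to);
}

const asBase64 = convertEncoding("hello", "utf8", "base64");
console.log(asBase64); // aGVsbG8=

const asHex = convertEncoding("hello", "utf8", "hex");
console.log(asHex); // 68656c6c6f

// Decoding the base64 text restores the original string.
const roundTrip = convertEncoding(asBase64, "base64", "utf8");
console.log(roundTrip); // hello
```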

Shared memory and copied memory

When creating a Buffer instance and manipulating binary data, it is important to know whether an interface shares the underlying memory or copies it.

For example, Buffer.from(), which creates a Buffer instance, behaves differently under the hood depending on the type of its argument.

For a more vivid explanation, please look at the following two pieces of code.

Code 1:

const buf1 = Buffer.from("buffer");
const buf2 = Buffer.from(buf1); // Copies the data of buf1 into a new instance
buf1[0]++;

console.log(buf1.toString()); // output: cuffer
console.log(buf2.toString()); // output: buffer

Code 2:

const arr = new Uint8Array(1);
arr[0] = 97;

const buf1 = Buffer.from(arr.buffer);
console.log(buf1.toString()); // output: a

arr[0] = 98;
console.log(buf1.toString()); // output: b

In the second snippet, the argument passed to Buffer.from is an ArrayBuffer, so Buffer.from only creates a view and does not copy the underlying memory. buf1 and arr share the same memory.
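
To put the two behaviors side by side, here is a sketch with illustrative variable names. It also shows the optional byteOffset and length arguments of Buffer.from(arrayBuffer), and that buf.subarray() returns a view rather than a copy:

```javascript
const source = new Uint8Array([1, 2, 3, 4]);

// Passing the typed array itself copies its contents.
const copied = Buffer.from(source);

// Passing the underlying ArrayBuffer creates a view that shares memory.
const shared = Buffer.from(source.buffer);

// byteOffset and length expose only part of the ArrayBuffer, still shared.
const partial = Buffer.from(source.buffer, 1, 2);

source[1] = 99;
console.log(copied[1]); // 2  (the copy is unaffected)
console.log(shared[1]); // 99 (same memory as source)
console.log(partial[0]); // 99 (offset 1 of the same memory)

// subarray also shares memory, so writes go through to source.
const view = shared.subarray(0, 2);
view[0] = 7;
console.log(source[0]); // 7
```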

When working with Buffer, pay special attention to the difference between sharing and copying, because bugs caused by confusing the two are hard to track down.

