Understanding floating-point precision in JavaScript

  • 2020-11-03 22:01:08
  • OfStack

Most programming languages have several numeric data types, but JavaScript has only one. You can use the typeof operator to check the type of a number. JavaScript classifies integers and floating-point values alike simply as numbers.

typeof 17; // "number"
typeof 98.6; // "number"
typeof -21.3; // "number"

In fact, all numbers in JavaScript are double-precision floating-point numbers, that is, the 64-bit encoding of numbers specified by the IEEE 754 standard, commonly known as "doubles". If this makes you wonder how JavaScript represents integers, remember that doubles can represent integers perfectly with up to 53 bits of precision. All integers from -9,007,199,254,740,992 (-2^53) to 9,007,199,254,740,992 (2^53) are valid doubles. So, despite the lack of a distinct integer type in JavaScript, integer arithmetic works perfectly well.
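The 53-bit limit can be observed directly. The following is a minimal sketch, assuming an ES2016+ engine (for the ** operator); the variable names are illustrative only.

```javascript
// Number.MAX_SAFE_INTEGER is 2^53 - 1, the largest integer n for which
// both n and n + 1 are exactly representable as doubles.
const maxSafe = Number.MAX_SAFE_INTEGER;
const twoTo53 = 2 ** 53;

console.log(maxSafe === twoTo53 - 1);       // true
console.log(twoTo53 + 1 === twoTo53);       // true: 2^53 + 1 is not representable
console.log(Number.isSafeInteger(maxSafe)); // true
console.log(Number.isSafeInteger(twoTo53)); // false
```

Beyond 2^53, consecutive integers start to collapse onto the same double, which is why 2^53 + 1 compares equal to 2^53.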
Most arithmetic operators work with integers, floating-point values, or a combination of both.

0.1 * 1.9; // 0.19
-99 + 100; // 1
21 - 12.3; // 8.7
2.5 / 5; // 0.5
21 % 8; // 5

However, the bitwise arithmetic operators are special. Rather than operating on their operands directly as floating-point numbers, they implicitly convert them to 32-bit integers. (Specifically, the operands are treated as 32-bit, big-endian, two's complement integers.) Take a bitwise OR expression as an example:

8|1; //9

This seemingly simple expression actually requires several steps to evaluate. As mentioned earlier, the numbers 8 and 1 in JavaScript are doubles. But they can also be represented as 32-bit integers, that is, sequences of thirty-two 1s and 0s. As a 32-bit integer, the number 8 looks like this:

00000000000000000000000000001000

You can see this for yourself by using the toString method of numbers:

(8).toString(2) //"1000"

The argument to toString specifies the radix, in this case base 2 (binary). The result drops the extra 0 bits on the left, since they do not affect the value.
As a 32-bit integer, the number 1 is represented as follows:

00000000000000000000000000000001

A bitwise OR expression combines the two bit sequences: a bit of the result is 1 if the corresponding bit in either operand is 1. The resulting bit pattern is:

00000000000000000000000000001001

This sequence represents the integer 9. You can use the standard library function parseInt to verify, again with base 2:

parseInt("1001", 2); // 9

(Again, the leading 0 bits are not necessary because they do not affect the result of the operation.)
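The steps above can be reproduced in code. This is only a sketch: toBits32 is a hypothetical helper introduced here for illustration, and it assumes an engine that supports String.prototype.padStart (ES2017+).

```javascript
// Hypothetical helper: render the low 32 bits of a number as a bit string.
function toBits32(n) {
  // >>> 0 reinterprets the value as an unsigned 32-bit integer,
  // and padStart restores the leading zeros that toString(2) omits.
  return (n >>> 0).toString(2).padStart(32, "0");
}

const bitsOf8 = toBits32(8);
const bitsOf1 = toBits32(1);
const bitsOfOr = toBits32(8 | 1);

console.log(bitsOf8);  // 00000000000000000000000000001000
console.log(bitsOf1);  // 00000000000000000000000000000001
console.log(bitsOfOr); // 00000000000000000000000000001001
console.log(parseInt(bitsOfOr, 2)); // 9
```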
All the bitwise operators work the same way: they convert their operands to integers, perform the operation on the integer bit patterns, and finally convert the result back to a standard JavaScript floating-point number. In general, these conversions require extra work from the JavaScript engine: because numbers are stored as floating point, they must be converted to integers and then back again. However, in some cases arithmetic expressions, or even variables, only ever hold integers, and an optimizing compiler can sometimes infer this and store the numbers internally as integers to avoid the redundant conversions.
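The implicit conversion to a 32-bit signed integer has visible side effects when an operand is not already a small integer. A minimal sketch (variable names are illustrative; ** assumes ES2016+):

```javascript
// Each operand is converted to a 32-bit signed integer before the operation.
const truncated = 2.7 | 0;          // fractional part is discarded: 2
const negTruncated = -2.7 | 0;      // truncation is toward zero: -2
const wrapped = (2 ** 32 + 5) | 0;  // bits above the low 32 are dropped: 5
const signFlipped = (2 ** 31) | 0;  // bit 31 is the sign bit: -2147483648

console.log(truncated, negTruncated, wrapped, signFlipped);
console.log(typeof truncated); // "number": the result is an ordinary number again
```

This is why bitwise operators are only safe on values that fit in 32 bits, even though ordinary arithmetic handles integers up to 2^53.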

The final caveat with floating-point numbers is that you should always stay on guard when using them. Floating-point numbers may seem familiar, but they are notoriously imprecise. Even some of the simplest-looking arithmetic can produce inexact results.

0.1 + 0.2; // 0.30000000000000004

Although 64 bits of precision is quite large, doubles can still represent only a finite set of numbers, not the infinite set of real numbers. Floating-point arithmetic can produce only approximate results, rounded to the nearest representable number. As you perform a series of operations, the results become less and less accurate as rounding errors accumulate. Rounding also causes surprising deviations from the usual laws of arithmetic. For example, real numbers are associative: for any real numbers x, y, and z, it is always the case that (x + y) + z = x + (y + z).

However, this is not always the case with floating-point Numbers.

(0.1 + 0.2) + 0.3; // 0.6000000000000001
0.1 + (0.2 + 0.3); // 0.6
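Both effects, accumulated rounding error and the failure of associativity, can be checked in a few lines. This is a small sketch with illustrative variable names, not a general test for floating-point error:

```javascript
// Accumulation: adding 0.1 ten times does not give exactly 1.
let sum = 0;
for (let i = 0; i < 10; i++) {
  sum += 0.1;
}
console.log(sum === 1); // false
console.log(sum);       // 0.9999999999999999

// Associativity: the grouping of the additions changes the result.
const leftGrouped = (0.1 + 0.2) + 0.3;
const rightGrouped = 0.1 + (0.2 + 0.3);
console.log(leftGrouped === rightGrouped); // false
```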

Floating-point numbers offer a trade-off between precision and performance. When precision matters, it is critical to be aware of their limitations. One useful workaround is to use integer values wherever possible, since integers can be represented without rounding. When doing calculations with money, programmers often scale the amounts to the smallest unit of the currency so that the calculation can be carried out in whole numbers. For example, if the calculation above were in dollars, we could convert it to whole numbers of cents instead:

(10 + 20) + 30; // 60
10 + (20 + 30); // 60

With integers, you do not have to worry about rounding errors, but you still have to make sure that all calculations stay within the range -2^53 to 2^53.
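The cents approach and the range check can be sketched together as follows. The helper toCents and the sample prices are hypothetical, introduced only for illustration; Math.round is used here on the assumption that the inputs are intended to be whole cents, so it only absorbs the tiny representation error left by the multiplication.

```javascript
// Hypothetical helper: convert a dollar amount to a whole number of cents.
function toCents(dollars) {
  return Math.round(dollars * 100);
}

const prices = [0.1, 0.2, 0.3]; // sample dollar amounts
const totalCents = prices.map(toCents).reduce((a, b) => a + b, 0);

console.log(totalCents);       // 60
console.log(totalCents / 100); // 0.6

// Guard against leaving the range where integers are exact.
console.log(Number.isSafeInteger(totalCents)); // true
```

Summing in cents gives the same total regardless of how the additions are grouped, and the conversion back to dollars happens only once, at the end.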


To summarize: JavaScript numbers are double-precision floating-point numbers. Integers in JavaScript are just a subset of doubles, not a separate data type. The bitwise operators treat numbers as 32-bit signed integers.

That concludes this introduction to JavaScript floating-point numbers. Always watch out for precision traps in floating-point arithmetic. I hope this article helps you in your learning.
