Understanding how to handle large and high-precision numbers

  • 2020-04-01 23:40:55
  • OfStack

float and double are the single- and double-precision floating-point types. Their largest finite values are about 3.4E+38 (3.4 times 10 to the 38th power) and 1.7E+308 (1.7 times 10 to the 308th power), respectively.

So with only 6-7 significant digits, how can a float hold something as big as 3.4 times 10 to the 38th power? Likewise, shouldn't a double with 15-16 significant digits be unable to hold 1.7 times 10 to the 308th power?

Answer: the 6-7 digits of a float refer to the number of significant digits (the precision), not to the magnitude of the value. For example, 3.14159267 has nine significant digits and a value between 3 and 4, while 350 has three significant digits and a value between 300 and 400. So a float can be as large as 3.4E+38, but it carries only 6-7 significant digits, so if 3.14159267 is assigned to a float, precision is lost. For example:

float a=3234567.1;
float b=3234567;
if( a==b )
    printf("YES");
else
    printf("NO");

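This outputs YES, because the .1 at the end of a lies beyond the 6-7 significant digits a float can keep. (With a=1234567.1; b=1234567; the output would be NO.) Why? We need to look at how the part beyond the available precision is handled. It is not rounded off in decimal; the value is stored as the nearest number that fits in the float's 24 binary significand bits. Near 3234567 adjacent floats are 0.25 apart, so 3234567.1 is stored as 3234567.0; near 1234567 they are 0.125 apart, so 1234567.1 is stored as 1234567.125. So sometimes you get 6 digits of precision and sometimes 7, depending on the binary representation of the number.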

So how do we represent numbers that are very long or very precise?
Such as 123456789123456789123456789 (a 27-digit integer);
Such as 3.14159012345678901234567890123 (a decimal with 30 significant digits). Numbers this long do not fit in long, double or float; this is where a string (character array) is used.

For integers of up to 20 digits, a 64-bit unsigned integer is an alternative:

unsigned __int64 n;

Its maximum value is 18446744073709551615 (20 digits), about 1.8E+19, which should be enough for normal use.

However, a value of type unsigned __int64 cannot be output with cout in C++ on older Microsoft compilers, because cout has no operator<< overload for it. With printf, %d, %f and %l obviously cannot handle values of up to 20 digits, and in my tests the output went wrong for values above about 9.22E+18, which is the limit of a signed 64-bit integer (9223372036854775807). The most reliable way is to convert the number to a string, as follows:

char buffer[65];
printf("%s", _ui64toa(n, buffer,10) );

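The function _ui64toa converts n to a string in the given radix (10 here) and stores it in the character array buffer. The array has 65 elements because a 64-bit value needs at most 64 digits in base 2, plus the terminating null character.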

A reference program that converts numbers to strings is as follows:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>   /* strlen */

int main( void )
{
   char buffer[65];   /* 64 binary digits plus the terminating null */
   int r;

   /* 32-bit int: -1 keeps its sign only in base 10 */
   for( r=10; r>=2; --r )
   {
     _itoa( -1, buffer, r );
     printf( "base %d: %s (%d chars)\n", r, buffer, (int)strlen(buffer) );
   }
   printf( "\n" );

   /* signed 64-bit integer */
   for( r=10; r>=2; --r )
   {
     _i64toa( -1L, buffer, r );
     printf( "base %d: %s (%d chars)\n", r, buffer, (int)strlen(buffer) );
   }
   printf( "\n" );

   /* unsigned 64-bit integer */
   for( r=10; r>=2; --r )
   {
     _ui64toa( 0xffffffffffffffffL, buffer, r );
     printf( "base %d: %s (%d chars)\n", r, buffer, (int)strlen(buffer) );
   }
}



Output
base 10: -1 (2 chars)
base 9: 12068657453 (11 chars)
base 8: 37777777777 (11 chars)
base 7: 211301422353 (12 chars)
base 6: 1550104015503 (13 chars)
base 5: 32244002423140 (14 chars)
base 4: 3333333333333333 (16 chars)
base 3: 102002022201221111210 (21 chars)
base 2: 11111111111111111111111111111111 (32 chars)

base 10: -1 (2 chars)
base 9: 145808576354216723756 (21 chars)
base 8: 1777777777777777777777 (22 chars)
base 7: 45012021522523134134601 (23 chars)
base 6: 3520522010102100444244423 (25 chars)
base 5: 2214220303114400424121122430 (28 chars)
base 4: 33333333333333333333333333333333 (32 chars)
base 3: 11112220022122120101211020120210210211220 (41 chars)
base 2: 1111111111111111111111111111111111111111111111111111111111111111 (64 chars)

base 10: 18446744073709551615 (20 chars)
base 9: 145808576354216723756 (21 chars)
base 8: 1777777777777777777777 (22 chars)
base 7: 45012021522523134134601 (23 chars)
base 6: 3520522010102100444244423 (25 chars)
base 5: 2214220303114400424121122430 (28 chars)
base 4: 33333333333333333333333333333333 (32 chars)
base 3: 11112220022122120101211020120210210211220 (41 chars)
base 2: 1111111111111111111111111111111111111111111111111111111111111111 (64 chars)

PS: _itoa can also be used to convert a decimal integer into a binary string:

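#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main( void )
{
   char buffer[65];
   int r = 2;   /* radix: 2 produces a binary string */
   _itoa( 12, buffer, r );
   printf( "base %d: %s (%d chars)\n", r, buffer, (int)strlen(buffer) );
}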

Another approach is to define your own character array to hold the digits of a very long number (decimals can be handled the same way), then implement arithmetic on numbers stored in that string form and overload the operators. This is said to be fairly efficient.
