Skip to content Skip to sidebar Skip to footer

How Do Libraries/programming Languages Convert Floats To Strings

This is a mystery that I was trying to figure out when I was 15, but I failed. I still don't know the answer. Here's a naive and flawed solution (like some other failed attempts I

Solution 1:

I am not a JAVASCRIPT coder so I stick with C++ ...

Converting number to string in decadic base is more complicated then using binary or its powers bases (bin,oct,hex) due to fact that all the numbers on usual computer are stored in binary not in decadic. Also it is not the same if you converting integer or fractional part. Let assume we have number x and want string s encoded in ASCII so this is how basic conversion work:

  1. Handle sign

    s="+";
    if (x<0.0)  { x=-x; s="-"; }
    

    as you can see its easy. Some number formats have a separate sign bit (usually the msb one) so in such case the code can be converted to bit operations for example 32 bit float:

    DWORD* dw=(DWORD*)(&x); // allow bit manipulation
    s="+";
    s[0]+=(((*dw)>>30)&2);  // ASCII +,- codes are 2 apart
    (*dw)&=0x7FFFFFFF;      //x=abs(x)
    

    so we have extracted sign character for our string and make x unsigned.

  2. Handle Integer part of x

    integer is converted to string by dividing it by the printing base so:

    y=floor(x); // integer partif (y)
     for (;y;) // until number is nonzero
     {
     s+='0'+(y%10); // works only for up to 10 base
     y/=10;
     }
    else s+='0'; // handle y=0 separately 

    so the remainder of each division is the wanted digit of the string but in reverse order. So after the conversion reverse the digits in string by a single for loop or you can store the number digits in reverse order directly. But for tat you need to know the number of digits of the integer part of the number. That is done by

    digits = ceil(log(y)/log(base)) + 1

    so for decadic:

    digits = ceil(log10(y)) + 1
  3. handle fractional part of x

    this is converted by multiplying by the conversion base.

    z=x-floor(x); // fractional partif (z)
     for (s+='.';z;) // until number is nonzero here you can limit to number of digits
     {
     z*=10.0;
     s+='0'+int(floor(z)); // works only for up to 10 base
     z-=floor(z);
     }
    

    this is returning the digits in their order so no reversing this time...

I encoded all the code directly in the SO editor so there might be hidden syntax errors.

Now usual print functions have also formatting which adds zero or space padding or cut off fractional digits above some value etc ...

If you have a bignum x then this will be much much slower because you can not handle basic +,-,*,/ operations as O(1) anymore and its usually faster to create hex string and convert the string to decadic on 8bit arithmetics instead or use biggest power of 10 that fits into used DATA WORD the bignum is stored with. The hex -> dec conversion can be done like this:

but again for very big strings it will be slow. In such case it can be speed up by using FFT/NTT approaches similar to Schönhage-Strassen multiplication but I never tried to use it for printing before so I lack any insights on such approach.

Also beware that determining the number of digits of a value is not regular for fractional part of number (see the link above) so you need to mind that you can be off by 1-2 digits.

[Edit1] rounding the string

simply if you detect n consequent zeros or nines in the fractional part (after any nonzero digit) you need to stop printing and round. Zeros are just cut of and nines you need to cut off too and increment the rest by one in the string. Such operation might overflow to 1 digit not present in the string so in such case just insert 1.

When I put all together I come up with this C++/VCL code (based on VCLAnsiString data type):

AnsiString print(double x){
    char c;
    int i,j;
    double y,a;
    AnsiString s;

    constint B=10;                 // chose base 2...16constdouble b=B;               // baseconstdouble _b=1.0/b;          // 1/baseconstchar digit[16]="0123456789ABCDEF";

    #define _enable_rounding#ifdef _enable_roundingconstint round_digits=5;       // min consequent 0s or B-1s to triger roundingint cnt0=0,cnt1=0;              // consequent digit countersint ena=0;                      // enabled consequent digit counters? after first nonzero digit#endif// here you should handle NaN and Inf cases// handle sign
    s="+";
    if (x<0.0)  { x=-x; s="-"; }
    // integer part
    y=floor(x);
    if (y) for (;y>0.0;)        // until number is nonzero
        {
        a=y; y=floor(y*_b);     // the same as y/=10 on integers
        a-=y*b;                 // the same as a=y%10 on integers
        i=int(a);
        s+=digit[i];
        #ifdef _enable_rounding
        ena|=i;
        #endif
        }
    else s+='0';                // handle y=0 separately// reverse string skipping +/- sign (beware AnsiString is indexed from 1 up to its length included!!!)for (i=2,j=s.Length();i<j;i++,j--){ c=s[i]; s[i]=s[j]; s[j]=c; }
    // fractional part
    y=x-floor(x);
    if (y) for (s+='.';y>0.0;)      // until number is nonzero here you can limit to number of digits
        {
        y*=b;
        a=floor(y);
        y-=a;
        i=int(a);
        s+=digit[i];
        #ifdef _enable_rounding
        ena|=i;
        // detect consequent rounding digitsif (ena)
            {
                 if (i==  0){ cnt0++; cnt1=0; }
            elseif (i==B-1){ cnt1++; cnt0=0; }
            else            { cnt0=0; cnt1=0; }
            }
        // round down .???00000000 by cut of zerosif (cnt0>=round_digits)
            {
            s=s.SubString(1,s.Length()-cnt0);       // by cut of zerosbreak;
            }
        // round up .???999999999 by increment and cut of zeros (only base 10) !!!if (cnt1>=round_digits)
            {
            s=s.SubString(1,s.Length()-cnt1);       // cut off ninesfor (j=1,i=s.Length();(i>=2)&&(j);i--)
                {
                c=s[i];
                if (c=='.') continue;
                if (c=='9'){ s[i]='0'; continue; }
                j=0; s[i]++;
                }
            if (j) s=s.Insert("1",i+1); // overflow -> insert "1" after signif (s[s.Length()]=='.')     // cut off decimal point if no fractional part left
             s=s.SubString(1,s.Length()-1);
            break;
            }
        #endif
        }
    return s;
    }

You can select base B=<2,16>. You can enable disable the rounding by using/commenting the #define _enable_rounding. Beware the rounding routine works only for base 10 as for different bases the increment routine would have a bit different code/constants and too lazy to do it universally (it would be longer and less comprehend-able code). The round_digits constant is a threshold of how many consequent zeros or nines are triggering the rounding.

Solution 2:

(I don't have enough reputation to comment, so am resorting to using an answer...)

I notice that you ran the precision out to 300+ digits, well beyond the precision of the floating point numbers, hence the imprecise result. If you're looking for a means to have high precision calculations, you can probably resort to BigInt, and scale up the numbers accordingly. (I say "probably", because BigInt can be coerced into fixed precision calculations, not floating point, and thus depending on one's goal, BigInt might not meet the requirement.)

Eg, calculating 1000 / 17 to 100 significant digits can be handled via the following function, which essentially scales up 1000 and 17 to ensure 100 significant digits. (Note that this is just a concept function for handling high precision division between two integers, but can be the basis for non-integers by scaling up the dividend and divisor until they're integers, and adjusting digits accordingly. Plus you might need to fudge in some extra "hidden" digits of precision to handle rounding)...

functiondivideN(dividend, divisor, digits) {
  dividend = dividend * 10n ** (BigInt(digits) * 2n);
  divisor = divisor * 10n ** BigInt(digits);
  var s = (dividend/divisor).toString();
  if (s.length < digits) {
    s = "0".repeat(digits - s.length) + s;
  }
  s = s.slice(0, s.length - digits) + "." + s.slice(-digits);
  return s;
}

BigInt requires numbers to end in "n", so the function needs to be called as follows...

divideN(1000n,17n,100)

...which in this case returns...

"58.8235294117647058823529411764705882352941176470588235294117647058823529411764705882352941176470588235"

Note in this case, 102 digits of precision are returned rather than 100 because of the relative size of the dividend (1000) to the divisor (17).

Post a Comment for "How Do Libraries/programming Languages Convert Floats To Strings"