Double: keep only k significant figures

  • Question

  • Hi,

    I would like to round a Double to a given number of significant figures (e.g. 50 bits).

    All the functions I have seen round or truncate to a given number of decimal places, which is different: I want very large and very small numbers to be rounded the same way.

    I wrote a piece of code that does almost what I need, but not exactly:

    // Requires: using System;
    public static double Clear2LessSignificantBits(double value)
    {
        // Reinterpret the 64 bits of the double as an unsigned integer.
        ulong l = (ulong)BitConverter.DoubleToInt64Bits(value);
        l = l & 0xfffffffffffffffc;     // 0xc is binary 1100, so this clears the 2 lowest bits
        return BitConverter.Int64BitsToDouble((long)l);
    }

    1. Is there a standard and efficient way of doing it?

    2. If not, does my function respect the standard [1]?

    3. To round, do I just have to read the value of the leftmost bit that I clear?

    Note: converting to float and back to double does pretty much what I want, but it removes too many significant bits (I would like to remove only a few of them).

    Best Regards,

    [1] It has been tested with NaN, PositiveInfinity and NegativeInfinity. I have also read http://www.extremeoptimization.com/resources/Articles/FPDotNetConceptsAndFormats.aspx and http://docs.sun.com/source/806-3568/ncg_goldberg.html, but I am not sure that I respect all the rules.

     

     

    Wednesday, February 14, 2007 10:51 AM

Answers

  • As you know, I've done this kind of work in C++ here:

    http://metasharp.net/index.php?title=How_to_compare_double_or_float_in_Cpp

    We came up with slightly different approaches. I wanted to do in C# what I did in C++, but couldn't come up with a way to do it in safe code.

    Basically, you do nearly the same thing I did. I removed 4 bits (the fewest I could while still removing at least the last decimal digit). You remove only 2 bits. I wonder whether that is enough...
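
    Whether 2 or 4 bits are dropped can be made a parameter. The following is only a sketch: `ClearLowMantissaBits` and its `bits` parameter are hypothetical names, not part of the .NET Framework or of either implementation discussed here. It truncates rather than rounds, and it returns NaN and infinities unchanged, because masking a NaN whose payload sits entirely in the cleared bits would turn it into an infinity.

```csharp
using System;

public static class MantissaBits
{
    // Hypothetical helper (not from the thread): clears the `bits` lowest
    // mantissa bits of a double, which truncates its magnitude toward zero.
    public static double ClearLowMantissaBits(double value, int bits)
    {
        // A double has 52 explicitly stored mantissa bits.
        if (bits < 0 || bits > 52)
            throw new ArgumentOutOfRangeException("bits");

        // Leave special values alone so a NaN can never become an infinity.
        if (double.IsNaN(value) || double.IsInfinity(value))
            return value;

        long raw = BitConverter.DoubleToInt64Bits(value);
        long mask = ~((1L << bits) - 1);   // ones everywhere except the low `bits` bits
        return BitConverter.Int64BitsToDouble(raw & mask);
    }
}
```

    For finite inputs, `MantissaBits.ClearLowMantissaBits(value, 2)` should give the same result as `Clear2LessSignificantBits(value)`.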

    Wednesday, February 14, 2007 2:43 PM

All replies

  • Thanks, but it does not answer my 3 questions:

    1. Is there a standard and efficient way of doing it?

    2. If not, does my function respect the standard?

    3. To round, do I just have to read the value of the leftmost bit that I clear?

    Niels

    Thursday, February 15, 2007 12:11 PM
  •  nielsvanvliet wrote:

    Thanks, but it does not answer my 3 questions:

    1. Is there a standard and efficient way of doing it?

    2. If not, does my function respect the standard?

    3. To round, do I just have to read the value of the leftmost bit that I clear?

    Niels

    1. I don't think so.

    2. There is no standard, to my knowledge.

    3. I'm not sure. If you look at my implementation, you'll see that in little-endian layout the bytes are not in the "logical" order. I would rather do:

    l = l & 0xfffcffffffffffff;
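
    For question 3, one possible approach is sketched below, with some caveats. `RoundLowMantissaBits` is a hypothetical name, not a framework method. It adds the weight of the leftmost cleared bit before masking, so the kept bits are incremented exactly when that bit is set. Ties are rounded away from zero, which is not the IEEE 754 default of round-half-to-even. The mask is applied to the 64-bit integer returned by `BitConverter.DoubleToInt64Bits`, so its low-order bits are the least significant mantissa bits regardless of the byte order in memory. On that integer, `0xfffcffffffffffff` would clear bits 48 and 49, which are high-order mantissa bits.

```csharp
using System;

public static class MantissaRounding
{
    // Hypothetical helper (not from the thread): rounds the magnitude of a
    // double to the nearest value whose `bits` lowest mantissa bits are zero.
    // Ties are rounded away from zero.
    public static double RoundLowMantissaBits(double value, int bits)
    {
        if (bits < 1 || bits > 52)
            throw new ArgumentOutOfRangeException("bits");

        // Special values are returned unchanged.
        if (double.IsNaN(value) || double.IsInfinity(value))
            return value;

        long raw = BitConverter.DoubleToInt64Bits(value);
        long half = 1L << (bits - 1);      // weight of the leftmost bit being cleared
        long mask = ~((1L << bits) - 1);   // ones everywhere except the low `bits` bits

        // Adding `half` carries into the kept bits exactly when the leftmost
        // cleared bit is set. A carry out of the mantissa increments the
        // exponent, which yields the next power of two (or an infinity for
        // the largest finite values).
        return BitConverter.Int64BitsToDouble((raw + half) & mask);
    }
}
```

    For example, 0.1 is stored as 0x3FB999999999999A. Clearing 2 bits by truncation gives 0x3FB9999999999998, while this sketch gives 0x3FB999999999999C because the leftmost cleared bit is set.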

    Thursday, February 15, 2007 6:45 PM