Sunday, December 10, 2006

 

double to float conversion

When we need to round a floating-point number to an integer in C, we have two convenient functions at our disposal: floor and ceil. For some reason, however, there are no natural counterparts to these functions for double -> float conversion, where a plain cast simply rounds according to the current rounding mode (to nearest by default).

I wrote a simple implementation of these utilities using the standard C library functions frexp and ldexp:

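#include <math.h>   /* frexp, ldexp, floor, ceil */

/* Round x to a neighbouring float: down if isfloor is nonzero, up otherwise.
   Assumes x is finite and within the normal range of float. */
float fdround ( double x, int isfloor )
{
  const int    Nf = 23; /* stored mantissa bits, IEEE 754: http://en.wikipedia.org/wiki/IEEE_754 */
  double    m;
  int        e, sx;     /* e: binary exponent (not named exp, to avoid shadowing exp()) */
  if (0 == (sx = ((x < 0.0) ? (-1) : ((x > 0.0) ? 1 : 0)))) return 0.0f;
  m = frexp ( x * sx, &e ); /* |x| = m * 2^e with 0.5 <= m < 1 */
  /* Scale m to 24 integer bits, round the magnitude (the direction
     flips for negative x), restore the sign and scale back. */
  return (float)(ldexp ( (double)sx * (((isfloor != 0) ^ (sx == -1))? floor : ceil)
           (m * (1 << (Nf + 1))), e - Nf - 1));
}
#define fceil(x) (fdround ( (x), 0 ))
#define ffloor(x) (fdround ( (x), 1 ))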

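This implementation of course depends on knowing the number of mantissa bits defined by the IEEE 754 standard, which is 23 explicitly stored bits (24 significant bits counting the implicit leading one) for 32-bit floating-point numbers. It also assumes that x is finite and falls within the normal range of float; NaN, infinities, overflow, and subnormal results are not handled.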
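
Instead of hard-coding 23, the bit count could probably be taken from <float.h>. This is only a minimal sketch, assuming a binary floating-point platform (FLT_RADIX == 2); the helper name float_stored_bits is hypothetical. Note that FLT_MANT_DIG counts the implicit leading bit, so on an IEEE 754 platform it is 24, one more than the number of stored bits.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper (not part of the original code): the number of
   explicitly stored mantissa bits of float, derived from <float.h>. */
int float_stored_bits(void)
{
    /* The derivation only makes sense for binary floating point. */
    assert(FLT_RADIX == 2);
    /* FLT_MANT_DIG includes the implicit leading 1 bit. */
    return FLT_MANT_DIG - 1;
}
```

With this, Nf in fdround could be initialised from float_stored_bits() rather than from a literal.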

Here is a simple utility to test whether the above function generates correct numbers:

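#include <assert.h>

void test(double x)
{
  float f = ffloor(x), c = fceil(x);
  /* a lies nearer to c, b nearer to f, y is the midpoint of f and c */
  float a = 0.4 * f + 0.6 * c, b = 0.6 * f + 0.4 * c, y = (f + c)/2;
  assert ( f <= x && c >= x );  /* f and c bracket x */
  assert ( a == c && b == f );  /* no float lies strictly between f and c */
  assert ( y == f || y == c );  /* even the midpoint rounds to f or c */
}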

Enjoy!
