don't use C-asm loops in float_to_int16_3dnow() and unroll it once