Compiling WinRT Component on ARM -- Floating-point issues

    Question

  • I'm using several open source libraries that I've packaged into a C/C++ WinRT component that compiles successfully under x86 (passes WACK, already submitted to the Windows Store).

    I'm now trying to compile the app for ARM, and I'm running into some issues, specifically around floating point stuff:

    OpenJPEG uses the following inline function, which relies on the x87 "fistp" assembly instruction to round a float to an integer. This compiles fine on x86 but doesn't compile on ARM. I've tried adding the "/QRfpe" compiler option, but that doesn't solve it either.

    Any ideas?

    /* MSVC and Borland C do not have lrintf */
    #if defined(_MSC_VER) || defined(__BORLANDC__)
    static INLINE long lrintf(float f){
    #ifdef _M_X64
        return (long)((f>0.0f) ? (f + 0.5f):(f -0.5f));
    #else
        int i;
     
        _asm{
            fld f
            fistp i
        };
     
        return i;
    #endif
    }
    #endif

    Saturday, October 20, 2012 12:35 AM

Answers

  • You cannot use x86 assembly instructions on an ARM processor. Extend the #ifdef _M_X64 check so that ARM builds (for which the compiler defines _M_ARM) also take the plain C path instead of the x86-specific _asm block.
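
    As a rough sketch of that suggestion, the guard could be inverted so that the inline assembly is only compiled for 32-bit x86 under MSVC, and every other target (x64, ARM) takes the plain C path. The name port_lrintf is a placeholder chosen here to avoid clashing with a C99 lrintf; it is not part of OpenJPEG. _M_IX86 is the macro MSVC predefines for 32-bit x86 builds.

```c
/* Sketch: keep the x87 path only where it can compile (MSVC, 32-bit x86).
   port_lrintf is a hypothetical name, not an OpenJPEG function. */
#if defined(_MSC_VER) && defined(_M_IX86)
static __inline long port_lrintf(float f)
{
    int i;
    __asm {
        fld f
        fistp i
    }
    return i;
}
#else
/* x64, ARM, and non-MSVC compilers: no inline assembly at all. */
static long port_lrintf(float f)
{
    return (long)((f > 0.0f) ? (f + 0.5f) : (f - 0.5f));
}
#endif
```

    Selecting the assembly by "is this 32-bit x86?" rather than "is this not x64?" means any future architecture falls through to the C path by default.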

    --Rob

    • Marked as answer by Jesse Jiang Thursday, October 25, 2012 3:07 AM
    Saturday, October 20, 2012 2:43 AM
    Owner
  • If you are just doing a quick port, get rid of the assembly entirely and use this for all architectures:

    return (long)((f>0.0f) ? (f + 0.5f):(f -0.5f));
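
    One caveat that may be worth checking before relying on this: the expression rounds halfway cases away from zero, whereas fistp follows the x87 rounding mode, which defaults to round-to-nearest-even. A minimal sketch of the difference, with round_half_away as a name made up for this example:

```c
/* The portable expression from above, wrapped in a hypothetical helper. */
static long round_half_away(float f)
{
    return (long)((f > 0.0f) ? (f + 0.5f) : (f - 0.5f));
}

/* Halfway cases, compared with fistp under the default x87 rounding mode:
     input   round_half_away   fistp (nearest-even)
      0.5f          1                 0
      1.5f          2                 2
      2.5f          3                 2
     -2.5f         -3                -2                */
```

    Whether that difference matters depends on how the caller uses the result; for inputs that are not exactly halfway between two integers, both approaches agree.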
    

    If you are really looking at optimizations, take advantage of the fact that all Windows 8 systems require SSE/SSE2 support and all Windows RT systems require ARM NEON support, and use intrinsics rather than inline assembly.

    For example, on Windows 8 x86 and Windows 8 x64, you can use

    return _mm_cvt_ss2si(_mm_load_ss(&f));

    Or better yet arrange it so you can do 4 integers at the same time.
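
    A sketch of what the single-value and four-at-a-time versions might look like. The function names are hypothetical, and _mm_loadu_ps, _mm_cvtps_epi32 and _mm_storeu_si128 are SSE/SSE2 intrinsics picked for this illustration rather than taken from the reply. The #else branch exists only so the sketch still compiles on non-x86 targets; it rounds ties away from zero instead of to even.

```c
#if defined(_M_IX86) || defined(_M_X64) || defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics; also pulls in the SSE header */

/* One float -> int, rounded per MXCSR (round-to-nearest-even by default). */
static int round_nearest_1(float f)
{
    return _mm_cvt_ss2si(_mm_load_ss(&f));
}

/* Four floats -> four ints with a single conversion instruction. */
static void round_nearest_4(const float *src, int *dst)
{
    __m128  v = _mm_loadu_ps(src);         /* unaligned load of 4 floats */
    __m128i r = _mm_cvtps_epi32(v);        /* same rounding mode as above */
    _mm_storeu_si128((__m128i *)dst, r);   /* unaligned store of 4 ints */
}
#else
/* Non-x86 fallback so the sketch compiles everywhere (ties away from zero). */
static int round_nearest_1(float f)
{
    return (int)((f > 0.0f) ? (f + 0.5f) : (f - 0.5f));
}

static void round_nearest_4(const float *src, int *dst)
{
    int i;
    for (i = 0; i < 4; ++i)
        dst[i] = round_nearest_1(src[i]);
}
#endif
```

    The unaligned load and store are used here because nothing in the thread says the buffers are 16-byte aligned; if they are, the aligned variants may be preferable.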

    For Windows RT on ARM, vcvt_s32_f32 can be useful, but it rounds toward zero rather than to nearest.
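
    One possible way around the truncation, sketched under the assumption that rounding ties away from zero is acceptable: add 0.5 with the sign of each lane before converting. This uses vcvtq_s32_f32, the four-lane form of the float-to-signed-int conversion; round4_half_away is a hypothetical name, and the #else branch is only there so the sketch compiles on targets without NEON.

```c
#include <stdint.h>

#if defined(_M_ARM) || defined(_M_ARM64) || defined(__ARM_NEON)
#include <arm_neon.h>

/* Four floats -> four ints, ties away from zero.
   vcvtq_s32_f32 truncates, so bias each lane by +/-0.5 first. */
static void round4_half_away(const float *src, int32_t *dst)
{
    float32x4_t v    = vld1q_f32(src);
    /* Isolate each lane's sign bit and OR it into the bit pattern of 0.5f. */
    uint32x4_t  sign = vandq_u32(vreinterpretq_u32_f32(v),
                                 vdupq_n_u32(0x80000000u));
    uint32x4_t  bits = vorrq_u32(sign,
                                 vreinterpretq_u32_f32(vdupq_n_f32(0.5f)));
    float32x4_t bias = vreinterpretq_f32_u32(bits);
    vst1q_s32(dst, vcvtq_s32_f32(vaddq_f32(v, bias)));
}
#else
/* Scalar fallback with the same results, for targets without NEON. */
static void round4_half_away(const float *src, int32_t *dst)
{
    int i;
    for (i = 0; i < 4; ++i)
        dst[i] = (int32_t)((src[i] > 0.0f) ? (src[i] + 0.5f)
                                           : (src[i] - 0.5f));
}
#endif
```

    This matches the scalar quick-port expression lane for lane, so it does not reproduce the round-to-nearest-even behavior of fistp on exact halves.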

    • Marked as answer by Ch3rryC0ke Thursday, October 25, 2012 7:09 PM
    Thursday, October 25, 2012 6:28 PM

All replies

  • Very useful, thanks!
    Thursday, October 25, 2012 7:09 PM
  • One minor note I forgot to mention. What exactly are you using OpenJPEG to do in your Windows Store app? Using Windows Imaging Component (WIC) to load and save JPG files would be faster, better optimized for Windows, and less code you have to include/test.
    Thursday, February 28, 2013 10:03 PM