Answered VC11: Auto-Vectorizer and concurrency::graphics

  • Sunday, April 29, 2012 10:37 PM
     

    Hi,

    The following functions do not seem to auto-vectorize (even under /fp:fast) on an x64 target:

    #include "stdafx.h"
    
    #include <amp_graphics.h>
    
    using namespace concurrency::graphics;
    
    float_4 myfun(float_4 a, float_4 b)
    {
    	return a + b;
    }
    
    float_4 myfunref(const float_4& a, const float_4& b)
    {
    	return a + b;
    }

    The generated code is essentially a sequence of scalar movss and addss instructions, rather than the packed movups and addps instructions I expected.

    Should auto-vectorization work in this case? Is auto-vectorization a loop-only feature? Am I missing something?

    Best regards, Arnaud.

All Replies

  • Monday, April 30, 2012 4:04 PM
     

    Hi Arnaud,

    Auto-vectorization works only on loops. That is to say, it looks at operations that would execute in different iterations of a loop and joins them together (safely) into a single vector operation.

    It is possible to vectorize straight-line code like your examples too (this is called "statement vectorization"), but the auto-vectorizer doesn't tackle that problem in this release.

    Thanks,

    Jim
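
To illustrate the distinction drawn in the reply above, here is a minimal sketch in plain C++. Float4 is a hypothetical stand-in for concurrency::graphics::float_4, and add_statement and add_loop are illustrative names that do not appear in the thread. The first function is straight-line code of the kind the question asks about; the second expresses the same additions as a loop, which is the shape the reply says the auto-vectorizer looks for. Whether a particular compiler actually emits packed instructions such as addps for the loop depends on its version and flags, so treat this as a sketch rather than a guarantee.

```cpp
#include <cstddef>

// Hypothetical stand-in for concurrency::graphics::float_4:
// four floats stored side by side.
struct Float4 {
    float x, y, z, w;
};

// Straight-line form: four independent scalar additions and no loop.
// Turning these into one packed add is "statement vectorization",
// which the reply says the auto-vectorizer does not attempt.
Float4 add_statement(const Float4& a, const Float4& b)
{
    return Float4{ a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
}

// Loop form: the same element-wise addition written as iterations
// over contiguous arrays. Operations from different iterations are
// what a loop vectorizer tries to merge into one vector operation.
void add_loop(const float* a, const float* b, float* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions compute the same sums; only the loop form gives a loop-based vectorizer something to work with. For a single float_4, the loop would run just four iterations, so any benefit is more likely to show up when many values are processed in one call.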