Implicit Memory Barriers

  • Question

  • Let's say I have variables A, B and C that two threads (T1, T2) share.
    I have the following code:

    //T1 
    //~~

    A = 1; 
    B = 1; 
    C = 1;

    InterlockedExchange(ref Foo, 1);

    //T2  (executes AFTER T1 calls InterlockedExchange) 
    //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    InterlockedExchange(ref Bar, 1);

    WriteLine(A); 
    WriteLine(B); 
    WriteLine(C);

    Question:
    Does calling InterlockedExchange (an implicit full fence) on T1 and T2 guarantee that T2 will "see" the writes done by T1 before the fence (the A, B and C variables), even though those variables are not placed on the same cache line as Foo and Bar?

    Thursday, May 13, 2010 6:25 AM

Answers

  • Yes -- the full fence generated by the InterlockedExchange will guarantee that the writes to A, B, and C are not reordered past the fence implicit in the InterlockedExchange call.  This is the point of memory barrier semantics.  They do not need to be on the same cache line.  Were that not true, think of what would happen if you were to write code similar to the following (and how the lock might be implemented): 

    //
    // pStruc has A = B = C = 0
    //
    some_lock.Acquire();
    pStruc->A = 1;
    pStruc->B = 2;
    pStruc->C = 3;
    some_lock.Release();

    The lock operations above include the appropriate memory barriers (acquire/lfence on the acquire and release/sfence on the release) to guarantee that if another thread observes the lock as released after the above code executes, it also observes A = 1, B = 2, C = 3.  These objects likely are not on the same cache line; the fence implicitly supplied by some_lock.Release() makes this guarantee.  This is why, when you write code with locks, you don't need to worry as much about barrier semantics -- the locks include them.

    What Asaf points out is true of the raw interlocked semantics: there are platforms (such as Xbox 360; see http://msdn.microsoft.com/en-us/library/ee418650(VS.85).aspx) where interlocked operations do not have barrier semantics unless you request them explicitly.  On Windows platforms, the InterlockedXxx functions have full-fence semantics.

    • Marked as answer by Dana Groff Tuesday, May 18, 2010 12:20 AM
    Thursday, May 13, 2010 5:09 PM

All replies

  • Hi,

    Consider InterlockedIncrement as:

    Lock
      A++
    Unlock

    Where the lock object is global to the CPU so every core knows it.

    Using interlocked will still not protect you from someone else doing A++ in a non-interlocked operation.

    Regards,

    Asaf

     


    Asaf Shelly [Microsoft MVP]
    Thursday, May 13, 2010 9:13 AM
  • ...this is amazingly unrelated to my question
    Thursday, May 13, 2010 9:42 AM
  • :-)

    All you know is that an interlocked operation makes sure that another interlocked operation cannot access that memory at the same time (for the duration of the read-modify-write). I am pretty sure that anything else is processor specific.


    Asaf Shelly [Microsoft MVP]
    Thursday, May 13, 2010 2:23 PM
  • Heavy,

    Joe Duffy's book covers this sort of stuff in great detail. It's where I go when I want to double check my understanding of this sort of issue.

    http://www.amazon.com/gp/product/B0015DYKI4/ref=pd_lpo_k2_dp_sr_2?pf_rd_p=486539851&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=032143482X&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=1SYNPFWZ9C77AAF0DP84

    Ade


    Ade
    Monday, May 17, 2010 5:27 PM