Confused about Return Value Optimization

    Question

  • Hi everyone,

    The return value optimization (RVO) eliminates temporaries for functions that return by value. Scott Meyers wrote about utilizing RVO in "More Effective C++", and Ayman B. Shoukry wrote about the named return value optimization (NRVO) in this article:

     http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs05/html/nrvo_cpp05.asp

    Those articles left me wondering: in what cases is trying to utilize the RVO beneficial? Is there something I have to do as a programmer to help the compiler to utilize the RVO the best way that it can?

    Both articles suggest that I should. I've tried to put this into practise. VS 2005 does a very good job of optimizing, and if you read Mr. Shoukry's article, there's very little you need to do as a programmer to get optimal code. It's the VS 2003 compiler, and presumably a bunch of other compilers, that got me puzzled.

    Scott Meyers suggests avoiding named variables to help compilers utilize the RVO. In the following sample, I have two test cases: one function that returns a named variable, and one that returns an unnamed temporary, as proposed by Scott Meyers.

    [code language="c++"]

    #include "stdafx.h"

    class Foo
    {
    public:

        int A;

        Foo()
        {
            printf("Constructor\n");
            A = 0;
        }

        Foo( const Foo& rhs )
        {
            printf("Copy constructor\n");
            A = rhs.A;
        }

        Foo( int x )
        {
            printf("Constructor int\n");
            A = x;
        }

        ~Foo()
        {
            printf("Destructor\n");
            A = 0;
        }

        Foo& operator=( const Foo& rhs )
        {
            A = rhs.A;
            return *this;
        }
    };

    const Foo FuncWithNamedVar( int x )
    {
        Foo tmp;        // Named variable!
        tmp.A = x + 5;
        return tmp;
    }

    const Foo FuncWithUnnamedVar( int param )
    {
        return Foo( param + 5 );
    }

    int _tmain(int argc, _TCHAR* argv[])
    {
        printf( "\nNamed ---------\n" );
        {
            Foo foo = FuncWithNamedVar( 10 );
            printf( "%d\n", foo.A );
        }

        printf( "\nUnnamed ---------\n" );
        {
            Foo foo = FuncWithUnnamedVar( 10 );
            printf( "%d\n", foo.A );
        }

        return 0;
    }

     [/code]

    Now, it all seems to be working nicely, as the output is this:

    Named ---------
    Constructor
    Copy constructor
    Destructor
    15
    Destructor

    Unnamed ---------
    Constructor int
    15
    Destructor

    This proves that indeed the named version calls an extra ctor and dtor. So far, so good.

    However, upon closer examination of the generated assembly code, you can see that the ctor and dtor calls all have been inlined in both cases. This results in some additional optimization that seems to throw away all the extra code from the ctors and dtors anyway, EXCEPT for the print functions. So at first glance the named version may seem very inefficient, but that's only because we put the print calls in there.

    So, let's remove those printf's from the ctors and dtor. Just look what happens! The compiler translates both functions FuncWithNamedVar and FuncWithUnnamedVar to exactly the same function! It evaluates both functions to be the same. The way they're called is also identical.

    Foo foo = FuncWithNamedVar( 10 );
    0040102B lea eax,[esp+4]
    0040102F push 0Ah
    00401031 push eax
    00401032 call FuncWithUnnamedVar (401000h) <== HERE
    printf( "%d\n", foo.A );
    00401037 mov ecx,dword ptr [esp+0Ch]
    0040103B push ecx
    0040103C push offset string "%d\n" (406110h)
    00401041 call printf (401071h)
    }
    printf( "\nUnnamed ---------\n" );
    00401046 push offset string "\nUnnamed ---------\n" (4060FCh)
    0040104B call printf (401071h)
    {
    Foo foo = FuncWithUnnamedVar( 10 );
    00401050 push 0Ah
    00401052 lea edx,[esp+1Ch]
    00401056 push edx
    00401057 call FuncWithUnnamedVar (401000h) <== AND HERE

     

    My questions are these:

    a) Are my observations correct?

    b) In what circumstances does it really matter to try to utilize the RVO?

    After digging through it in asm for two days, I have the feeling that really understanding how this works involves a lot of knowledge about the compiler. It also feels like you shouldn't worry all too much about it. I thought I had a powerful optimization trick in my hands but it turns out that in most cases it doesn't really matter.

    Please teach me, enlighten me, tell me all about the subject as I can't let it go. I have to get some sleep again!

    Thanks for reading,

    Jelle

     

     

    Tuesday, December 05, 2006 10:50 AM
    Moderator

Answers

  • I'm on the compiler team but don't tend to think in terms of RVO or NRVO. I just think in terms of a function that returns an instance of a class by value and, if it does, whether the compiler can use the return target instead of the variable/temporary inside the function.

    I find that thinking about it at this level reduces the confusion.
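One way to picture "using the return target" is a hand-written equivalent in which the caller supplies the storage and the function constructs its result directly in it. The following is only a sketch of that idea, not what any particular compiler emits; `Widget`, `make_by_value`, `make_into_slot` and `call_with_slot` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <new>

struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
};

// What the programmer writes: return an instance by value.
Widget make_by_value(int x) {
    return Widget(x + 5);
}

// Rough hand-written picture of the transformation: the caller passes the
// address of the return target and the object is constructed there in place.
Widget* make_into_slot(void* slot, int x) {
    return new (slot) Widget(x + 5);   // placement new into the caller's storage
}

// Hypothetical helper that plays the role of the caller.
int call_with_slot(int x) {
    alignas(Widget) unsigned char storage[sizeof(Widget)];
    Widget* w = make_into_slot(storage, x);
    int result = w->value;
    w->~Widget();                      // the caller ends the object's lifetime
    return result;
}
```

Both `make_by_value(10).value` and `call_with_slot(10)` yield 15; the second form just makes the hidden argument visible.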

    Friday, January 26, 2007 9:00 PM
    Moderator

All replies

  • 1.  It looks like you analyzed the problem correctly, so I'll take your word for it.

    2. I think what you are seeing here is that the compiler is trying to preserve any side-effects that printf might have.  No optimization should change the behavior, so if the compiler can't account for all side-effects it won't optimize.

    I rely on RVO to make my code a bit more beautified, particularly when trying to treat strings as first class objects: i.e. I let a function return std::string that is named within the function, but rely on the compiler to avoid the extra copy.
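A minimal sketch of the pattern described above, assuming a hypothetical helper called `make_greeting`: the string is built in a named local and returned by value. Whether the copy is actually avoided depends on the compiler and its settings.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the result in a named local and returns it by value.
std::string make_greeting(const std::string& name) {
    std::string result = "Hello, ";
    result += name;
    result += "!";
    return result;   // one named object, one return statement: a candidate for NRVO
}
```

A caller simply writes `std::string s = make_greeting("Jelle");` and treats the string like any other value.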

    Tuesday, December 05, 2006 4:37 PM
    Moderator
  • Hi Brian,

    first, thanks for the answer.

    "I think what you are seeing here is that the compiler is trying to preserve any side-effects that printf might have"

    Yes, I agree. That's why using such an example is not really helpful, it's not something that will often happen in a real-life situation. A common class would only initialize its own members, without side-effects.

    "No optimization should change the behavior"

    It's funny you should say that. RVO is an optimization that does just that. It eliminates a ctor and dtor, causing the program to show different behaviour. That is okay by the C++ standard. The printf is a good example: the output will become different depending on the optimization strategy. This is the only difference I see in the example I put in my previous post.
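The effect can be observed without printf by counting copy-constructor calls. This is a sketch under assumptions: `Counted`, `named`, `unnamed` and `copies_made_by` are hypothetical names, and the count you get depends on the compiler, the language standard and the optimization settings, which is exactly the point.

```cpp
#include <cassert>

struct Counted {
    static int copies;          // number of copy-constructor calls observed
    int a;
    explicit Counted(int x) : a(x) {}
    Counted(const Counted& rhs) : a(rhs.a) { ++copies; }
};
int Counted::copies = 0;

// Returns a named local: the copy can be elided only if NRVO is applied.
Counted named(int x) {
    Counted tmp(0);
    tmp.a = x + 5;
    return tmp;
}

// Returns an unnamed temporary: the classic RVO case.
Counted unnamed(int x) {
    return Counted(x + 5);
}

// Resets the counter, initialises one object from f(10), and reports
// how many copies were made along the way.
int copies_made_by(Counted (*f)(int)) {
    Counted::copies = 0;
    Counted c = f(10);
    assert(c.a == 15);          // the value is the same either way
    return Counted::copies;
}
```

Comparing `copies_made_by(named)` with `copies_made_by(unnamed)` across compilers and optimization levels shows which copies were elided; the program stays valid whatever the numbers are.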

    "I rely on RVO to make my code a bit more beautified, particularly when trying to treat strings as first class objects: i.e. I let a function return std::string that is named within the function, but rely on the compiler to avoid the extra copy."

    That makes sense. Now that I think of it, the problem I stated in my post was actually not an RVO problem. The RVO worked fine in both cases: both versions pass a hidden argument to the functions. It is about using named versus unnamed variables. Whenever your compiler does not support NRVO, you would do well to avoid named variables, right? Well, that's the thing I'm not so sure of. My case shows that named and unnamed variables give the same result on a compiler that does not have NRVO. I realise my question is a bit out of place since, strictly speaking, this is a VS 2005 forum. And VS 2005 has NRVO.

    So I'm looking for the situation in which Scott Meyers' suggestion (to use FuncWithUnnamedVar) actually makes sense. Does it make sense if the ctors cannot be inlined? Or if the ctors contain side effects that cannot be optimized away? I think in normal circumstances it's not a really big deal. I'm a bit disappointed about that :). So if anyone can point me to a situation in which you can really win with such a construct, I'd like to see it!

    And in order to rephrase my question to make this post more on-topic, just for the sake of it: what situations does the VS 2005 compiler solve by using NRVO instead of RVO?

    Pff did this make any sense at all? I'm digging too deep, am I not?

    Tuesday, December 05, 2006 6:17 PM
    Moderator
  • I think it's worth noting that More Effective C++ was written in '95/'96. It's fairly safe to say that optimization techniques have gone through some changes in the last 10 years.

    Another thing that seems obvious is that the code can easily become too complex for the compiler to apply NRVO. If that's the case, a redesign to utilize RVO is probably the right way to go. To me it just seems a lot easier to design for RVO than for NRVO.
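As an illustration of code that is typically too complex for NRVO, consider a function that can return either of two named locals, next to a restructured version in which every return statement constructs an unnamed temporary. This is a sketch under assumptions: `pick_named` and `pick_unnamed` are hypothetical names, and whether a given compiler elides the copy in the first version is implementation-specific.

```cpp
#include <cassert>
#include <string>

// Two different named locals may be returned, so the compiler generally
// cannot construct both of them in the single return target.
std::string pick_named(bool formal) {
    std::string a = "Dear Sir or Madam";
    std::string b = "Hi";
    if (formal)
        return a;
    return b;
}

// Same behaviour, restructured so that each return statement constructs
// an unnamed temporary directly.
std::string pick_unnamed(bool formal) {
    if (formal)
        return std::string("Dear Sir or Madam");
    return std::string("Hi");
}
```

Both functions return the same strings; only the second gives the compiler a straightforward RVO opportunity on every path.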

    That's my two cents, anyway.

    Tuesday, December 05, 2006 7:04 PM
    Moderator
  • Hi einaros,

    True, the book is old. Many things still apply, and what I'm trying to figure out is if and when this part still applies. I think that's the core of my question.

    Tuesday, December 05, 2006 7:16 PM
    Moderator
  • Bump.

    Perhaps someone from the compiler team would like to shed some light? Ayman?

    Sunday, December 10, 2006 9:21 PM
    Moderator