Declaration of variables with loops - Best practices (I think), so why doesn't this seem right?

  • Question

  • My question deals with declaration of variables when dealing with loops.

It seems to me that if I declare a variable once, then assign to it repeatedly in a loop (Method 1), it will use fewer resources and perform better than if I declare and assign inside the loop (Method 2), because there is only the cost of the repeated assignments and not of repeated declarations.

For readability, and to make sure the objects become eligible for collection (not necessarily collected) in a timely manner, Method 2 is preferable. For performance, however, it would seem that Method 1 is the way to go.

    Am I wrong in this line of thinking?

Method 1. With this example, I am declaring the variable outside the for loop.

    void OutsideTheLoop()
    {
        System.Web.HttpApplication app;
        for (int i = 0; i < 10; i++)
        {
            app = new System.Web.HttpApplication();
        }
    }

Method 2. With this example, I am declaring the variable inside the for loop.

    void InsideTheLoop()
    {
        for (int i = 0; i < 10; i++)
        {
            System.Web.HttpApplication app;
            app = new System.Web.HttpApplication();
        }
    }

After checking the IL that is generated from these, the compiler forces Method 2. My pattern of thought (Method 1) is getting overridden at compile time, but I wonder if I am really wrong here. Can anyone please help me understand whether I am right or wrong, and why?

    I would really appreciate it if there were no subjective answers on this. Thanks All.

    Monday, December 8, 2008 2:03 PM

All replies

  • The CLR requires that stack frames be fixed size, and that all slots (variables) be known ahead of time for each stack frame.

In plain English, this means that every method call (each call creates a stack frame) must declare its local variables up front, even those that don't get used immediately or might never get used due to logical branching (if..else, switch, etc.).

What this amounts to is that declaring variables inside a block (such as a loop) is really only meaningful to the compiler, not in the output code.

    In the output code (the IL), as you've seen, it makes no difference - the variables all get "allocated" (once) at the top of the IL for the method (in my test, only the stack slot index of each variable changed).

    But on the compiler side, the location of the variable declaration matters in that it controls the "scope" of the variable. In other words, if you declare a variable inside a block, then the compiler won't allow you to reference the variable outside the block.
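The scope rule Rob describes can be sketched in a few lines. The example below is in Java rather than C#, since the principle is the same in any managed language with block scoping: the declaration's position controls where the name is visible, not where the slot is allocated. The class name `ScopeDemo` and the loop body are illustrative only.

```java
// Block-scope sketch: a variable declared inside a loop body is only
// visible there, even though its frame slot exists for the whole method.
public class ScopeDemo {
    public static void main(String[] args) {
        StringBuilder outer = new StringBuilder();  // visible for the whole method
        for (int i = 0; i < 3; i++) {
            String inner = "iteration " + i;        // visible only inside this block
            outer.append(inner).append(';');
        }
        // System.out.println(inner);  // compile error: 'inner' is out of scope here
        System.out.println(outer);     // prints "iteration 0;iteration 1;iteration 2;"
    }
}
```

Uncommenting the marked line makes the compiler reject the program, which is exactly the scope enforcement described above; the generated bytecode for the accepted version is the same either way.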
    -Rob Teixeira
    Monday, December 8, 2008 7:21 PM
  • Thanks Rob.

    After consultation with The Smartest Human Being I Know (TSHBIK), he has pointed out that the reflector representation that I was looking at is not exactly correct, and that in both methods the IL generated was identical (all of the declarations moving to the .locals init() portion of the method - but in a different order). Both you and he are on the mark with this.

So what I was supposing was at least partially correct, but I cannot test one method against the other because it doesn't really work that way.

    • Proposed as answer by Deonis Tuesday, January 12, 2010 4:52 PM
    Monday, December 8, 2008 7:58 PM
I guess an easier way of looking at this is that the pattern (in a .NET managed language) doesn't matter for performance reasons, but does matter (in some compilers) for the accessibility of the variables.

For example, C# (the compiler) understands the notion of variable block scoping, which means that you can follow the practice of utilizing a variable in its smallest possible scope to keep it from being used where it might not be safe. For example, take a long method that includes some handle variable to an OS resource. If you declare this in code outside of any block, it's usable throughout the entire method. But if you scope it to a block (declare it inside a block), then the compiler won't allow you to use it outside the block. This is helpful because someone applying a bug fix months later won't accidentally try to use the handle value to access the OS resource after it potentially gets closed somewhere above in the code.
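The "smallest possible scope" practice for a resource handle can be illustrated as follows. This is a hedged sketch in Java (the same pattern applies in C#); `BufferedReader` over an in-memory `StringReader` stands in for the "OS resource handle" in the paragraph above, and the class name `NarrowScope` is invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Minimal-scope sketch: the resource handle lives only inside one block,
// so code after the block cannot use it after it has been closed.
public class NarrowScope {
    public static void main(String[] args) throws IOException {
        String firstLine;
        {   // 'reader' exists only inside this block
            BufferedReader reader = new BufferedReader(new StringReader("hello\nworld"));
            firstLine = reader.readLine();
            reader.close();
        }
        // reader.readLine();  // compile error: 'reader' is out of scope,
        //                     // so a use-after-close bug cannot slip in here
        System.out.println(firstLine);  // prints "hello"
    }
}
```

A maintainer editing the code after the block gets a compile error, not a runtime use-after-close bug, if they try to touch the handle.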

But as far as performance goes, the placement of the variable in code really has no effect, at least in most .NET managed languages that I've seen.
    -Rob Teixeira
    Monday, December 8, 2008 9:19 PM
It's interesting to note that, in a non-garbage-collected language that uses smart pointers, the inner declaration would be more efficient, since there's no need to check for a null pointer when freeing during the assignment. In a GC language, there's no difference.

That being said, you shouldn't be considering factors like this anyway. The overwhelming consideration should be readability and maintainability. By that standard, the inner declaration is overwhelmingly preferable, both because it makes the code easier to read (you don't have to scroll back to find out what type the variable is) and more maintainable (the variable is exposed to less code, so less code needs to be considered if a bug fix were ever required). Even if there were a difference, the 0.001% performance improvement (or some similarly small factor) would in no way make up for the risk incurred by coding less defensively.

    Tuesday, December 9, 2008 2:37 PM
  • Readability and maintainability are subjective factors. Important to consider in a practical sense, but the question was geared solely towards performance.

It's not a critical issue, just one that I was curious about, since I was making an assumption based on... well, nothing. I was lucky to find that it doesn't matter, because the generated IL was the same and it used the pattern of method-top declaration (as IL always does).
    Tuesday, December 9, 2008 7:37 PM