choosing datatypes for optimal performance

  • Question

  •  I am wondering what the best practice is for choosing datatypes in a C# project that will be writing to a SQL Server 2005 database. Below is the conclusion I have drawn from reading a couple of articles, though none of them said this explicitly, which is why I am looking for a correction or a confirmation.

     

    I should carefully choose my data types, on both the C# side and the SQL Server side, in order to get the best performance from my app. For a given situation, the SQL Server and C# types should be the ones that are most similar in type and size.

     

    Doing this should maximize performance by reducing the number of implicit and explicit conversions that need to be done, eliminating the need for casting, and allowing memory to be allocated as efficiently as possible. Is this correct?
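
    A minimal C# sketch of the conversions in question (editorial illustration only; the values and names are made up): an int widens to a long implicitly, while going the other way needs an explicit cast and can silently produce the wrong value.

        int rowCount = 42;

        // Widening: int -> long is implicit and cannot lose data.
        long bigCount = rowCount;

        // Narrowing: long -> int needs an explicit cast and can silently wrap;
        // this is the kind of mismatch that keeping the C# and SQL Server types
        // aligned is meant to avoid.
        long fromDatabase = 3000000000;    // too large for a 32-bit int
        int truncated = (int)fromDatabase; // compiles, but the value wraps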

     

                                                                        -thanks

    Monday, January 21, 2008 9:47 PM

Answers

  • Yes, yes, and yes. Sufficient thought should be put into what datatypes are required for each field/variable. As the application grows, the performance trade-offs will become increasingly evident. Never dismiss the benefits of well-thought-out datatypes.

     

    Microsoft goes to great lengths to ensure developers have the ability to optimize when needed, so take advantage of it. Don't let your application's performance suffer because you didn't take the time to make a good decision on every variable/field. To argue otherwise would be an admission of small-project, small-minded experience.

     

    Not only should the variable type match the field type; both should also be as small as possible.

     

    Yes: avoid type casting by using generics, and use value types on the stack when applicable.
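
    To illustrate the generics point (a made-up snippet, not part of the original answer): a non-generic ArrayList boxes every int and forces a cast on the way back out, while a generic List<int> stores the values directly.

        using System.Collections;
        using System.Collections.Generic;

        // Non-generic: each int is boxed on Add and must be cast (unboxed) on read.
        ArrayList untyped = new ArrayList();
        untyped.Add(1);                // boxing allocation on the heap
        int first = (int)untyped[0];   // explicit cast plus unboxing

        // Generic: no boxing, no casts; the ints are stored directly.
        List<int> typed = new List<int>();
        typed.Add(1);
        int second = typed[0];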

     

    Contrary to popular belief, this practice is advanced in nature, and it is refined the more performance issues you encounter.

     

    Adam

    Monday, January 21, 2008 10:08 PM

All replies

  • To be honest, I really wouldn't get very excited about the performance difference between a tinyint/int/long etc. Pick the data-type that is most suitable for your needs both now and in the foreseeable future. You are *generally* going to see most performance benefits from higher-level concerns:

    • using appropriate indexing strategies in the database
    • using appropriate data-access approaches in the app [i.e. don't read the entire table to update 1 line]
    • using appropriate containers in the app - i.e. knowing when a list is useful vs a dictionary, etc
    • thinking about caching
    • writing searches sensibly so that they do the minimum work
    • using parameterised queries or stored procedures to minimise SQL recompiles (see the sketch below), and understanding things like recompilation due to: DML/DDL interleave, >6 updates, etc
    • making correct use of transactions and understanding blocking

    Most of this you should be able to profile (at least to some extent) using the available profiling tools.
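
    To illustrate the parameterised-query point from the list above (a hedged sketch; the table and column names are invented): the parameter keeps the SQL text constant, so the server can reuse the cached plan rather than recompiling for every value.

        using System.Data;
        using System.Data.SqlClient;

        static int GetOrderCount(string connectionString, int customerId)
        {
            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand(
                "SELECT COUNT(*) FROM dbo.Orders WHERE CustomerId = @customerId", conn))
            {
                // Typed parameter: the query text never changes, only the value does.
                cmd.Parameters.Add("@customerId", SqlDbType.Int).Value = customerId;

                conn.Open();
                return (int)cmd.ExecuteScalar();
            }
        }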

     

    That said - don't use a char(2000) if a varchar(200) will do - obviously filling up your pages *will* hurt the database, but you'll notice that in the disk/IO counters (perfmon) - if the database is hammering the disk you have a problem.

     

    Marc

    Monday, January 21, 2008 9:55 PM
  • > "Both the SqlServer and C# types should be ones that are the most similar in type and size"

     

    Further - yes, you are correct that it would be useful to keep them fairly well pegged, but this is mainly so that you don't get issues with one side having a number that is too big for the other. The performance is largely incidental: if we need to use a 64-bit integer, then fine, use a 64-bit integer on both sides. At the DAL you (or the ORM tool) will need to do the translation (via the appropriate parameter/reader etc objects) anyway.
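
    As a hedged example of that translation (the parameter name here is hypothetical): a 64-bit value is a bigint on the SQL Server side and a long on the C# side, and the SqlParameter is where the two get pegged together.

        using System.Data;
        using System.Data.SqlClient;

        static void AddAuditIdParameter(SqlCommand cmd, long auditId)
        {
            // SqlDbType.BigInt <-> C# long is the matching 64-bit pair; mixing in a
            // 32-bit int on either side invites overflow or extra conversions.
            cmd.Parameters.Add("@auditId", SqlDbType.BigInt).Value = auditId;
        }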

     

     

     

    Monday, January 21, 2008 9:59 PM
  • Marc,

     

    Third-party monitoring tools are not only inaccurate but substandard in practice. Understanding the after-effects of your datatype decisions is imperative to the success of your application. You can read until you're blue in the face, but only experience will illuminate the importance of decisions made in the design stage.

     

    Adam

    Monday, January 21, 2008 10:19 PM
  • I tried to find the article where I saw this, but I couldn't. It contained a table of types intended for cross-reference, and some of the types in the table were ones I hadn't seen before.

     

    Many of them began with Sql; for example, they were similar in name to SqlInt, and the table indicated these were SQL Server types. The ones that began with Sql seemed to be a closer match to the types offered by C#, which made me think they would be good choices to use in my database.

     

    Are you familiar with these? I don't recall seeing them in any of the other articles or books that I have read. Do you think it is a good idea to use them?
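
    Assuming the types meant here are the ones in the System.Data.SqlTypes namespace (SqlInt32, SqlString and so on; the thread never names them explicitly): these are .NET structs used on the provider side to mirror SQL Server's native storage, including database NULL. They are not types you declare database columns with. A rough sketch of reading one:

        using System.Data.SqlClient;
        using System.Data.SqlTypes;

        static int? ReadQuantity(SqlDataReader reader, int ordinal)
        {
            // GetSqlInt32 returns the value in SQL Server's own representation,
            // so a database NULL comes back as SqlInt32.Null instead of throwing.
            SqlInt32 quantity = reader.GetSqlInt32(ordinal);
            return quantity.IsNull ? (int?)null : quantity.Value;
        }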

     

                                                                -thanks

     

     

    Monday, January 21, 2008 10:43 PM
  • > Third-party monitoring tools are not only inaccurate but substandard in practice.

     

    But a lot more accurate than *not* profiling and simply praying. Some are actually quite good. If you haven't got some kind of profile (no matter how crude or sophisticated), you simply have no way of judging the impact, or focusing effort.

     

    And again - in most cases the bottleneck is going to be at a higher level than simply the data-type: more about *how* it is used - nested loops vs dictionary? set-based database operations vs cursor? etc.
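
    A quick sketch of the nested-loops vs dictionary point (the Order and Customer types are made up): the dictionary turns an O(n*m) scan into one pass to build the lookup plus constant-time probes.

        using System.Collections.Generic;
        using System.Linq;

        class Customer { public int Id; public string Name; }
        class Order { public int CustomerId; public decimal Total; }

        // Nested loops: every order is compared against every customer.
        static void MatchSlow(List<Order> orders, List<Customer> customers)
        {
            foreach (Order order in orders)
                foreach (Customer customer in customers)
                    if (customer.Id == order.CustomerId) { /* use the pair */ }
        }

        // Dictionary keyed by customer id: build once, then O(1) lookups.
        static void MatchFast(List<Order> orders, List<Customer> customers)
        {
            Dictionary<int, Customer> byId = customers.ToDictionary(c => c.Id);
            foreach (Order order in orders)
            {
                Customer customer;
                if (byId.TryGetValue(order.CustomerId, out customer)) { /* use the pair */ }
            }
        }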

     

    Marc

    Monday, January 21, 2008 10:52 PM