Bad performance when using string arguments in CLR function

  • Question

  • I cannot figure out how to compile performance-efficient C# CLR functions in a Visual Studio SQL Server Database Project. Everything works flawlessly and the performance is good as long as I stick to integer/float arguments. If, however, I need a string argument, the calling time increases by a factor of 20.

    I would expect strings to be slower than integers, but not this much slower. Also, I have an old assembly (not sure how it was made) that performs much, much better. I have decompiled this assembly into a C# CLR function and then recompiled it. The new assembly works perfectly, except the function now takes almost 4 times as long to run.

    If you want to see some numbers, the list below shows the performance of my function before and after the recompilation, as well as of empty functions that accept either strings or integers as arguments (every function accepts 2 arguments).

    • MyFunc original: 7 s
    • MyFunc recompiled: 27 s
    • EmptyFuncString: 19 s
    • EmptyFuncInteger: 1 s

    The performance loss on my recompiled function seems to match the added cost of using string arguments. 

    (The numbers shown above are the times it takes to run each function approximately 6 million times in Microsoft SQL Server Management Studio 2012, on a 2.5 GHz, 6-core Intel Xeon CPU. They are all reproducible.)
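
    For reference, the empty test functions are essentially stubs along the following lines (a simplified sketch; the exact declarations in my project may differ slightly):

        using Microsoft.SqlServer.Server;
        using System.Data.SqlTypes;

        public class TestFunctions
        {
            // Empty function taking two string arguments.
            [SqlFunction]
            public static SqlInt32 EmptyFuncString(SqlString a, SqlString b)
            {
                return new SqlInt32(0); // no work, just return
            }

            // Empty function taking two integer arguments.
            [SqlFunction]
            public static SqlInt32 EmptyFuncInteger(SqlInt32 a, SqlInt32 b)
            {
                return new SqlInt32(0); // no work, just return
            }
        }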


    • Edited by JesperKT Tuesday, November 17, 2015 12:32 PM
    Tuesday, November 17, 2015 12:31 PM

All replies

  • Using string arguments is, of course, less optimizable than using integer arguments, especially with constructs like empty methods, but also because CLR strings are immutable (string mutator methods actually make copies). In addition, using SqlString can cause additional string copying and is less optimizable than using SqlChars in SQLCLR. See https://msdn.microsoft.com/en-us/library/ms131075.aspx for a minimalist statement of this; there was a more complete treatment in an early SQLCLR whitepaper, but I no longer see a link to it.
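
    To illustrate the difference between the two parameter styles, here is a minimal sketch (the names and bodies are placeholders, not from your assembly):

        using Microsoft.SqlServer.Server;
        using System.Data.SqlTypes;

        public class StringArgDemo
        {
            // SqlString materializes the full System.String value, which
            // implies at least one copy of the string data.
            [SqlFunction(IsDeterministic = true)]
            public static SqlInt32 LengthViaSqlString(SqlString input)
            {
                return input.IsNull ? SqlInt32.Null : new SqlInt32(input.Value.Length);
            }

            // SqlChars exposes the value as a char buffer and can avoid that copy.
            [SqlFunction(IsDeterministic = true)]
            public static SqlInt32 LengthViaSqlChars(SqlChars input)
            {
                return input.IsNull ? SqlInt32.Null : new SqlInt32((int)input.Length);
            }
        }
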
    Tuesday, November 17, 2015 5:41 PM
    Moderator
  • I would suggest you read Dmitri Korotkevitch's excellent article, CLR vs. T-SQL: Performance considerations. It explains your case in detail!
    Tuesday, November 17, 2015 8:16 PM
  • Thanks a lot for the replies.

    I have tried with both SqlString and SqlChars, but I cannot find any noticeable difference in performance. Also, this doesn't seem to explain why the function is suddenly much slower after a decompile and recompile (both versions use SqlString).

    Maybe I am doing something wrong. I will read through the two linked articles and see what I can learn.

    • Edited by JesperKT Friday, November 20, 2015 12:30 PM
    Friday, November 20, 2015 9:41 AM
  • I have read through both articles, and though illuminating in other regards, they do not seem to cover my exact problem.

    I have an existing function on the DB. Using "set statistics time on" I have measured the CPU time it takes to run this function (averaged over 1 million runs) at 0.3 µs (microseconds).

    I have decompiled this function and compiled it again, and now it suddenly takes 24 µs (almost 100 times as long to run). How can this be?

    The function takes two strings as arguments. Even if I remove everything inside the function and simply return a value, it still takes 17 µs to run (or 12 µs if I change the input arguments to SqlChars).

    The only way I can get close to the original function's performance is if I change my arguments to integers. This reduces the CPU time to approx. 0.1 µs.

    My conclusion is that something about using strings as arguments behaves very badly in my compilation (but not in the compilation that originally created the function, which unfortunately is lost).

    • Edited by JesperKT Friday, November 20, 2015 1:06 PM
    Friday, November 20, 2015 1:04 PM
  • I have decompiled this function and compiled it again, and now it suddenly takes 24 µs (almost 100 times as long to run). How can this be?

    The function takes two strings as arguments. Even if I remove everything inside the function and simply return a value, it still takes 17 µs to run (or 12 µs if I change the input arguments to SqlChars).

    The only way I can get close to the original function's performance is if I change my arguments to integers. This reduces the CPU time to approx. 0.1 µs.

    My conclusion is that something about using strings as arguments behaves very badly in my compilation (but not in the compilation that originally created the function, which unfortunately is lost).

    Make sure you are doing the following:

    • In the SqlFunction attribute, set IsDeterministic = true
    • In the SqlFunction attribute, remove DataAccess and SystemDataAccess if they are present and set to Read; they default to None.
    • Since you are using Visual Studio / SSDT, make sure that your Project Properties have the "optimize code" option checked. By default, this is UNchecked for Debug, and checked for Release. So either use Release OR check the option when using Debug.
    • Whether using SqlString or SqlChars, specify the input params using: [SqlFacet(MaxSize = 4000)] SqlString param. This ensures that the generated CREATE FUNCTION statement uses NVARCHAR(4000) instead of NVARCHAR(MAX); there is a huge performance hit when using MAX. Earlier versions of SSDT (formerly "Data Dude") would default to NVARCHAR(4000) for SqlString and NVARCHAR(MAX) for SqlChars. I believe the newer versions of SSDT use NVARCHAR(MAX) for both, and this could explain why the prior compiled version is faster. That should be obvious from looking at the CREATE FUNCTION definition of it in your DB. A declaration combining these points is sketched after this list.
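
    Putting the attribute-related points together, a declaration would look roughly like this (a sketch with placeholder names and body, not your actual MyFunc):

        using Microsoft.SqlServer.Server;
        using System.Data.SqlTypes;

        public class MyClrFunctions
        {
            // IsDeterministic = true; DataAccess and SystemDataAccess are
            // omitted so they default to None.
            [SqlFunction(IsDeterministic = true)]
            public static SqlString ConcatDemo(
                [SqlFacet(MaxSize = 4000)] SqlString first,   // maps to NVARCHAR(4000), not MAX
                [SqlFacet(MaxSize = 4000)] SqlString second)  // maps to NVARCHAR(4000), not MAX
            {
                if (first.IsNull || second.IsNull)
                    return SqlString.Null;
                return new SqlString(first.Value + second.Value);
            }
        }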

    Thursday, January 28, 2016 9:14 PM