udf function performance question?
-
jueves, 12 de abril de 2012 14:34
Experts,
We have udf function as:-
ALTER FUNCTION [dbo].[ufnStStr](@string VARCHAR(MAX), @items VARCHAR(MAX)) RETURNS VARCHAR(MAX) AS BEGIN WHILE LEN(@items) > 0 BEGIN SET @string = REPLACE(@string, SUBSTRING(@items, 1, 1), '') SET @items = REPLACE(@items, SUBSTRING(@items, 1, 1), '') END RETURN @string END
SELECT dbo.ufnStStr('CARRANZA', 'aeiou')But above udf gives performance issue when using against table which has records approx 100K, can anybody help me in writing above function with same expected result set where I can gain performance.
Thanks in advance
Apex
Please do let us know your feedback. Thank You - KG, MCTS
Todas las respuestas
-
jueves, 12 de abril de 2012 15:00
I think that you should use/evaluate SQL Server CLR Integration.- Propuesto como respuesta Naomi NMicrosoft Community Contributor, Moderator jueves, 12 de abril de 2012 15:27
-
jueves, 12 de abril de 2012 15:06
Hi,
Can you please give me some sample thoughts?
OR
Can you please help in writing my udf into your clr integration, please?
Thanks
Apex
Please do let us know your feedback. Thank You - KG, MCTS
-
jueves, 12 de abril de 2012 15:10
You could make the function more efficient if:
1) Don't declare @items as varchar(MAX)
2) If you are using a Windows collation, use nvarchar rather than than varchar.
3) Run the loop as:WHILE @i <= len(@items)
BEGIN
SET @string = REPLACE(@string, SUBSTRING(@items, @i, 1), '')
SET @i = @i + 1
ENDHowever, you would only gain maybe 30% this way. If you rewrite the function in C# you can use the RegEx class and you will not need the loop at all.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se- Propuesto como respuesta Naomi NMicrosoft Community Contributor, Moderator jueves, 12 de abril de 2012 15:27
- Marcado como respuesta _Kumar jueves, 12 de abril de 2012 15:56
-
jueves, 12 de abril de 2012 15:23
To answer your question: UDFs involve overhead. When it's a table valued UDF, that overhead is mostly absorbed and not noticable because the UDF is run just once and returns set based data. A UDF that is called once per row in a rowset is run over and over, with the overhead happening for every call. So often, a UDF called over and over with a big resultset will be noticably slower. And for real trouble, run a table-valued UDF and call it repeatedly.
New topic but related: Do you actually need a generic function to serve multiple purposes (the @item string changing often), or is this mainly a single purpose "Get rid of vowels" function? This would do the same thing (replace the literal "carranza" with a column or variable), or define a computed column in your table that does this.
Example1: /* suitable for copy/paste */ Select replace(replace(replace(replace(replace('Carranza', 'A', ''), 'E', ''), 'I', ''), 'O', ''), 'U', '') as DeVowelled Example2: /* Computed column */ Declare @TT Table (Name varchar(99), Devowelled as replace(replace(replace(replace(replace(Name, 'A', ''), 'E', ''), 'I', ''), 'O', ''), 'U', '') ) Insert @TT Select 'Carranza' UNION ALL Select 'Smith' UNION ALL Select 'Yeoman' Select * from @TTEDITED: Another approach could be a view:
Create VIEW <viewsname> as SELECT *, replace(replace(replace(replace(replace('Carranza', 'A', ''), 'E', ''), 'I', ''), 'O', ''), 'U', '') as DeVowelled FROM <originaltablename>Interestingly, this is one of those rare situations where "select *" could be considered acceptable (a view like this, just adding a virtual column)
- Editado johnqflorida jueves, 12 de abril de 2012 15:34
-
jueves, 12 de abril de 2012 16:55
Try this one.
CREATE FUNCTION dbo.fnMaskString ( @OriginalString VARCHAR(2048), @MaskString VARCHAR(2048) ) RETURNS TABLE AS RETURN ( SELECT Data FROM ( SELECT '' + o.theChar FROM ( SELECT Number, SUBSTRING(@OriginalString, Number, 1) AS theChar FROM master.dbo.spt_values WHERE Type = 'P' AND Number BETWEEN 1 AND LEN(@OriginalString) ) AS o LEFT JOIN ( SELECT SUBSTRING(@MaskString, Number, 1) AS theChar FROM master.dbo.spt_values WHERE Type = 'P' AND Number BETWEEN 1 AND LEN(@MaskString) ) AS m ON m.theChar = o.theChar WHERE m.theChar IS NULL ORDER BY o.Number FOR XML PATH('') ) AS d(Data) )
You use it like this
SELECT f.Data FROM dbo.Table1 AS t CROSS APPLY dbo.fnMaskString(t.Col1, t.Col2) AS f
N 56°04'39.26"
E 12°55'05.63"- Propuesto como respuesta Naomi NMicrosoft Community Contributor, Moderator jueves, 12 de abril de 2012 16:58
-
jueves, 12 de abril de 2012 22:09
You can find a quick introduction here:
http://msdn.microsoft.com/en-us/library/w2kae45k.aspxSince we're SQL people here, not everyone here may be able to just type the C# you need, but y'know: the .Net Framework comes with an extensive documentation. I use it a lot the few times I write .Net code!
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se -
sábado, 14 de abril de 2012 9:59
Saturday morning and I had some time over to compose a sample. Beware though that there are many ways to skin the cat, and there may be more efficient ways of doing this in C#. I am by no means an expert in .Net and all the framework classes. I can get by thanks to my skills in reading the manual. :-)
Anyway, here is the C# code:
using System;
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlString stripchars(SqlString input, SqlString bad_chars)
{
// Return NULL if any of the inputs are NULL.
if (input.IsNull || bad_chars.IsNull) {
return SqlString.Null;
}// Convert the input parameters to regular .Net types. For the
// input, use char[] for efficiency.
char[] s = input.Value.ToCharArray();
String bad = bad_chars.Value;// Allocate the return string. Assume that size is equal to input.
char[] ret = new char[s.Length];
int ret_ix = 0;// Copy all characters to return string that are not blacklisted.
for (int ix = 0; ix < s.Length ; ix++) {
if (bad.IndexOf(s[ix]) == -1) {
ret[ret_ix++] = s[ix];
}
}// Small performance tweak: if we did not strip any characters,
// return input and do not allocate new string.
if (ret_ix == s.Length) {
return input;
}
else {
return new SqlString(new String(ret, 0, ret_ix));
}
}
};Save it as stripchars.cs. You compile it with:
csc /target:library stripchars.cs
If you don't have csc in your path, you can find a C# compiler in C:\Windows\Microsoft.NET\Framework\v2.0.50727.
If you have Visual Studio installed, be careful! Compile with .Net 2.0. If you compile with .Net 4 SQL 2008 will not like you.
Once you have compiled the file, you run this in SQL Server:
CREATE ASSEMBLY stripchars FROM 'F:\Utveckling\SQL2005\stripchars.dll'
go
CREATE FUNCTION stripchars(@input nvarchar(MAX), @bad_chars nvarchar(MAX))
RETURNS nvarchar(MAX)
AS EXTERNAL NAME stripchars.UserDefinedFunctions.stripcharsReplace the path with the actual path where you have the DLL. If SQL Server runs from a different machine from yours, keep in mind that the path is from the perspective of SQL Server and you may have to put the DLL on a share where SQL Server can read it.
Once you have it in place, you can use it as your current function with the caveat that the overload of IndexOf that I use performs a code-point comparison, and thus is case-sensitive.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

