Answered by:
SQL User Defined Functions for performing statistical calculations

Hi!
I hope you can help. I just wasn’t sure where to go with this question, so I’m hoping you can at least point me in the right direction.
I’m writing a SQL Server stored procedure that returns information for a facilitywide scorecardtype report. The row and columns are going to be displayed in a SQL Server Reporting Services report.
Each row of information contains “Current Month” and “Previous Month” numbers and a variance column. Some rows may compare percentages, others whole numbers, others ratios, depending on the metric they’re measuring. For each row/metric the company has specified whether they want to see a ttest or a chisquared statistical test to determine whether or not there was a statistically significant difference between the current month and the previous month.
My question is this: Do you know where I can find a set of alreadywritten user defined functions to perform statistical calculations beyond the basic ones provided in SQL Server 2005? I’m not using Analysis Services, so what I’m looking for are real SQL User Defined Functions where I can just pass my data to the function and have it return the result within a stored procedure.
I’m aware that there may be some thirdparty statistical packages out there we could purchase, but that’s not what I’m looking for. And I’m not able to do anything like call Excel’s analysis pack functions from within my stored procedure. I’ve asked. They won’t let me do that. I just need to perform the calculation within the stored procedure and return the result.
Any suggestions? Is there a site where people are posting their SQL Server UDF’s to perform statistical functions? Or are you perhaps aware of something like a free addin for SQL that will add statistical functions to those available in SQL? I just don’t want to have to write my own ttest function or my own chisquared function if someone has already done it.
Thanks for your help in advance! Oh, and please let me know if this should have been posted in the TSQL forum instead. I wasn't entirely sure.
Karen Grube
Question
Answers

Karen,
I am not aware of any TSQL library for statistical functions.
Perhaps you can start building your own. Naturally, libraries built in C++ can be hooked into SQL Server as CLRs.
The following link contains among other statistical calculations, MEDIAN calculation.
http://www.sqlusa.com/bestpractices2005/median/
Kalman Toth SQL SERVER 2012 & BI TRAINING
New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012 Edited by Kalman Toth Friday, September 28, 2012 6:22 AM
All replies

TSQL may not be fast enough for scientific and advanced statistics applications. It is an interpreted language.
This package has 45 advanced statistical functions. You may be able to plug them into TSQL:
http://www.sharewareconnection.com/mathmechanixs.htm
Kalman Toth SQL SERVER 2012 & BI TRAINING
New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012
 Edited by Kalman Toth Friday, September 28, 2012 6:24 AM

Thanks, but I realy couldn't see how these functions could be called from within a stored procedure, and I don't see the ttest function.
Any other suggestions? I'm only processing a small set of numbers at any one time, so it shouldn't be too slow.
Thanks,
Karen

Karen,
I am not aware of any TSQL library for statistical functions.
Perhaps you can start building your own. Naturally, libraries built in C++ can be hooked into SQL Server as CLRs.
The following link contains among other statistical calculations, MEDIAN calculation.
http://www.sqlusa.com/bestpractices2005/median/
Kalman Toth SQL SERVER 2012 & BI TRAINING
New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012 Edited by Kalman Toth Friday, September 28, 2012 6:22 AM

Thanks!
Although there are a few limited statistical calulations available on this site, I can't locate the ttest or chisquared calculations I'm looking for in particular.
Might there be some way I can perform these calculations perhaps by calling on some functionality of Analysis Services through a stored procedure? My theory is that I really shouldn't have to recreate the wheel here, that someone has already done this. Gosh, I can do this in Excel! Why can't I do it in SQL? I'm asking that seriously, not rhetorically.
Any help would be greatly appreciated.
Thanks!
Karen 
Karen,
Have you made any progress towards answering this question? I am seeking methods (best practices) for calculating a variety of statistical functions (including r squared and t score) in native tsql.
So far, I haven't found any techniques or solutions, asides from manually programming the formulas themselves in sql stored procedures, calling external libraries like R, S+ or Matlab, or creating CLR stored procedures using C# or VB.net.
I'm interested in any solution(s) you may have found.
Boon Hogganbeck 

STATS_T_TEST_
docs.oracle.com/cd/B19306_01/server.102/b14200/functions157.htm

STATS_T_TEST_ONE
: A onesample ttest 
STATS_T_TEST_PAIRED
: A twosample, paired ttest (also known as a crossed ttest) 
STATS_T_TEST_INDEP
: A ttest of two independent groups with the same variance (pooled variances) 
STATS_T_TEST_INDEPU
: A ttest of two independent groups with unequal variance (unpooled variances)
