Is there a way to query the Mining Model to determine the Discretization Method used when a model is processed, if you have left the DiscretizationMethod to Automatic?
I can infer it if I use the RangeMin & RangeMax functions to work out the range and number of buckets used, then reverse engineer that back to my data to work out the populations of each bucket. If each bucket is equally populated then it's used EQUAL_AREAS otherwise it's gone for CLUSTERED.
But... it would be nice if I could just collect the decision directly from the model. Anybody any the wiser?
Hi BI Monkey,
When you left the DiscretizationMethod to Automatic, Analysis Services determines which discretization method to use. If your data mining solution uses data from an Online Analytical Processing (OLAP) cube, the data mining algorithm automatically computes the number of buckets to generate by using the following equation, where n is the number of distinct values of data in the column: Number of Buckets = sqrt(n)
For more information about it, please see:
Discretization Methods (Data Mining): http://msdn.microsoft.com/en-us/library/ms174512(v=sql.105).aspx
Please remember to mark the replies as answers if they help and unmark them if they provide no help. This can be beneficial to other community members reading the thread.
What I want to know is after the model has run, which method was chosen. The above link just tells me what option it could choose.
I don't think you can get the discretization method per se. The information isn't even output in the model parameters so you have to look at the model content on a per-attribute basis.
You can browse the bucketed values by using the Generic Content Viewer, or you can query the model content and use the NODE_DISTRIBUTION values. Then get the rows with VALUETYPE of 5.
SELECT FLATTENED NODE_DISTRIBUTION as t from [MyBinnedModel].CONTENT
The information on valuetypes is kind of buried here, about 2/3rds down the page.
Let me know if these aren't the values you want.
SQL Server UE, Data Mining