how to rearrange clusters? manually or programmatically?<br/>
http://social.msdn.microsoft.com/Forums/en-US/sqldatamining/thread/9e29d803-2ab8-4778-9b24-59e5278f6844#9e29d803-2ab8-4778-9b24-59e5278f6844<br/>
© 2009 Microsoft Corporation. All rights reserved. Sun, 12 Jul 2009 15:46:15 Z<br/>
<br/>
Guennadiy Vanine (http://social.msdn.microsoft.com/Profile/en-US/?user=Guennadiy%20Vanine)<br/>
ideo.<br/>And having watched it, I am trying to configure the DM models in order to use predictions (using the Microsoft Clustering algorithm).<br/><br/>During creation of the model I see this warning or &quot;help&quot; message:<br/><em>&quot;Input data will be <strong>randomly</strong> split into two sets, a training set and a testing set, <br/>based on the percentage of data for testing and maximum number of cases in testing data set you provide. <br/>The training set is used to create the mining model. The testing set is used to check model accuracy.&quot;</em><br/><br/>This is very nice!<br/>Is there any way to switch off the randomness and split the data manually?<br/><br/>I am also interested to know whether it is possible to define cluster creation manually or programmatically,<br/>or to rearrange clusters.<br/><br/>PS<br/>Added later.<br/>I cannot stay silent on this.<br/>My client saw the Excel 2007 Add-in &quot;Exception highlighting&quot; video.<br/>Having watched and listened to it, he insists that the Microsoft Clustering algorithm arranges clusters according to probabilities,<br/>i.e. that it creates clusters of exceptions (anomalies or outliers).<br/>And he wants to have such clusters...<br/><br/>So, is it possible to satisfy such a wish?<br/>Moving exceptions to separate cluster(s)?<br/> <hr class=sig> Guennadi Vanine -- Gennady Vanin -- Геннадий Ванин<br/>
Thu, 02 Jul 2009 08:31:51 Z 2009-07-02T08:56:53Z<br/>
<br/>
Allan Mitchell (http://social.msdn.microsoft.com/Profile/en-US/?user=Allan%20Mitchell)<br/>
http://social.msdn.microsoft.com/Forums/en-US/sqldatamining/thread/9e29d803-2ab8-4778-9b24-59e5278f6844#87115ae6-d48c-4bb0-90ca-0eb712763a63<br/>
Hi <div><br/></div> <div>To create your own testing and training sets, you can use SSIS (or any other method you choose) to split the original dataset in two.
The premise still holds, though, that the two sets of data should be representative of the whole.</div> <div><br/></div> <div>From this page: http://technet.microsoft.com/en-us/library/ms131977.aspx</div> <div><br/></div> <div>Using the wizard you will by default get a 70/30 split. You could change that to 100/0.</div> <div><br/></div> <div>Using DMX you have to manually specify WITH HOLDOUT (&lt;option&gt;).</div> <div><br/></div> <div>You can programmatically (API and DMX) specify the algorithm parameter values.</div>
Thu, 02 Jul 2009 09:14:29 Z 2009-07-02T09:14:29Z<br/>
<br/>
Vladimir Cupal (http://social.msdn.microsoft.com/Profile/en-US/?user=Vladimir%20Cupal)<br/>
http://social.msdn.microsoft.com/Forums/en-US/sqldatamining/thread/9e29d803-2ab8-4778-9b24-59e5278f6844#ad71ed8f-4eed-40fd-bad5-2b5d77749d59<br/>
Regarding the second part of the question and the PS: although you will be able to control the behaviour of the Microsoft Clustering algorithm to some extent, there are limitations, which you are now probably close to. As far as I know, the Microsoft Clustering algorithm does not let you define exactly how clusters are created (for example, their exact centers) or how the final results will be stored in the node structure. To be able to create a clustering model completely to your wishes, I would recommend writing your own clustering plug-in algorithm. Even though creating your own algorithm (writing the code) may complicate things at first, you will then be completely in charge of all the issues you mentioned. <div><br/></div> <div>Best regards</div> <div>Vladimir Cupal</div>
Sun, 12 Jul 2009 15:46:15 Z 2009-07-12T15:46:15Z
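<div><br/></div>
<div>As an illustration of the suggestion earlier in the thread to split the original dataset yourself instead of relying on the random holdout, here is a minimal Python sketch. It is not SSIS or Analysis Services code, and it is only one possible approach: it assigns each case to the training or testing set by hashing a case key, so the same case always lands in the same set on every run. The field name <code>CaseId</code>, the 30 percent test share, and the function names are assumptions made up for this example.</div>

```python
import hashlib


def assign_partition(case_key: str, test_percent: int = 30) -> str:
    """Return 'test' or 'train' for a case, based only on its key.

    Hypothetical helper: the key is hashed and mapped to a bucket 0..99,
    so the assignment is repeatable and does not depend on row order.
    """
    digest = hashlib.md5(case_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "test" if bucket < test_percent else "train"


def split_cases(cases, key_field="CaseId", test_percent=30):
    """Split a list of dict-like cases into (train, test) lists."""
    train, test = [], []
    for case in cases:
        side = assign_partition(str(case[key_field]), test_percent)
        (test if side == "test" else train).append(case)
    return train, test


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    cases = [{"CaseId": i, "Amount": i * 10} for i in range(1, 1001)]
    train, test = split_cases(cases)
    print(len(train), len(test))
```

<div>The two resulting sets could then be loaded as separate sources, with the holdout in the wizard set to 100/0 so that no second, random split is applied. A hash on the key spreads cases roughly evenly, which helps keep both sets representative of the whole; a split by date or by row position is also deterministic but may not be representative.</div>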