How to choose the right model for deployment?
Model A = considerably more accurate than Model B on the training data
Model B = more accurate than Model A on the validation data
So which one would you consider for final deployment?
Why?
I was thinking that Model B is better.
All Replies
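You cannot rely on Model A. Being considerably more accurate on the training data but less accurate on the validation data suggests it is overfitting.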
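To make that comparison concrete, here is a minimal Python sketch (plain Python, not Analysis Services code) that scores two models on training and validation data and prefers the one with the better validation accuracy. The labels and predictions are made-up toy values, and `accuracy`, `summarize`, and `pick_for_deployment` are hypothetical helper names, not part of any library.

```python
def accuracy(predicted, actual):
    """Fraction of cases where the prediction matches the known label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty data set")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def summarize(name, train_pred, train_actual, valid_pred, valid_actual):
    """Training accuracy, validation accuracy, and the gap between them."""
    train_acc = accuracy(train_pred, train_actual)
    valid_acc = accuracy(valid_pred, valid_actual)
    return {"model": name, "train": train_acc,
            "validation": valid_acc, "gap": train_acc - valid_acc}


def pick_for_deployment(summaries):
    """Prefer the model with the highest validation accuracy."""
    return max(summaries, key=lambda s: s["validation"])["model"]


# Made-up toy labels and predictions, for illustration only.
train_actual = ["yes", "no", "yes", "no", "yes", "no", "yes", "no", "yes", "no"]
valid_actual = ["yes", "no", "yes", "no", "yes"]

# Model A reproduces the training labels exactly but slips on unseen cases.
a_train = list(train_actual)
a_valid = ["yes", "yes", "no", "no", "yes"]

# Model B misses two training cases but holds up on the validation data.
b_train = ["yes", "no", "yes", "no", "yes", "no", "yes", "no", "no", "yes"]
b_valid = ["yes", "no", "yes", "no", "no"]

summaries = [
    summarize("A", a_train, train_actual, a_valid, valid_actual),
    summarize("B", b_train, train_actual, b_valid, valid_actual),
]
for s in summaries:
    print(s)

chosen = pick_for_deployment(summaries)
print("Deploy:", chosen)
```

A large positive gap (training accuracy well above validation accuracy) is the warning sign here; the validation score is the better estimate of how the model will behave on new data.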
As for Model B, have you tried training it multiple times with different parameter settings and different algorithms? Preparing the training data is the most time-consuming part of any DM project, so you can collect all the cases in the training data that Model B itself predicted incorrectly and see whether you can tune it. Most of the time, a 10 percent sample of the training data should represent the universe. However, the distribution of classes in your training table will sometimes influence the prediction too.
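As a rough illustration of collecting the missed training cases and checking the class distribution, here is a small Python sketch. It assumes the training cases are available as dictionaries with a `label` field; `model_b_predict` is a hypothetical stand-in for Model B, and the data is invented for the example.

```python
from collections import Counter


def misclassified_cases(cases, predict, label_key="label"):
    """Return the cases whose predicted label differs from the known label."""
    return [c for c in cases if predict(c) != c[label_key]]


def class_distribution(cases, label_key="label"):
    """Share of each class among the given cases."""
    counts = Counter(c[label_key] for c in cases)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def model_b_predict(case):
    """Hypothetical stand-in for Model B: a simple income threshold."""
    return "buyer" if case["income"] >= 50 else "non-buyer"


# Invented training table, for illustration only.
training_cases = [
    {"id": 1, "income": 80, "label": "buyer"},
    {"id": 2, "income": 65, "label": "buyer"},
    {"id": 3, "income": 30, "label": "non-buyer"},
    {"id": 4, "income": 45, "label": "buyer"},
    {"id": 5, "income": 20, "label": "non-buyer"},
    {"id": 6, "income": 55, "label": "non-buyer"},
    {"id": 7, "income": 40, "label": "buyer"},
    {"id": 8, "income": 90, "label": "buyer"},
]

errors = misclassified_cases(training_cases, model_b_predict)
print("Missed case ids:", [c["id"] for c in errors])
print("Classes in training table:", class_distribution(training_cases))
print("Classes among missed cases:", class_distribution(errors))
```

If one class is over-represented among the missed cases compared with the training table as a whole, that class is a reasonable place to start reviewing or rebalancing the training data.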
hth,
Rok
So are you saying that I should check Model B first and make sure that the training data is fine-tuned before I can use it?
Any recommendations on how I can fine-tune it? Or have you actually experienced this before? Care to share? Thanks.
I would check a couple of things:
1. Try training with different parameters (e.g., disable feature selection, or try different values for the input/output/state attribute settings) and see if it improves your accuracy.
Regarding "Any recommendation how can I fine-tune?":
1. Validate your training data (fine-tune): collect the cases where Model B couldn't predict the training data. Depending on your requirements, try to consolidate them with other similar cases, or even try training without those cases, and see if it improves your prediction.
2. Try a different algorithm to see whether it can predict the cases that Model B couldn't, and try a waterfall approach, where you use different algorithms and different models to improve your results.
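One possible reading of the waterfall approach in item 2 is a cascade: ask the first model, and only fall through to the next one when the first is not confident enough. The Python sketch below assumes each model can return a confidence alongside its prediction (in Analysis Services that might come from something like DMX's PredictProbability, but that mapping is an assumption here). `model_b`, `model_c`, and the thresholds are hypothetical.

```python
def waterfall_predict(case, stages, default=None):
    """Try each stage in order and return the first confident answer.

    stages is a list of (name, predict, min_confidence) tuples, where
    predict(case) returns a (label, confidence) pair.
    """
    for name, predict, min_confidence in stages:
        label, confidence = predict(case)
        if confidence >= min_confidence:
            return label, name
    return default, None


def model_b(case):
    """Hypothetical first model: confident only at the income extremes."""
    if case["income"] >= 80:
        return "buyer", 0.90
    if case["income"] <= 30:
        return "non-buyer", 0.90
    return "buyer", 0.55


def model_c(case):
    """Hypothetical second model, trained with a different algorithm."""
    if case["owns_home"]:
        return "buyer", 0.80
    return "non-buyer", 0.75


stages = [("model_b", model_b, 0.70), ("model_c", model_c, 0.70)]

clear_case = {"income": 90, "owns_home": False}
unclear_case = {"income": 50, "owns_home": False}

print(waterfall_predict(clear_case, stages))
print(waterfall_predict(unclear_case, stages))
```

The order of the stages matters: the model you trust most on the bulk of the data goes first, and later stages only see the cases it hands down.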
It all depends on your requirements and how you can combine the power of Analysis Services data mining with your own analysis to get the ball rolling. I have had cases where I trained my models hundreds of times on different machines and spent months tweaking the training data when I couldn't see any improvement on certain classes.
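When you retrain many times, it helps to track accuracy per class rather than a single overall number, so you can see which classes are not improving. Here is a minimal Python sketch under that assumption; the labels and the two "runs" are invented, and `per_class_accuracy` and `stalled_classes` are hypothetical helper names.

```python
from collections import defaultdict


def per_class_accuracy(predicted, actual):
    """Accuracy computed separately for each actual class."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for p, a in zip(predicted, actual):
        totals[a] += 1
        if p == a:
            hits[a] += 1
    return {label: hits[label] / totals[label] for label in totals}


def stalled_classes(before, after, min_gain=0.01, target=1.0):
    """Classes still below target whose accuracy gained less than min_gain."""
    return sorted(
        label for label in before
        if before[label] < target
        and after.get(label, 0.0) - before[label] < min_gain
    )


# Invented validation labels and predictions from two training runs.
actual = ["a", "a", "b", "b", "c", "c"]
run_1 = ["a", "b", "b", "b", "a", "a"]
run_2 = ["a", "a", "b", "b", "a", "a"]

before = per_class_accuracy(run_1, actual)
after = per_class_accuracy(run_2, actual)
print("Run 1:", before)
print("Run 2:", after)
print("No improvement on:", stalled_classes(before, after))
```

Classes that stay stalled across many runs are candidates for a closer look at the training cases themselves, or for a different algorithm.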
hth,
Rok
- Marked As Answer by Jin Chen MSFT, Moderator, Wednesday, November 11, 2009 9:52 AM
- Proposed As Answer by rok1, Tuesday, November 03, 2009 6:40 PM
Thanks Rok.
I will see what I can do.


