# The meaning of importance and probability in association rules

### Question

• Hi,
I have an exercise using association data mining.
My database has 350 records.
I used 90 records for mining, and the model produced some rules, from which I chose the ones with the highest MSOLAP_NODE_SCORE.
But when I use SELECT statements to check the result, only 1 record agrees with the rule and 5 records contradict it.
For example:
rule: A=a, B=b -> C=c
select * from <my_table> where A='a' and B='b' and C='c'; ==>1 record return
select * from <my_table> where A='a' and B='b' and C<>'c'; ==>5 records return
C has 3 possible values: c1, c2, c.
In the result of the second statement, C is c1 in 2 records and c2 in 3 records.

I don't understand how the rules work.
I want to choose the best rules to represent my database.
How should I set importance and probability to get the best rules?
For a database with 90 records and a database with 350 records, which values should I use for Minimum_Probability, Minimum_Support, Minimum_Importance...?
When I choose rules, should I choose by importance or by probability?

Thursday, April 12, 2007 2:16 AM

• I'm really having trouble understanding your question and what you want to see as a result.

Minimum_Support is simply how many times an event has to happen to be counted.  For example, if you set Minimum_Support to 10, then the "itemset" Aa,Bb,Cc would have to occur together at least 10 times before it was counted at all.  If you set Minimum_Support to 0.1, then it would have to occur together in 10% of all cases.
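As a rough illustration of the two ways of reading the threshold described above, here is a minimal Python sketch. The function name `passes_minimum_support` and the counts are made up for the example; this is not how Analysis Services is implemented internally.

```python
def passes_minimum_support(itemset_count, total_cases, minimum_support):
    """Return True if an itemset occurs often enough to be counted.

    Assumption for this sketch: a threshold below 1 is read as a fraction
    of all cases, and a threshold of 1 or more as an absolute count.
    """
    if minimum_support < 1:
        required = minimum_support * total_cases  # e.g. 0.1 of 90 cases = 9
    else:
        required = minimum_support
    return itemset_count >= required


# Hypothetical counts on a 90-case training set
frequent_as_fraction = passes_minimum_support(12, 90, 0.1)  # 12 >= 9
frequent_as_count = passes_minimum_support(5, 90, 10)       # 5 < 10
print(frequent_as_fraction, frequent_as_count)
```

With only 90 training cases, an absolute threshold of 10 is already more than 11% of the data, so on small data sets a fractional threshold can be easier to reason about.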

Minimum_Probability is the minimum ratio allowed for something to become a "rule".  For example, if Minimum_Probability was set to 0.4 (40%) and Aa, Bb appeared 10 times in your data, then Aa, Bb, Cc would have to appear at least 4 times in the data for Aa, Bb -> Cc to be considered a rule.  (Note that your Minimum_Support would also have to allow for the "itemset" to be counted at all.)
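The ratio can be checked by hand from the two counts. The sketch below uses the 4-out-of-10 example from this reply and the counts from the two SELECT statements in the question (1 matching record, 5 non-matching); `rule_probability` is a hypothetical helper name, and the second calculation assumes those counts describe the data the model was trained on.

```python
def rule_probability(lhs_and_rhs_count, lhs_count):
    """Share of the cases matching the left-hand side that also match the right-hand side."""
    return lhs_and_rhs_count / lhs_count


minimum_probability = 0.4

# Example from the reply: Aa,Bb appears 10 times, Aa,Bb,Cc appears 4 times
p_example = rule_probability(4, 10)

# Counts from the question: 1 record with C='c', 5 records with C<>'c'
p_question = rule_probability(1, 1 + 5)

print(p_example, p_example >= minimum_probability)
print(round(p_question, 3), p_question >= minimum_probability)
```

On those counts the rule A=a, B=b -> C=c holds in roughly 17% of the matching records, so it would not pass a 0.4 threshold on that data.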

Minimum_Importance is a calculation that further filters rules based on the amount of lift they provide. Its purpose is to filter out tautologies, e.g. "Everybody buys milk, so Cookies->Milk is true with 100% probability".  This rule is not important, since <anything>->Milk would also have 100% probability.
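To make the milk example concrete, here is a small sketch that computes plain lift, i.e. the rule's probability divided by the overall probability of the right-hand side. The baskets are invented, and plain lift is used only for illustration: the exact importance formula the algorithm applies to rules may differ (for instance, by taking a logarithm of a ratio), so treat the numbers as indicative.

```python
# Made-up baskets in which everybody buys Milk
baskets = [
    {"Milk", "Cookies"},
    {"Milk", "Bread"},
    {"Milk", "Cookies", "Bread"},
    {"Milk"},
]


def probability(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    hits = sum(1 for basket in baskets if items <= basket)
    return hits / len(baskets)


def lift(lhs, rhs, baskets):
    """Rule probability P(rhs | lhs) divided by the baseline P(rhs)."""
    rule_prob = probability(lhs | rhs, baskets) / probability(lhs, baskets)
    return rule_prob / probability(rhs, baskets)


cookies_to_milk = lift({"Cookies"}, {"Milk"}, baskets)
bread_to_milk = lift({"Bread"}, {"Milk"}, baskets)
print(cookies_to_milk, bread_to_milk)
```

Both rules have 100% probability, yet their lift is 1.0: knowing that a basket contains Cookies or Bread tells you nothing new about Milk. A rule is only interesting when this ratio is clearly above 1.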

HTH

-Jamie

Tuesday, April 17, 2007 6:39 PM

### All replies

• Nobody has helped me...
I need an answer.
Saturday, April 14, 2007 1:30 AM
• Help me.
Can anybody help me?
Thanks for your interest in my question.
Saturday, April 14, 2007 4:12 PM