Why use Data Mining? What can I do with Data Mining?

    Question

  • Hello,

    I'm new to this amazing world of data mining, so I think the best way to start off correctly is to ask the experts.

    What could I get out of data mining? I have implemented a BI solution with SSAS, and I have my cube with its dimensions, but what more could I do with this? I have invoices, sales, account movements, etc.


    Thanks !!!
    Monday, September 22, 2008 10:01 AM

Answers

  • Hello

     

    Jamie also responded to your question with a series of three links.  However, I think it is best to take a shot at answering your question in words.

     

    Data mining has different definitions, and the one I like is looking for useful patterns in data.  You described a standard financial analysis scenario, and there data mining can provide information on transaction patterns and on the transactors (the people who purchase, perhaps vendors or customers).
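
    To make "useful patterns" a little more concrete, here is a minimal sketch in plain Python (not the Microsoft tooling).  It assumes each invoice is just a list of product names; the sample invoices and the function name pair_support are made up for illustration.  It reports how often two products appear on the same invoice, which is the kind of "bought together" pattern that association-style algorithms look for at much larger scale.

```python
from collections import Counter
from itertools import combinations


def pair_support(invoices):
    """Return {(item_a, item_b): fraction of invoices containing both items}."""
    total = len(invoices)
    if total == 0:
        return {}
    counts = Counter()
    for items in invoices:
        # Count each unordered pair at most once per invoice.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n / total for pair, n in counts.items()}


# Made-up invoices, for illustration only.
invoices = [
    ["printer", "paper", "toner"],
    ["printer", "toner"],
    ["paper", "pens"],
    ["printer", "paper", "toner"],
]

support = pair_support(invoices)
# Show the pairs that occur together most often.
for pair, share in sorted(support.items(), key=lambda kv: -kv[1])[:3]:
    print(pair, round(share, 2))
```

    With real data you would read the invoice lines from your sales tables instead of a hard-coded list, and a production tool would also handle volume, filtering, and significance for you.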

     

    Microsoft data mining has come to mean the specific algorithms in the Microsoft SQL Server software.  Some of the component algorithms have been variously classified under predictive analytics, statistical analysis, or even multivariate analysis (the term I used in graduate school).  To many people, data mining implies automated or semi-automated assistance from computer servers, where the software intelligently searches for better models.  For Microsoft, I recommend looking only at the versions related to SQL Server 2005 and 2008; the latter is better.

     

    As a starting point, I would like you to look at the Excel interface to SQL Server.  That technology requires Excel 2007 and access to SQL Server 2005 or 2008.  You will need to work with a SQL Server expert to set up a working demo or production environment.  The cost of such an expert is justified because these tools were made for high-volume production data mining.

     

    There are many data mining products available (free and otherwise).  The advantages of the Microsoft Data Mining tools include:

    • Tight integration with a world-class database (SQL Server), which lets data mining leverage the performance, security, and optimization features of that database platform.  This matters because SQL Server is one of the world's leading databases.
    • Programmability through Microsoft's development languages, meaning that you can have a team of developers integrate data mining into your current business intelligence solution.  The integration works best with Windows, but because the interface can be web-based, your BI solution need not be Windows-based to integrate these tools.  This matters because Microsoft is itself a creator of languages.
    • Production-quality use and output.  You can return to this forum and ask challenging production-related questions.  I don't see many posted these days because (in my opinion) data modeling still carries a cultural norm of strong individuals leading the way (John Wayne in old westerns, or among modern superheroes, your favorite DC or Marvel character).  This technology already implies team ownership of data mining, and we can expect future iterations to be designed with teams working on projects in mind.  Those newer cultural norms come from the SQL Server and business intelligence cultures, and I believe they are a welcome addition to how statistical analysts have worked.  We need our heroes, but we need teams too.

    Data mining is an active research field, and you could spend years reading peer-reviewed articles and textbooks on different aspects of the topic.  The field has been dominated by Ph.D.-level people, and there is a lot of careful thought behind not only the algorithms but also the statistical philosophies of analysis and synthesis.  Jamie provided a few links leading in those directions, but I would rather have you just study Microsoft Data Mining:

    Wednesday, September 24, 2008 4:43 AM

All replies

  • Thanks, Mark and Jamie! A great post, Mark!
    Wednesday, September 24, 2008 7:53 AM