Comparison between Orange, R, RapidMiner, SAS and Enterprise Miner.
-
Friday, June 15, 2012 1:20 AM
Hi,
Is there any comparison of the data mining tools that I've mentioned above? I would like to know the differences between them.
Thanks in advance
Regards,
KK
- Edited by Stfyou Friday, June 15, 2012 1:24 AM
All Replies
-
Friday, June 15, 2012 6:16 AM Answerer
You can find information about the popularity of different tools here: http://www.kdnuggets.com/polls/2011/tools-analytics-data-mining.html
If you can describe what you would like to do, people might be able to recommend a better tool for your needs.
It is difficult to describe the differences between them all because there are so many factors to consider.
Tatyana Yakushev [PredixionSoftware.com]
-
Friday, June 15, 2012 6:52 AM
Which software provides better features in terms of 1) scalability, 2) power and flexibility, 3) how well the tool accesses and manages the data, 4) how user-friendly the graphical interface is, and 5) visualization?
I've done some research and found that RapidMiner is better than the other three tools.
-
Friday, June 15, 2012 4:38 PM Answerer
1. What scalability are you looking for? Do you want to create a few models on terabytes of data, or do you expect to create a large number of medium-sized models?
2. RapidMiner and R are good.
3. What do you mean by "manage the data"? Machine learning packages typically don't "manage data". Are you looking for tools that are good at data preparation (cleaning, sampling, etc.)?
4. My subjective opinion is that the Microsoft SQL Server add-ins for Excel and Predixion Insight have the easiest-to-use GUIs while still being very powerful.
5. Predixion Insight. Check this out: http://www.youtube.com/watch?v=rgz3LZRS_AY
How important is each of those 5 items for you?
Are you choosing a tool for your scientific research or for your business? Are you limited by budget, choice of operating system, etc.?
Tatyana Yakushev [PredixionSoftware.com]
-
Saturday, June 16, 2012 1:37 PM
I'm currently evaluating the four tools listed above for business use.
1) Scalability: which tools support multi-user access, and which can support mining very large databases?
2) Yes, I mean tools that are better at data preparation (cleansing, sampling, etc.). I also wish to know which tools can pass rules directly to OLAP tools and receive data for mining from OLAP tools.
3) Also, which tools can access a database directly, which need to extract a sample, and which can do both? And which tools can access a data warehouse directly?
4) Which tool has the least restrictive size constraints in terms of the maximum number of rows or records?
-
Tuesday, June 19, 2012 1:39 AM Answerer
I am not familiar enough with all the software packages listed above to give you competent answers. You should ask questions about SAS, R, etc. on other discussion boards. I can only give you answers about Predixion Insight.
1. The Predixion server is implemented using a cloud architecture, so to handle requests from more users you just need to add more machines to the cloud. The Predixion controller (which is part of the Predixion server) does the load balancing. You can either use the cloud hosted by Predixion or set up your own cloud (most Predixion customers choose to use their own cloud).
The Predixion server currently works on top of Microsoft Analysis Services, which can create models on data that has fewer than roughly 1M rows. Predixion Software is working on making it run on top of other data mining platforms. When using Mahout (a data mining package for Hadoop), it can create models on data of any size. (You can read the press release here)
2. I am not sure which software is the best for data preparation. Predixion Insight has basic data exploration and cleaning tasks, but SAS and R have more.
3. Predixion Insight can consume data from PowerPivot. PowerPivot extracts data from any database that has an OLE DB or ODBC provider. Future versions of Predixion Insight will be able to work with data in a warehouse without extracting it. Predixion has already done this for Hive (Hadoop) and Greenplum.
4. Many data mining vendors are currently working to make their algorithms work on "big data". Currently, different vendors have different sets of algorithms modified to work on big data. Do you know which algorithms you would like to run on "big data"?
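On the question of extracting a sample when a tool has a row limit (such as the roughly 1M rows mentioned above): one generic, tool-independent approach is reservoir sampling, which draws a uniform random sample from a table of unknown size in a single pass. The Python sketch below is only an illustration and is not part of Predixion Insight or any of the tools discussed; the function name `reservoir_sample` and its parameters are made up for this example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k rows chosen uniformly at random from a stream of rows.

    Hypothetical helper for illustration: it reads the rows once and keeps
    at most k of them in memory, so the full table never has to be loaded.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Row i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


# Example: sample 100 "rows" from a stream of 10,000 row ids.
sample = reservoir_sample(range(10_000), k=100, seed=42)
print(len(sample))
```

In practice the `rows` argument could be any iterator, for example a database cursor, so the sample can be taken while the data is being read from the source.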
Tatyana Yakushev [PredixionSoftware.com]
- Proposed As Answer by Eileen Zhao (Microsoft Contingent Staff, Moderator) Friday, June 22, 2012 6:34 AM
- Marked As Answer by Eileen Zhao (Microsoft Contingent Staff, Moderator) Tuesday, July 03, 2012 9:58 AM

