Friday, November 23, 2012 11:34 PM
I've seen people refer to Hadoop as both a "distributed data processing" layer and a "distributed data storage" layer. My question: does it (or rather, will it) replace some or all data warehouse/SSIS projects? Your thoughts, please. Please think about scenarios that are two to three years down the line.
- Changed Type Eileen Zhao (Microsoft Contingent Staff, Moderator) Friday, November 30, 2012 7:30 AM
Saturday, November 24, 2012 6:39 AM
Hadoop is for processing huge (petabyte-scale) volumes of unstructured data, such as web logs and clickstreams.
It is not so good for processing structured data from OLTP systems.
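To make the contrast concrete, here is a minimal sketch of the map/reduce style of processing that Hadoop applies to raw web logs. It is plain Python, not Hadoop API code; the simplified log layout, the sample lines, and the function names (`map_line`, `reduce_counts`, `count_hits`) are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample lines in a simplified access-log layout (an assumption):
# <ip> <method> <path> <status>
SAMPLE_LOG = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /cart 500",
    "malformed line",
    "10.0.0.1 GET /index.html 200",
    "10.0.0.3 POST /cart 200",
]


def map_line(line):
    """Map step: emit (path, 1) for each parsable line, skip the rest."""
    parts = line.split()
    if len(parts) != 4:
        # Unstructured input: tolerate lines that do not fit the layout.
        return []
    _ip, _method, path, _status = parts
    return [(path, 1)]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def count_hits(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return reduce_counts(pairs)


if __name__ == "__main__":
    print(count_hits(SAMPLE_LOG))
```

On a real cluster the same two steps would run in parallel over file blocks spread across many machines, which is why the model suits large volumes of loosely structured text.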
With the introduction of PolyBase (http://www.zdnet.com/microsofts-polybase-mashes-up-sql-server-and-hadoop-7000007424/), Hadoop becomes just another part of the SQL Server stack, and we will use it as just another part of the warehouse. Depending on the structure of the data source, we will use either Hadoop or the old-fashioned approach to gather source data.
- Proposed As Answer by Mudassar_M Saturday, November 24, 2012 7:49 PM
Sunday, November 25, 2012 4:31 AM
Thanks for the insightful reply; this is helpful. Keeping the thread open to see what others think about it as well.