I've seen people refer to Hadoop as both the "distributed data processing" layer and the "distributed data storage" layer. My question: does it (or rather, will it) replace some or all data warehouse/SSIS projects? What are your thoughts? Please consider scenarios that are two to three years down the line.
- Changed type Eileen Zhao Friday, November 30, 2012 7:30 AM
Hadoop is for processing huge (petabyte-scale) unstructured data sources, such as web logs and clickstreams.
It is not as well suited to processing structured data from OLTP systems.
With the introduction of PolyBase (http://www.zdnet.com/microsofts-polybase-mashes-up-sql-server-and-hadoop-7000007424/), Hadoop becomes just another part of the SQL Server platform, and we will use it as just another part of the warehouse. Depending on the structure of the data source, we will use either Hadoop or the old-fashioned approach to gather source data.
- Proposed as answer by Mudassar_M Saturday, November 24, 2012 7:49 PM