none
ODS DB Design/Architecture - Data Loading and Retrieving Challenge RRS feed

  • Question

  • Good Afternoon,

    I'm trying to build a ODS (Operation Data Store) in SQL Sever and eventually it will be migrated to SQL Azure as a PaaS.

    I'm thinking of different options and wondering which one would be ideal from the performance and scalability stand point. Also, wondering if there are any other better design approaches to build this ODS.

    So basically, the ODS is about storing the previous day's customers loan information and currently there are 800k loan accounts across 180 clients. Data will be loaded into the ODS database on a nightly basis and the loading process should finish seemlessly as the customers queries will hit the database all the time and should be performant. Hypothetically, one query from each customer which means 800k queries at a point of time.

    Let's play with one table for now (ofcourse it's going to be some 40 tables). So the options I'm thinking are something like this:

    1. Sharding option - Create one database for each 10 clients loan info which means 18 databases for all 180 clients and maintain seperate instance for each database meaning multiple nodes. I'm just thinking out loud, don't even know if it's even worth for this amount of data.

    2. Partition option -  Create one database and table and partition the table either on ClientID, LoanNo or just on ClientID.  

    Challenges:

    1. Seemless loading process - So, with the above options I can prepare my nightly data in the staging table and once the data is ready I can just do the partition switch from staging to live tables.
    2. Data retrieving -  Always data will be retrieved per LoanNo and thinking to create a table like OLAP table. Meaning all the required fields per each application call should come from one table without doing any joins on the other tables.


    Please let me know if anything doesn't make a sense. Thanks in advance!

    Tuesday, August 13, 2019 6:16 PM

All replies