none
ADLA Databases for multi-tenant question

    General discussion

  • I wanted to get some opinions on how to "best" accommodate a Data Lake design for handling multiple (100 or so) customers with regard to Databases created.  Each customer has custom procedures, functions, views etc. that have been created for them to process their (mostly) unique data files and workflows.  Each customer has their own blob storage container for uploading data which starts the whole process of running U-SQL jobs.

    Question:  would it be better to create an ADLA database per customer (I have created some Schemas to logically organize their data domains), OR is it best to create just ONE ADLA database and create a Schema for each customer?

    If one ADLA database, are there performance, security or other concerns I would need to address?

    Any thoughts, opinions or experiences would be greatly appreciated!


    Bill Blakey

    Tuesday, September 25, 2018 4:13 PM

All replies

  • If it was the same ADLA account and schemas how would you ensure that any one customers data could not be referenced and served by another ?

    • Edited by iahe Thursday, September 27, 2018 10:45 AM
    Thursday, September 27, 2018 10:43 AM
  • Well, data (and resource) access is controlled programmatically thru our API.  So we would either be in context of one database (in the case of one db per customer) or one schema (one multi-tenant shared db) for the customer.  Security and data segmentation is not of concern to us.

    From a physical implementation standpoint, is there a disadvantage in performance, maintainability, etc. to go with one approach vs. the other?


    Bill Blakey

    Thursday, September 27, 2018 3:57 PM
  • It has been a few months since I posted this and wanted to revive this multi-tenant question in case some Microsoft folks could provide some guidance, etc.  I only have 10-12 customers on-board at this point, expected to be > 300 or more.  In the Central US Region we've got a ticket open on slooooow and erratic ADLA job performance times, sometime 20+ minutes to Deploy a 4 or 5 procedure/TV Function database.  Having a 30-50 minute deployment time for all of the ADLA databases is not good (hopefully this will be resolved soon by Microsoft).

    Microsoft folks, what are any known limits to:

    a) number of databases in a single ADLA?

    b) number of compiled objects (procedures, tvfs, views, etc) per Database?

    c) I plan to begin using External Tables with Azure SQL DB, any caveats there from ADLA if ONE multi-tenant database vs. one per each?

    Again, data files are stored in different Blob containers and similarly different root paths in ADLS, and access is controlled via our API.

    Thanks!


    Bill Blakey

    Monday, February 11, 2019 6:12 PM