Data Reconciliation and Audit in Azure Data Lake

  • Question

  • Hello,

    I am working on an ETL project where I have to clean data by applying business rules.

    The rules table looks like this:

    RULES TABLE
    UNIQUE_ID | RULES
    --------- | ---------------------
    1         | NULL CHECK
    2         | LENGTH CHECK
    3         | REGEX CHECK
    4         | UNIQUE CHECK
    5         | ONLY CHARACTER ALLOW

    The data looks like this:

    DATA
    COLA | COLB            | COLC                    | COLD | COLE | COLF
    ---- | --------------- | ----------------------- | ---- | ---- | ----
    AX   | A.3M4032X12.5MM | MP ADH.3M 4032 x 12.5mm | ZT   | Y    | 0
    NULL | A.3M4032X12.5MM | MP ADH.3M 4032 x 12.5mm | ZT   | Y    | A
    AX   | A.3M4936PX25MM  | MP Adh.3M 4936P a 25mm  | ZT   | Y    | 0
         | A-K12216BC      | NULL                    | 05   | N    | 0
    AX   | A-1-1/16-B      | A-1-1/16 END            | 05   | Y    | 0

    I am trying to create a reconciliation audit table or JSON file that shows which business rule failed on which column.

    Please suggest what the structure of the file or table should be, and how to implement it using SQL.

    Thanks a lot, your suggestion is highly appreciated.
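    One possible shape for the audit table is one row per (source row, column, rule) failure. The sketch below uses an in-memory SQLite database purely for illustration; the table name, column names, and the sample row key are assumptions, not anything prescribed in this thread, and the same DDL works with minor type changes in most SQL engines.

```python
import sqlite3

# In-memory SQLite database, used only to illustrate the schema (assumption:
# names like RECON_AUDIT and ROW_KEY are invented for this sketch).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One audit row per (source row, column, rule) failure.
cur.execute("""
    CREATE TABLE RECON_AUDIT (
        AUDIT_ID     INTEGER PRIMARY KEY,
        ROW_KEY      TEXT,     -- identifier of the source row that failed
        COLUMN_NAME  TEXT,     -- column the rule was applied to
        RULE_ID      INTEGER,  -- UNIQUE_ID from the RULES table
        RULE_NAME    TEXT,     -- e.g. 'NULL CHECK'
        FAILED_VALUE TEXT,     -- offending value, stored as text
        AUDIT_TS     TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

# Example failure: COLA is NULL on the second data row above.
cur.execute(
    "INSERT INTO RECON_AUDIT (ROW_KEY, COLUMN_NAME, RULE_ID, RULE_NAME, FAILED_VALUE) "
    "VALUES (?, ?, ?, ?, ?)",
    ("row-2", "COLA", 1, "NULL CHECK", None),
)
conn.commit()

for row in cur.execute("SELECT ROW_KEY, COLUMN_NAME, RULE_NAME FROM RECON_AUDIT"):
    print(row)
```

    Joining this audit table back to the source on ROW_KEY gives the reconciliation view of which rule failed on which column for each row.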


    • Edited by Jamiechales Thursday, October 31, 2019 3:46 PM
    Thursday, October 31, 2019 11:02 AM

All replies

  • ETL in ADF is done via Data Flows.
    Thursday, October 31, 2019 2:42 PM
  • Can I use SSIS to push data to Data Lake? The real challenge I am facing is designing the reconciliation audit; I am clueless about how to implement it. It would be very helpful if you could share your expertise. I am free to use any tool or technology.

    As of now I am thinking of doing it in either SSIS or Python.

    Thanks a ton, dataflowuser.


    • Edited by Jamiechales Thursday, October 31, 2019 3:45 PM added comment
    Thursday, October 31, 2019 3:44 PM
  • Hello Jamiechales. While it may be possible to do those things in ADF, I agree it would probably be easier to do them in SSIS, Python, or Azure Databricks.

    Azure Databricks gives you the power of distributed computing, with the ability to do interactive Python. Databricks can also connect to many different stores; I forget whether that includes Data Lake Gen1.
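    Whichever tool ends up running the pipeline, the rule checks themselves can be sketched in plain Python. Everything below is an assumption for illustration, not something prescribed in this thread: the function names, the column-to-rule mapping, the length limit, and the regex pattern are all invented, and the cross-row UNIQUE CHECK (rule 4) is omitted because it needs the whole column at once rather than a single value.

```python
import json
import re

# Rule implementations keyed by the UNIQUE_ID values from the RULES table.
# Each returns True when the value PASSES the rule.
def null_check(value):
    return value is not None and value != ""

def length_check(value, max_len=20):        # max_len is an assumed limit
    return value is None or len(value) <= max_len

def regex_check(value, pattern=r"^[A-Za-z0-9 ./-]+$"):  # assumed pattern
    return value is None or re.match(pattern, value) is not None

def only_character_allow(value):
    return value is None or value.isalpha()

RULES = {
    1: ("NULL CHECK", null_check),
    2: ("LENGTH CHECK", length_check),
    3: ("REGEX CHECK", regex_check),
    5: ("ONLY CHARACTER ALLOW", only_character_allow),
}

# Which rules apply to which columns -- an assumed mapping for illustration.
COLUMN_RULES = {"COLA": [1, 5], "COLB": [2, 3]}

def audit_rows(rows):
    """Return one audit record per (row, column, rule) failure."""
    failures = []
    for row_no, row in enumerate(rows, start=1):
        for col, rule_ids in COLUMN_RULES.items():
            for rule_id in rule_ids:
                rule_name, check = RULES[rule_id]
                if not check(row.get(col)):
                    failures.append({
                        "row": row_no, "column": col,
                        "rule_id": rule_id, "rule": rule_name,
                        "value": row.get(col),
                    })
    return failures

# First two rows of the sample data from the question.
sample = [
    {"COLA": "AX", "COLB": "A.3M4032X12.5MM"},
    {"COLA": None, "COLB": "A.3M4032X12.5MM"},
]
print(json.dumps(audit_rows(sample), indent=2))
```

    Writing that `json.dumps` output to a file in the Data Lake would give the JSON flavour of the audit the question asks about, with one record per rule failure per column.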

    Tuesday, November 5, 2019 1:40 AM