locked
Improving Performance of large table RRS feed

  • Question

  • Hi,

    I need inputs in improving query performance.

    The query will read a staging table and dynamically generate insert scripts for each field (which will join corresponding table and finds if the value is different in staging table) and inserts changed into a temp table.

    The staging tables are huge close to 100 millions records and differences 

    Below is query:

    DECLARE @sourcename VARCHAR(500) = 'ABC,BCD,CDA,AB,DCA,OCIM,LRS'
    			
    --Converting comma parameter value into list of values
    SET @sourcename = '''' + REPLACE(@sourcename,',',''',''') + ''''
    
    IF EXISTS (SELECT Name FROM sysindexes WHERE Name = 'ix_STG_DM_STG_ACCOUNTUDP_SI')
    BEGIN
    	DROP INDEX ix_STG_DM_STG_ACCOUNTUDP_SI ON STG.DM_STG_ACCOUNTUDP_SI
    END
    
    CREATE CLUSTERED INDEX ix_STG_DM_STG_ACCOUNTUDP_SI ON STG.DM_STG_ACCOUNTUDP_SI(AccountId,UDPNAME,UDPFieldName)
    			
    --Getting the list of UDP and field list combination which is in the DM staging table into a temp table
    IF OBJECT_ID('TEMPDB..#STG_ACCNTSI_UDP_FIELD_LIST') IS NOT NULL
    BEGIN
    	DROP TABLE #STG_ACCNTSI_UDP_FIELD_LIST
    END
    CREATE TABLE #STG_ACCNTSI_UDP_FIELD_LIST(
    	ID INT IDENTITY(1,1),
    	UDPNAME NVARCHAR(128),
    	UDPFieldName NVARCHAR(128),
    	UDPFieldDatatype NVARCHAR(30)
    )
    DECLARE @TEMP VARCHAR(MAX)
    SELECT @TEMP = 
    'INSERT INTO #STG_ACCNTSI_UDP_FIELD_LIST(' + CHAR(13) + CHAR(10) +
    '	UDPNAME,' + CHAR(13) + CHAR(10) +
    '	UDPFieldName,' + CHAR(13) + CHAR(10) +
    '	UDPFieldDatatype' + CHAR(13) + CHAR(10) +
    ')' + CHAR(13) + CHAR(10) +
    'SELECT DISTINCT STG.UDPNAME,STG.UDPFieldName,STG.UDPFieldDatatype' + CHAR(13) + CHAR(10) +
    'FROM STG.DM_STG_ACCOUNTUDP_SI STG' + CHAR(13) + CHAR(10) +
    'INNER JOIN agncy_dfnd_tbl ADT ON STG.UDPNAME = ADT.agncy_dfnd_tbl_nm' + CHAR(13) + CHAR(10) +
    'INNER JOIN agncy_dfnd_clmn ADC ON STG.UDPFieldName = ADC.agncy_dfnd_clmn_nm' + CHAR(13) + CHAR(10) +
    '	AND ADT.agncy_dfnd_tbl_id = ADC.agncy_dfnd_tbl_id' + CHAR(13) + CHAR(10) +
    'WHERE STG.Source IN (' + @sourcename + ')'
    
    EXEC(@TEMP)
    
    --Temp table to hold the DM data for the accounts for which the UDP exists
    IF OBJECT_ID('TEMPDB..#dminputdataasi') IS NOT NULL
    BEGIN
    	DROP TABLE #dminputdataasi
    END
    CREATE TABLE #dminputdataasi(
    	AccountId NVARCHAR(20) NOT NULL,
    	Source NVARCHAR(20) NOT NULL,
    	UDPName NVARCHAR(100) NOT NULL,
    	UDPFieldName NVARCHAR(100) NOT NULL,
    	UDPFieldDatatype NVARCHAR(30) NOT NULL,
    	UDPValue NVARCHAR(MAX)
    )
    
    CREATE CLUSTERED INDEX ix_dminputdataasi ON #dminputdataasi(AccountId,UDPName,UDPFieldName)
    
    DECLARE @START INT
    DECLARE @END INT
    DECLARE @SQL NVARCHAR(MAX)
    DECLARE @UDPNAME NVARCHAR(128)
    DECLARE @UDPFieldName NVARCHAR(128)
    DECLARE @UDPFieldDatatype NVARCHAR(30)
    
    SET @START = 1
    SELECT @END = COUNT(*) FROM #STG_ACCNTSI_UDP_FIELD_LIST
    
    --For every record in the #STG_ACCNTSI_UDP_FIELD_LIST, this script will get the value for the account that is in DM staging and the UDP field exists in DM and load into temp table dynamically
    --For LOV field this will join with udf_val_lst and get the actual value for comparision against the DM staging table
    WHILE(@START <= @END)
    BEGIN
    	SELECT @UDPNAME = UDPNAME,
    		@UDPFieldName = UDPFieldName,
    		@UDPFieldDatatype = UDPFieldDatatype
    	FROM #STG_ACCNTSI_UDP_FIELD_LIST WHERE ID = @START
    
    	IF(@UDPFieldDatatype <> 'lov')
    	BEGIN	
    		SET @SQL = '' + 
    			'INSERT INTO #dminputdataasi(AccountId,Source,UDPName,UDPFieldName,UDPFieldDatatype,UDPValue)' + CHAR(13) + CHAR(10) +
    			'SELECT CA.cnsmr_accnt_idntfr_lgcy_txt,STG.Source,STG.UDPNAME,STG.UDPFieldName,STG.UDPFieldDatatype,' + CHAR(13) + CHAR(10) +
    			CASE WHEN @UDPFieldDatatype = 'text_type' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(MAX))'
    				WHEN @UDPFieldDatatype = 'date_type' THEN 'CONVERT(NVARCHAR(10),UDP.' + @UDPFieldName + ',120)'
    				WHEN @UDPFieldDatatype = 'decimal_type' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(50))'
    				WHEN @UDPFieldDatatype = 'integer_type' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(50))'
    				WHEN @UDPFieldDatatype = 'biginteger_type' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(50))'
    				WHEN @UDPFieldDatatype = 'boolean' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(50))'
    				WHEN @UDPFieldDatatype = 'percentage' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(50))'
    				WHEN @UDPFieldDatatype = 'lov' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(100))'
    				WHEN @UDPFieldDatatype = 'OTHER' THEN 'CAST(UDP.' + @UDPFieldName + ' AS NVARCHAR(100))'
    			END + CHAR(13) + CHAR(10) +
    			'FROM STG.DM_STG_ACCOUNTUDP_SI STG' + CHAR(13) + CHAR(10) +
    			'INNER JOIN cnsmr_accnt CA ON STG.AccountID = CA.cnsmr_accnt_idntfr_lgcy_txt' + CHAR(13) + CHAR(10) +
    			'INNER JOIN ' + @UDPNAME + ' UDP ON CA.cnsmr_accnt_id = UDP.cnsmr_accnt_id' + CHAR(13) + CHAR(10) +
    			'WHERE STG.UDPNAME = ''' + @UDPNAME + ''''  + CHAR(13) + CHAR(10) +
    			'AND STG.UDPFieldName = ''' + @UDPFieldName + ''''  + CHAR(13) + CHAR(10) +
    			'AND STG.Source IN (' + @sourcename + ')' + CHAR(13) + CHAR(10) + ' AND ' +
    			CASE WHEN @UDPFieldDatatype = 'text_type' THEN 'ISNULL(' + @UDPFieldName + ','''')'
    				WHEN @UDPFieldDatatype = 'date_type' THEN 'ISNULL(' + @UDPFieldName + ',''1900-01-01'')'
    				WHEN @UDPFieldDatatype = 'decimal_type' THEN 'ISNULL(' + @UDPFieldName + ',0.00)'
    				WHEN @UDPFieldDatatype = 'integer_type' THEN 'ISNULL(' + @UDPFieldName + ',-1)'
    				WHEN @UDPFieldDatatype = 'biginteger_type' THEN 'ISNULL(' + @UDPFieldName + ',0)'
    				WHEN @UDPFieldDatatype = 'boolean' THEN 'ISNULL(' + @UDPFieldName + ',''N'')'
    				WHEN @UDPFieldDatatype = 'percentage' THEN 'ISNULL(' + @UDPFieldName + ',-1.00)'
    				WHEN @UDPFieldDatatype = 'lov' THEN 'ISNULL(' + @UDPFieldName + ','''')'
    				WHEN @UDPFieldDatatype = 'OTHER' THEN 'ISNULL(' + @UDPFieldName + ',-1)'
    			END + ' <> ' +
    			CASE WHEN @UDPFieldDatatype = 'text_type' THEN 'ISNULL(UDPTEXTValue,'''')'
    				WHEN @UDPFieldDatatype = 'date_type' THEN 'ISNULL(UDPDATEValue,''1900-01-01'')'
    				WHEN @UDPFieldDatatype = 'decimal_type' THEN 'ISNULL(UDPDECIMALValue,0.00)'
    				WHEN @UDPFieldDatatype = 'integer_type' THEN 'ISNULL(UDPINTValue,-1)'
    				WHEN @UDPFieldDatatype = 'biginteger_type' THEN 'ISNULL(UDPBIGINTValue,0)'
    				WHEN @UDPFieldDatatype = 'boolean' THEN 'ISNULL(UDPBOOLEANValue,''N'')'
    				WHEN @UDPFieldDatatype = 'percentage' THEN 'ISNULL(UDPPERCENTAGEValue,-1.00)'
    				WHEN @UDPFieldDatatype = 'lov' THEN 'ISNULL(UDPLOVValue,'''')'
    				WHEN @UDPFieldDatatype = 'OTHER' THEN 'ISNULL(UDPOTHER,-1)'
    			END
    
    	END
    
    	IF(@UDPFieldDatatype = 'lov')
    	BEGIN	
    		SET @SQL = '' + 
    			'INSERT INTO #dminputdataasi(AccountId,Source,UDPName,UDPFieldName,UDPFieldDatatype,UDPValue)' + CHAR(13) + CHAR(10) +
    			'SELECT CA.cnsmr_accnt_idntfr_lgcy_txt,STG.Source,STG.UDPNAME,STG.UDPFieldName,STG.UDPFieldDatatype,' + CHAR(13) + CHAR(10) +
    			'uv1.udf_val_txt AS UDPfieldvalue' + CHAR(13) + CHAR(10) +
    			'FROM STG.DM_STG_ACCOUNTUDP_SI STG' + CHAR(13) + CHAR(10) +
    			'INNER JOIN cnsmr_accnt CA ON STG.AccountID = CA.cnsmr_accnt_idntfr_lgcy_txt' + CHAR(13) + CHAR(10) +
    			'INNER JOIN ' + @UDPNAME + ' UDP ON CA.cnsmr_accnt_id = UDP.cnsmr_accnt_id' + CHAR(13) + CHAR(10) +
    			'INNER JOIN udf_val_lst uv1 ON UDP.' + @UDPFieldName + ' = uv1.udf_val_list_id' + CHAR(13) + CHAR(10) +
    			'WHERE STG.UDPNAME = ''' + @UDPNAME + ''''  + CHAR(13) + CHAR(10) +
    			'AND STG.UDPFieldName = ''' + @UDPFieldName + ''''  + CHAR(13) + CHAR(10) +
    			'AND STG.Source IN (' + @sourcename + ')' + CHAR(13) + CHAR(10) +
    			'AND ISNULL(' + @UDPFieldName + ',0) <> ISNULL(udf_val_list_id,0) '
    	END
    	EXEC(@SQL)
    
    	SET @START = @START + 1
    END

    Which will generate query like this and inserts:

    INSERT INTO #dminputdataasi(AccountId,Source,UDPName,UDPFieldName,UDPFieldDatatype,UDPValue)
    SELECT CA.cnsmr_accnt_idntfr_lgcy_txt,STG.Source,STG.UDPNAME,STG.UDPFieldName,STG.UDPFieldDatatype,
    CAST(UDP.UDEFAccount_Name AS NVARCHAR(MAX))
    FROM STG.DM_STG_ACCOUNTUDP_SI STG
    INNER JOIN cnsmr_accnt CA ON STG.AccountID = CA.cnsmr_accnt_idntfr_lgcy_txt
    INNER JOIN UDP1 UDP ON CA.cnsmr_accnt_id = UDP.cnsmr_accnt_id
    WHERE STG.UDPNAME = 'UDP1'
    AND STG.UDPFieldName = 'UDEFAccount_Name'
    AND STG.Source IN ('ABC','BCD','CDA','AB','DCA','OCIM','LRS')
    AND ISNULL(UDEFAccount_Name,'') <> ISNULL(UDPTEXTValue,'')
    INSERT INTO #dminputdataasi(AccountId,Source,UDPName,UDPFieldName,UDPFieldDatatype,UDPValue)
    SELECT CA.cnsmr_accnt_idntfr_lgcy_txt,STG.Source,STG.UDPNAME,STG.UDPFieldName,STG.UDPFieldDatatype,
    CONVERT(NVARCHAR(10),UDP.UDEFAccount_Open_Date,120)
    FROM STG.DM_STG_ACCOUNTUDP_SI STG
    INNER JOIN cnsmr_accnt CA ON STG.AccountID = CA.cnsmr_accnt_idntfr_lgcy_txt
    INNER JOIN UDP1 UDP ON CA.cnsmr_accnt_id = UDP.cnsmr_accnt_id
    WHERE STG.UDPNAME = 'UDP1'
    AND STG.UDPFieldName = 'UDEFAccount_Open_Date'
    AND STG.Source IN ('ABC','BCD','CDA','AB','DCA','OCIM','LRS')
    AND ISNULL(UDEFAccount_Open_Date,'1900-01-01') <> ISNULL(UDPDATEValue,'1900-01-01')
    INSERT INTO #dminputdataasi(AccountId,Source,UDPName,UDPFieldName,UDPFieldDatatype,UDPValue)
    SELECT CA.cnsmr_accnt_idntfr_lgcy_txt,STG.Source,STG.UDPNAME,STG.UDPFieldName,STG.UDPFieldDatatype,
    CAST(UDP.UDEFAllocation_code AS NVARCHAR(MAX))
    FROM STG.DM_STG_ACCOUNTUDP_SI STG
    INNER JOIN cnsmr_accnt CA ON STG.AccountID = CA.cnsmr_accnt_idntfr_lgcy_txt
    INNER JOIN UDP1 UDP ON CA.cnsmr_accnt_id = UDP.cnsmr_accnt_id
    WHERE STG.UDPNAME = 'UDP1'
    AND STG.UDPFieldName = 'UDEFAllocation_code'
    AND STG.Source IN ('ABC','BCD','CDA','AB','DCA','OCIM','LRS')
    AND ISNULL(UDEFAllocation_code,'') <> ISNULL(UDPTEXTValue,'')
    INSERT INTO #dminputdataasi(AccountId,Source,UDPName,UDPFieldName,UDPFieldDatatype,UDPValue)
    SELECT CA.cnsmr_accnt_idntfr_lgcy_txt,STG.Source,STG.UDPNAME,STG.UDPFieldName,STG.UDPFieldDatatype,
    CAST(UDP.UDEFFee_Due AS NVARCHAR(50))
    FROM STG.DM_STG_ACCOUNTUDP_SI STG
    INNER JOIN cnsmr_accnt CA ON STG.AccountID = CA.cnsmr_accnt_idntfr_lgcy_txt
    INNER JOIN UDP2 UDP ON CA.cnsmr_accnt_id = UDP.cnsmr_accnt_id
    WHERE STG.UDPNAME = 'UDP2'
    AND STG.UDPFieldName = 'UDEFFee_Due'
    AND STG.Source IN ('ABC','BCD','CDA','AB','DCA','OCIM','LRS')
    AND ISNULL(UDEFFee_Due,0.00) <> ISNULL(UDPDECIMALValue,0.00)

    Appreciate your inputs.

    Regards,

    Ram.

    • Edited by Eswararao C Wednesday, October 23, 2019 9:49 AM
    Wednesday, October 23, 2019 8:28 AM

All replies

  • Any Inputs on the performance improvement?

    All the join tables have correct indexing, I am trying to generate dynamic query for each table and field combination in the staging table and compare against the actual table/field (which is in stg table) and find the difference and insert into temp table.

    Regards,

    Ram.

    • Edited by Eswararao C Thursday, October 24, 2019 3:27 AM
    Thursday, October 24, 2019 3:27 AM
  • You didn't specify what is slow - the insert, the select, etc.  Take out the INSERT statement, and just run the SELECT statement.  Is that fast?  If so, then I would guess your tempdb is autogrowing.  If the SELECT statement is slow, then you need to look at query plans.  

    Thursday, October 24, 2019 5:57 AM
  • Hi Eswararao C,

    Thank you for your issue .

    Firstly , please try to execute SELECT query in SSMS and check the performance . Or could you please share us execution plan?In SELECT clause , you use many INNER JOIN and 'AND STG.Source IN ('ABC','BCD','CDA','AB','DCA','OCIM','LRS') AND ISNULL(UDEFAllocation_code,'') <> ISNULL(UDPTEXTValue,'')'. These conditions might not use index . They might cause poor performance.

    And then , you INSERT the result into #dminputdataasi. I mentioned that there is an index in table #dminputdataasi, it also might cause poor performance. 

    Could you please share us more information? By the way , without your actual environment, I could not provide you a correct solution ,so I will give you some advice . Please try.

    Best Regards,

    Rachel 


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, October 24, 2019 9:36 AM