none
SQL: How to extract table names from a SQL query in a column

    Question

  • Hi,

    We have the severak select statements captured in a column of a table. These select statements have several join statements and subqueries in it and there are about 3000 records in it. I want to extract the table names from these select statements.

    Here is an example.

    select a1, a2, a3

    from

    xyz a1

    join abc a2 on (a1.a1=a2.a2)

    join abcd a3 on (a1.a1=a3.a1)

    where a2.col1='123' and a1.col3 in (select col1 from tbl)

     

    Now I want to extract a1, a2, a3 and tbl.

    Any help is appreciated!

    Thanks,

    Suresh.

    Tuesday, August 17, 2010 8:57 PM

Answers

  • Here is a fully functional solution using the FMTONLY option I spoke about.

    USE [AdventureWorks]
    GO
    
    SET NOCOUNT ON;
    GO
    
    ----Create a sample table to simulate a table with TSQL Commands
    --drop table dbo.TSQL_Commands
    CREATE TABLE dbo.TSQL_Commands(Id INT,ObjName VARCHAR(100), tsql_stmt NVARCHAR(MAX));
    INSERT INTO dbo.TSQL_Commands VALUES (1,NULL,'SELECT *
    FROM person.contact c
    INNER JOIN person.[Address] ON [c].[rowguid] = [Address].[rowguid]
    WHERE [ContactID] < 50');
    INSERT INTO dbo.TSQL_Commands VALUES (2,NULL,'SELECT *
    FROM person.contact c
    INNER JOIN person.[Address] ON [c].[rowguid] = [Address].[rowguid]');
    INSERT INTO dbo.TSQL_Commands VALUES (3,NULL,'SELECT *
    FROM [Sales].[SalesOrderHeader] sh
    INNER JOIN sales.[SalesOrderDetail] sd ON [sh].[SalesOrderID] = [sd].[SalesOrderID]
    INNER JOIN sales.[SalesOrderHeaderSalesReason] ON [sd].[SalesOrderID] = [SalesOrderHeaderSalesReason].[SalesOrderID]')
    INSERT INTO dbo.TSQL_Commands VALUES (4,'dbo.uspGetEmployeeManagers','EXECUTE [dbo].[uspGetEmployeeManagers] @EmployeeID=50');
    INSERT INTO dbo.TSQL_Commands VALUES (5,'[dbo].[uspGetBillOfMaterials]','EXECUTE [dbo].[uspGetBillOfMaterials] @StartProductID=500,@CheckDate=''20090203''')
    INSERT INTO dbo.TSQL_Commands VALUES (6,'dbo.vw_test','select * from vw_test')
    GO
    
    --Cursor to execute dynamic sql with fmtonly, so the stmt is not actually executed, but a query plan is built
    DECLARE @sql NVARCHAR(MAX)
    
    DECLARE curExecDynSQL CURSOR LOCAL STATIC FOR
    SELECT tsql_stmt
    FROM dbo.TSQL_Commands
    
    OPEN curExecDynSQL
    FETCH NEXT FROM curExecDynSQL INTO @sql
    
    WHILE @@FETCH_STATUS = 0
      BEGIN
    		--PRINT @sql
    		SET FMTONLY ON;
    		EXEC sp_executesql @sql
    		SET FMTONLY OFF;
    		
    		FETCH NEXT FROM curExecDynSQL INTO @sql
      END
    
    CLOSE curExecDynSQL
    DEALLOCATE curExecDynSQL
    GO
    
    --Cache query to extract tables names from the cached execution plan
    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT 
    	[id],
    	(SELECT [processing-instruction(x)]=[Stmt] FOR XML PATH(''),TYPE) AS [Stmt],
    	[db],
    	[Schema],
    	[Tbl],
    	[Alias]
    FROM(
    SELECT
    	tsql_st.id,
    	SUBSTRING(
    		st.text,
    		(qs.statement_start_offset/2)+1, 
    		(
    		 (
    		 CASE qs.statement_end_offset
    			WHEN -1 THEN DATALENGTH(st.text)
    		 ELSE qs.statement_end_offset
    		 END - qs.statement_start_offset)/2
    		 ) 
    		 + 1) AS Stmt,
    	x.i.value('(OutputList/ColumnReference/@Database)[1]', 'VARCHAR(100)')AS [db],
    	x.i.value('(OutputList/ColumnReference/@Schema)[1]', 'VARCHAR(100)')AS [Schema],
    	x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)')AS [Tbl],
    	x.i.value('(OutputList/ColumnReference/@Alias)[1]', 'VARCHAR(100)')AS [Alias]
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text([sql_handle]) st
    CROSS APPLY sys.dm_exec_query_plan([plan_handle]) qp
    CROSS APPLY qp.query_plan.nodes('//RelOp') x(i)
    INNER JOIN dbo.TSQL_Commands tsql_st 
    	ON tsql_st.[tsql_stmt] COLLATE SQL_Latin1_General_CP1_CI_AS = 
    			SUBSTRING(st.text,(qs.statement_start_offset/2)+1,((
    					 CASE qs.statement_end_offset
    						WHEN -1 THEN DATALENGTH(st.text)
    					 ELSE qs.statement_end_offset
    					 END - qs.statement_start_offset)/2)+ 1) 
    			OR st.[objectid] = object_id(tsql_st.ObjName)
    WHERE 
    	x.i.value('@PhysicalOp', 'NVARCHAR(200)') IN('Table Scan','Index Scan','Clustered Index Scan','Index Seek','Clustered Index Seek')
    	AND EXISTS(
    		SELECT 1
    		FROM sys.tables t
    		WHERE 
    			t.name = REPLACE(REPLACE(x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)'),'[',''),']','')
    			AND T.TYPE = 'U'
    			AND T.is_ms_shipped = 0
    		)
    ) AS x
    GROUP BY 
    	[id],
    	[db],
    	[Schema],
    	[Tbl],
    	[Alias],
    	[Stmt]
    ORDER BY [id] ASC
    • Edited by Adam HainesModerator Wednesday, August 18, 2010 2:05 PM modified code
    • Proposed as answer by Naomi NModerator Wednesday, August 18, 2010 2:12 PM
    • Marked as answer by KJian_ Tuesday, August 24, 2010 6:29 AM
    Wednesday, August 18, 2010 1:35 PM
    Moderator

All replies

  • Well, check

    http://www.sqlparser.com/

    http://3d2f.com/programs/32-941-general-sql-parser-download.shtml

    http://www.sqlusa.com/sqlformat/

    (Google on "SQL syntax parser")

     


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Tuesday, August 17, 2010 9:29 PM
    Moderator
  • That is not possible in TSQL. You would need to have a parser for that which is capable of knowing the model of the database. One option would be to use the parser from Visual Studio Database Projects, this is available along with Visual Studio. You would have to use the API and handcraft that for yourself.

    -Jens 


    Jens K. Suessmeyer http://blogs.msdn.com/Jenss
    Tuesday, August 17, 2010 9:37 PM
    Moderator
  • You can get a little creative with this.  There is no way you are going to parse this with TSQL successfully; however, a better idea is to make sure each of the commands is executed on the server.  You can then query the cache and pull out each table name which is already parsed.

    Edit: to make this solution even better you can use the set fmtonly on; option to each of your statements which tells SQL Server not to execute the code at all, but will build a query plan.

    Here is an example.

    --Create a sample table
    DECLARE @t TABLE(Id INT, tsql_stmt NVARCHAR(MAX));
    INSERT INTO @t VALUES (1,'SELECT *
    FROM person.contact c
    INNER JOIN person.[Address] ON [c].[rowguid] = [Address].[rowguid]
    WHERE [ContactID] < 50');
    
    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT 
    	[id],
    	(SELECT [processing-instruction(x)]=[Stmt] FOR XML PATH(''),TYPE) AS [Stmt],
    	[db],
    	[Schema],
    	[Tbl],
    	[Alias]
    FROM(
    SELECT
    	t.id,
    	text AS Stmt,
    	x.i.value('(OutputList/ColumnReference/@Database)[1]', 'VARCHAR(100)')AS [db],
    	x.i.value('(OutputList/ColumnReference/@Schema)[1]', 'VARCHAR(100)')AS [Schema],
    	x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)')AS [Tbl],
    	x.i.value('(OutputList/ColumnReference/@Alias)[1]', 'VARCHAR(100)')AS [Alias]
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text([sql_handle]) st
    CROSS APPLY sys.dm_exec_query_plan([plan_handle]) qp
    CROSS APPLY qp.query_plan.nodes('//RelOp') x(i)
    INNER JOIN @t t ON t.[tsql_stmt] COLLATE SQL_Latin1_General_CP1_CI_AS = st.text
    WHERE 
    	x.i.value('@PhysicalOp', 'NVARCHAR(200)') IN('Table Scan','Index Scan','Clustered Index Scan','Index Seek','Clustered Index Seek')
    	AND EXISTS(
    		SELECT 1
    		FROM sys.tables t
    		WHERE 
    			t.name = REPLACE(REPLACE(x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)'),'[',''),']','')
    			AND T.TYPE = 'U'
    			AND T.is_ms_shipped = 0
    		)
    ) AS x
    GROUP BY 
    	[id],
    	[db],
    	[Schema],
    	[Tbl],
    	[Alias],
    	[Stmt]
    
    

    http://jahaines.blogspot.com/
    Tuesday, August 17, 2010 10:07 PM
    Moderator
  • Here is a fully functional solution using the FMTONLY option I spoke about.

    USE [AdventureWorks]
    GO
    
    SET NOCOUNT ON;
    GO
    
    ----Create a sample table to simulate a table with TSQL Commands
    --drop table dbo.TSQL_Commands
    CREATE TABLE dbo.TSQL_Commands(Id INT,ObjName VARCHAR(100), tsql_stmt NVARCHAR(MAX));
    INSERT INTO dbo.TSQL_Commands VALUES (1,NULL,'SELECT *
    FROM person.contact c
    INNER JOIN person.[Address] ON [c].[rowguid] = [Address].[rowguid]
    WHERE [ContactID] < 50');
    INSERT INTO dbo.TSQL_Commands VALUES (2,NULL,'SELECT *
    FROM person.contact c
    INNER JOIN person.[Address] ON [c].[rowguid] = [Address].[rowguid]');
    INSERT INTO dbo.TSQL_Commands VALUES (3,NULL,'SELECT *
    FROM [Sales].[SalesOrderHeader] sh
    INNER JOIN sales.[SalesOrderDetail] sd ON [sh].[SalesOrderID] = [sd].[SalesOrderID]
    INNER JOIN sales.[SalesOrderHeaderSalesReason] ON [sd].[SalesOrderID] = [SalesOrderHeaderSalesReason].[SalesOrderID]')
    INSERT INTO dbo.TSQL_Commands VALUES (4,'dbo.uspGetEmployeeManagers','EXECUTE [dbo].[uspGetEmployeeManagers] @EmployeeID=50');
    INSERT INTO dbo.TSQL_Commands VALUES (5,'[dbo].[uspGetBillOfMaterials]','EXECUTE [dbo].[uspGetBillOfMaterials] @StartProductID=500,@CheckDate=''20090203''')
    INSERT INTO dbo.TSQL_Commands VALUES (6,'dbo.vw_test','select * from vw_test')
    GO
    
    --Cursor to execute dynamic sql with fmtonly, so the stmt is not actually executed, but a query plan is built
    DECLARE @sql NVARCHAR(MAX)
    
    DECLARE curExecDynSQL CURSOR LOCAL STATIC FOR
    SELECT tsql_stmt
    FROM dbo.TSQL_Commands
    
    OPEN curExecDynSQL
    FETCH NEXT FROM curExecDynSQL INTO @sql
    
    WHILE @@FETCH_STATUS = 0
      BEGIN
    		--PRINT @sql
    		SET FMTONLY ON;
    		EXEC sp_executesql @sql
    		SET FMTONLY OFF;
    		
    		FETCH NEXT FROM curExecDynSQL INTO @sql
      END
    
    CLOSE curExecDynSQL
    DEALLOCATE curExecDynSQL
    GO
    
    --Cache query to extract tables names from the cached execution plan
    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT 
    	[id],
    	(SELECT [processing-instruction(x)]=[Stmt] FOR XML PATH(''),TYPE) AS [Stmt],
    	[db],
    	[Schema],
    	[Tbl],
    	[Alias]
    FROM(
    SELECT
    	tsql_st.id,
    	SUBSTRING(
    		st.text,
    		(qs.statement_start_offset/2)+1, 
    		(
    		 (
    		 CASE qs.statement_end_offset
    			WHEN -1 THEN DATALENGTH(st.text)
    		 ELSE qs.statement_end_offset
    		 END - qs.statement_start_offset)/2
    		 ) 
    		 + 1) AS Stmt,
    	x.i.value('(OutputList/ColumnReference/@Database)[1]', 'VARCHAR(100)')AS [db],
    	x.i.value('(OutputList/ColumnReference/@Schema)[1]', 'VARCHAR(100)')AS [Schema],
    	x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)')AS [Tbl],
    	x.i.value('(OutputList/ColumnReference/@Alias)[1]', 'VARCHAR(100)')AS [Alias]
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text([sql_handle]) st
    CROSS APPLY sys.dm_exec_query_plan([plan_handle]) qp
    CROSS APPLY qp.query_plan.nodes('//RelOp') x(i)
    INNER JOIN dbo.TSQL_Commands tsql_st 
    	ON tsql_st.[tsql_stmt] COLLATE SQL_Latin1_General_CP1_CI_AS = 
    			SUBSTRING(st.text,(qs.statement_start_offset/2)+1,((
    					 CASE qs.statement_end_offset
    						WHEN -1 THEN DATALENGTH(st.text)
    					 ELSE qs.statement_end_offset
    					 END - qs.statement_start_offset)/2)+ 1) 
    			OR st.[objectid] = object_id(tsql_st.ObjName)
    WHERE 
    	x.i.value('@PhysicalOp', 'NVARCHAR(200)') IN('Table Scan','Index Scan','Clustered Index Scan','Index Seek','Clustered Index Seek')
    	AND EXISTS(
    		SELECT 1
    		FROM sys.tables t
    		WHERE 
    			t.name = REPLACE(REPLACE(x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)'),'[',''),']','')
    			AND T.TYPE = 'U'
    			AND T.is_ms_shipped = 0
    		)
    ) AS x
    GROUP BY 
    	[id],
    	[db],
    	[Schema],
    	[Tbl],
    	[Alias],
    	[Stmt]
    ORDER BY [id] ASC
    • Edited by Adam HainesModerator Wednesday, August 18, 2010 2:05 PM modified code
    • Proposed as answer by Naomi NModerator Wednesday, August 18, 2010 2:12 PM
    • Marked as answer by KJian_ Tuesday, August 24, 2010 6:29 AM
    Wednesday, August 18, 2010 1:35 PM
    Moderator