locked
Auto-generated INT IDs on a column in Azure SQL Data Warehouse RRS feed

  • Question

  • Hi there,

    I am new to Azure SQL Data Warehouse and I know that IDENTITY is not supported in Azure SQL Data Warehouse, but is there another way to automatically assign an INT value to a column on insert in the same manner? 

    Kind regards,

    Sasha

    Tuesday, October 27, 2015 5:24 PM

Answers

  • Hi Sasha

    yes there is a work around, what you can do is begin by getting the Max RowId from the table, then in the select statement from you source use a row_number() statement to generate the row number, then simply add the MaxRowId to the row_number to create sequential series of RowId's,

    DECLARE @MaxRowId int
    SET @MaxRowId = (SELECT ISNULL(MAX(RowId),0) FROM dbo.MyTable)
    
    
    INSERT INTO MyTable (RowId, col1, col2)
    SELECT
      ROW_NUMBER() OVER(ORDER BY [col1]) + @MaxRowId RowId
      , [col1]
      , [col2]
    FROM dbo.MySourceTable


    Tuesday, October 27, 2015 6:50 PM

All replies

  • Hi Sasha

    yes there is a work around, what you can do is begin by getting the Max RowId from the table, then in the select statement from you source use a row_number() statement to generate the row number, then simply add the MaxRowId to the row_number to create sequential series of RowId's,

    DECLARE @MaxRowId int
    SET @MaxRowId = (SELECT ISNULL(MAX(RowId),0) FROM dbo.MyTable)
    
    
    INSERT INTO MyTable (RowId, col1, col2)
    SELECT
      ROW_NUMBER() OVER(ORDER BY [col1]) + @MaxRowId RowId
      , [col1]
      , [col2]
    FROM dbo.MySourceTable


    Tuesday, October 27, 2015 6:50 PM
  • Hi Michael,

    Thank you for your reply. But wouldn't this approach cause problems during simultaneous inserts by different processes? In addition, selecting max ID prior to insert seems very inefficient to me.

    I think there must be some proper workaround and It would be good to know if this functionality is on the roadmap and if not, it would be really good to know why it isn't.

    Kind regards,

    Sasha

    Wednesday, October 28, 2015 10:30 AM
  • Hi Sasha

    Unfortunately the internal mechanics of having multiple underlying databases (massively multi processor(MMP)) makes supporting something like an identity or even foreign keys incredibly complex which is why we do not see it supported in the product yet and possibly never will.

    Fortunately data warehouse loads are often performed in coordinated batch processes so the approach describe above could be used. If this is not the case for you then you may have to generate the key outside of the data warehouse or use something other than identity keys. 


    Wednesday, October 28, 2015 11:13 AM
  • Hi Michael,

    Thank you for your clarification. One last question. Is it possible to link another Azure SQL Server instance to Azure SQL Data Warehouse to perform queries against that instance?

    Kind regards

    Sasha

    Friday, October 30, 2015 1:08 PM