locked
How to create a test environment with just a subset of production data? RRS feed

  • Question

  • We have a Data warehouse server with some pretty large production databases on it.  Collectively about 12TB of data across ~10 databases.  What I would like to do is create a test environment of this but don't want all 12TB of data.  What I am looking for is a way to get a copy of all ~10 db's onto the new test server but only bring over say 25% of the data for testing purposes.  Is there a tool to help me do this and only pull out a portion of the records and not all of them without having to write a seperate query for every table in every database? 
    Friday, August 17, 2018 3:52 AM

Answers

  • Hi Jason,

    In your scenario, you want to copy 10 DBs to a new test environment server. However, you only want to copy 25% of the data of them since the original data in 10 DBs is about 12 TB. Right?

    Per my understanding to achieve your requirement. I think you can try this:

    1. Use Generate Scripts to copy 10 DBs to the new test environment.

    2. Use SSIS sampling transformation (Percentage Sampling Transformation) to copy the data from these 10 DBs. 

    The Percentage Sampling transformation creates a sample data set by selecting a percentage of the transformation input rows. The sample data set is a random selection of rows from the transformation input, to make the resultant sample representative of the input.

    For more details, please refer: Data Sampling in SQL Server Integration Services 

    https://docs.microsoft.com/en-us/sql/integration-services/data-flow/transformations/percentage-sampling-transformation?view=sql-server-2017 

    Thanks,
    Xi Jin.


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Friday, August 17, 2018 6:36 AM

All replies

  • Hi Jason,

    In your scenario, you want to copy 10 DBs to a new test environment server. However, you only want to copy 25% of the data of them since the original data in 10 DBs is about 12 TB. Right?

    Per my understanding to achieve your requirement. I think you can try this:

    1. Use Generate Scripts to copy 10 DBs to the new test environment.

    2. Use SSIS sampling transformation (Percentage Sampling Transformation) to copy the data from these 10 DBs. 

    The Percentage Sampling transformation creates a sample data set by selecting a percentage of the transformation input rows. The sample data set is a random selection of rows from the transformation input, to make the resultant sample representative of the input.

    For more details, please refer: Data Sampling in SQL Server Integration Services 

    https://docs.microsoft.com/en-us/sql/integration-services/data-flow/transformations/percentage-sampling-transformation?view=sql-server-2017 

    Thanks,
    Xi Jin.


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Friday, August 17, 2018 6:36 AM
  • That is correct. Thanks Xi.  I have not used/heard of that transform.  I will look into it, at inital glance it looks to be just what I need!

    Friday, August 17, 2018 2:47 PM
  • Hi Jason,

    If our replies are helpful to you, please kindly mark them as answers. By doing so, it will benefit all community members who are facing similar issues. Your contribution is highly appreciated.

    If you still have any question, please feel free to ask.

    Thanks,
    Xi Jin.


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.


    • Edited by Xi Jin Monday, August 20, 2018 4:54 AM
    Monday, August 20, 2018 1:22 AM