Data motion - how to avoid pre-deployment scripts for multiple versions conflicting with each other


  • Hi there,

    We are looking at using the approach outlined by Barclay Hill to handle data motion as our DB schema evolves, here is part 1 and 2.

    This looks excellent, but leaves me with a burning question I was hoping someone here could help me with.  Barclay doesn't seem to have been active since 2011 otherwise I would have commented on the blog.

    Basically, he describes a really robust method for create pre/post deployment scripts such that:
    a) They are self-contained and re-runnable.
    b) They can live in the project permanently for the reference of all developers on the project.
    c) They can be used to reliably upgrade any existing "version" of the database up to the latest.

    In his example, he wants to make a breaking change to the Customers table in version 3 which involves backing up the Customers table in pre-deploy, then deleting the data.  This allows the schema change to apply without raising errors about data loss, because he already deleted the data!  In post-deploy, he then copies the data back into the new Customers table, massaging where necessary.

    I can see how this all works nicely, but my question is, what if we need to make a breaking change to the Customers table in version 4??  In this hypothetical scenario, my version 4 pre-deploy script will be running AFTER the version 3 pre-deploy, but BEFORE the schema change and BEFORE the version 3 post-deploy. Essentially, there is already a half-complete data motion process for Customers and it seems to me that there's no simple way to handle this by adding additional pre- and post-deployment scripts.

    For the version 4 pre-deployment process, I can't be satisfied that the copy of Customers taken during version 3 pre-deployment is complete because it omits columns added during version 3 deployment.  I can't take another copy of Customers because it was just cleared during version 3 pre-deployment!

    The only way I can think of to address this when adding a version 4 breaking change to Customers is to go back and re-visit the version 3 pre-deployment script and make different decisions about what happens, e.g. back up additional columns.  This then puts me in the unconmfortable position of changing old scripts which have been tested then executed in production, introducing the risk of me breaking something that was previously working.

    Any thoughts would be most welcome.



    Wednesday, December 12, 2012 10:52 PM

All replies

  • I'd recommend taking a look at build/deployment contributors.  They provide the ability to modify the SSDT build and deployment pipeline directly.

    Monday, September 02, 2013 9:24 PM
  • Thanks psirr,

    I should have come back to this post earlier.  My thinking has moved on a lot since I posted it.

    In the interim we've set up a complete deployment pipeline that incorporates Migrator.NET for deploying schema changes and managing data motion.  This is actually working really well for us so I'm reluctant to use SSDT database projects as part of our ongoing development for the moment.

    I like its ease of integration into source control, the ability to treat the database schema declaratively, and the tools for scripting up data.

    However, for our sort of long-lived, single live database instance project, the deployment mechanism just doesn't seem to fit for live releases.  I never got to the point of trying to get SSDT projects building/deploying from the build server, but I read others posting that it's a headache without a full-scale install of Visual Studio on the build server.

    Migrator.NET seems to be very well suited to our deployment pipeline, we've got full control over its own source code, and it has a very light footprint.  I note that Entity Framework supports migrations, so as we embrace MVC more fully, EF migrations are probably a no-brainer for us.

    Where SSDT does come in handy for us is more for infrastructure/DBA type operations, like comparing instances of database schemas, setting up an empty shell database, or sucking data out into scripts.

    If/when it gets good support for the generation of test data we may look at it again - but we would want this to have a similarly light footprint - easily powershell scriptable etc.



    Thursday, September 12, 2013 1:09 AM