Data Obfuscation

  • Question

  • Hi,

    Is anyone using DBPro for data obfuscation, i.e. scrambling or masking existing data, or generating large volumes of dummy data?

    I would like to hear about your experiences as we are looking at evaluating DBPro against an off-the-shelf Data Obfuscation tool.
    Monday, November 9, 2009 2:47 PM

Answers

  • The Data Generation tool can be used to generate "large amounts of dummy data."
    Duke Kamstra - Program Manager - VSTS Database Edition (Data Dude, DBPro)
    • Marked as answer by wBob Friday, November 13, 2009 11:07 AM
    Tuesday, November 10, 2009 8:48 PM
    Moderator
  • As mentioned, the Data Generators can do everything you're asking. It is possible to scramble / de-identify production data with a generator.

    The risk with using production data is that it doesn't cover all of your test cases. You might want to consider analysing the problem and producing your own test data first, then, as a final stage, de-identifying the production data and using that, knowing that you've covered all bases.

    The standard generators are quite good, but it's pretty straightforward to write your own: derive a class from the Generator class, override the GetNextValues method, and that's pretty much it. I find that the standard generators aren't specific enough. You'll probably want generators for addresses, phone numbers, or names: the sort of thing where a random string would suffice for some of the data, but a more specific, realistic value would be better.

    Hope that helps,

    Martin.
    MCSD, MCTS, MCPD. Please mark my post as helpful if you find the information good! http://www.consultantvault.com
    • Marked as answer by wBob Friday, November 13, 2009 11:07 AM
    Friday, November 13, 2009 8:34 AM
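
To make the "derive from a generator base class and override one method" idea above concrete, here is a minimal sketch in Python. It is not the DBPro API: the `Generator` base class, the `get_next_values` method name, the `PhoneNumberGenerator` subclass, and the `scramble` helper are all hypothetical names chosen for illustration. It only shows the general pattern of a type-specific generator next to a generic random-string one, plus one simple way to mask an existing value while keeping its shape.

```python
import random
import string


class Generator:
    """Hypothetical base class standing in for a tool's generator base type."""

    def __init__(self, seed=0):
        # A fixed seed makes the generated data repeatable between runs.
        self.rng = random.Random(seed)

    def get_next_values(self):
        # Subclasses override this to produce the next value for a column.
        raise NotImplementedError


class RandomStringGenerator(Generator):
    """Generic generator: a string of random lowercase letters."""

    def __init__(self, length=10, seed=0):
        super().__init__(seed)
        self.length = length

    def get_next_values(self):
        letters = string.ascii_lowercase
        return "".join(self.rng.choice(letters) for _ in range(self.length))


class PhoneNumberGenerator(Generator):
    """More specific generator: values shaped like '(NNN) 555-01NN'."""

    def get_next_values(self):
        area = self.rng.randint(200, 999)
        line = self.rng.randint(0, 99)
        return f"({area}) 555-01{line:02d}"


def scramble(value, rng):
    """Mask a value: swap each letter or digit for a random one of the same
    kind, leaving punctuation and spacing untouched so the format survives."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    phones = PhoneNumberGenerator(seed=1)
    for _ in range(3):
        print(phones.get_next_values())
    print(scramble("Jane Doe, 020-7946", random.Random(1)))
```

The seeded `random.Random` instance is a design choice for this sketch: it lets a test database be regenerated with identical contents. Note that character-level scrambling like this hides the original value but is not a reversible or collision-free mapping, so it may not suit key columns.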
