none
Replacing Multiple Strings Using the REPLACE Function RRS feed

  • Question

  • I'm would like to replace all occurrences of "99999" and "-99999" with "" in a column using SSIS.  I can use the REPLACE function in a Derived Column to replace one of those strings, for example: REPLACE(mycolumn,"99999","").  Or to replace both I could use REPLACE(REPLACE(mycolumn,"-99999",""),"99999","").  This seems kind of cumbersome and would get very complicated if I were replacing more strings with "".  I'm guessing there is a better way.  Can anyone help me out?

    Thanks,
    Ridium
    Monday, July 9, 2007 1:03 AM

Answers

  •  Jamie Thomson wrote:

    Ridium wrote:
    I have about 20 non-printable characters I want to scrub from my data. I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

    Thanks for you help,
    Ridium

    Yeah, I think that's what you'll have to do. Why is that such a problem? I honestly can't fathom how this could be less (in your words) "cumbersome". I'm interested in any ideas you may have.

    Regards

    Jamie



    I discovered a more convenient method.  Instead of using all those REPLACE functions, just use an expression:
    mycolumn == "99999" || mycolumn == "-99999" ? NULL(DT_DECIMAL,2) : (DT_CY)mycolumn

    I can just add an additional "or" operation for each new term I want to search for.  This is also less prone to errors.

    Ridium
    Monday, July 9, 2007 6:38 PM
  • Use this instead:

     

    [member_gender]=="M" ? "Male" : [member_gender]=="F" ? "Female" : [member_gender]

     

    The ? indicates a conditional. So you have [condition] ? [true_value] : [false_value]. You have to nest any additional conditions into the true or false part of the first conditional.

     

    Saturday, September 29, 2007 12:53 AM
    Moderator

All replies

  •  Ridium wrote:
    I'm would like to replace all occurrences of "99999" and "-99999" with "" in a column using SSIS.  I can use the REPLACE function in a Derived Column to replace one of those strings, for example: REPLACE(mycolumn,"99999","").  Or to replace both I could use REPLACE(REPLACE(mycolumn,"-99999",""),"99999","").  This seems kind of cumbersome and would get very complicated if I were replacing more strings with "".  I'm guessing there is a better way.  Can anyone help me out?

    Thanks,
    Ridium

     

    There isn't a simpler way, that is exactly how you should do it. Its simple and it works and I don't think its cumbersome at all. Just my opinion.

     

    What syntax do you envisage for a REPLACE function that allows you to replace multiple strings? Also, in your example given above I can envisage it replacing the "99999" part of "-99999" and you being left with "-" which isn't what you want.

     

    -Jamie

     

    Monday, July 9, 2007 4:29 AM
    Moderator
  • I have about 20 non-printable characters I want to scrub from my data.  I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

    Thanks for you help,
    Ridium
    Monday, July 9, 2007 5:08 AM
  •  Ridium wrote:
    I have about 20 non-printable characters I want to scrub from my data.  I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

    Thanks for you help,
    Ridium

     

    Yeah, I think that's what you'll have to do. Why is that such a problem? I honestly can't fathom how this could be less (in your words) "cumbersome". I'm interested in any ideas you may have.

     

    Regards

    Jamie

     

    Monday, July 9, 2007 5:10 AM
    Moderator
  •  Jamie Thomson wrote:

    Ridium wrote:
    I have about 20 non-printable characters I want to scrub from my data. I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

    Thanks for you help,
    Ridium

    Yeah, I think that's what you'll have to do. Why is that such a problem? I honestly can't fathom how this could be less (in your words) "cumbersome". I'm interested in any ideas you may have.

    Regards

    Jamie



    I discovered a more convenient method.  Instead of using all those REPLACE functions, just use an expression:
    mycolumn == "99999" || mycolumn == "-99999" ? NULL(DT_DECIMAL,2) : (DT_CY)mycolumn

    I can just add an additional "or" operation for each new term I want to search for.  This is also less prone to errors.

    Ridium
    Monday, July 9, 2007 6:38 PM
  •  Ridium wrote:
     Jamie Thomson wrote:

    Ridium wrote:
    I have about 20 non-printable characters I want to scrub from my data. I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

    Thanks for you help,
    Ridium

    Yeah, I think that's what you'll have to do. Why is that such a problem? I honestly can't fathom how this could be less (in your words) "cumbersome". I'm interested in any ideas you may have.

    Regards

    Jamie



    I discovered a more convenient method.  Instead of using all those REPLACE functions, just use an expression:
    mycolumn == "99999" || mycolumn == "-99999" ? NULL(DT_DECIMAL,2) : (DT_CY)mycolumn

    I can just add an additional "or" operation for each new term I want to search for.  This is also less prone to errors.

    Ridium

     

    OK, glad you found something that you're happy with. A word of warning though, use parentheses around the first argument to the conditional operator or else you could find yourself in a world of hurt.

     

    Why do you think that is less prone to errors? And when you say "just use an expression", why would using the REPLACE function not constitute using an expression?

     

    Regards

    -Jamie

     

     

    Monday, July 9, 2007 6:50 PM
    Moderator
  • I meant use a conditional expression.  It seems pretty obvious that something like this:

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,B,C),D,E),F,G),H,I),J,K)

    Is a lot more complicated than my solution.  Imagine trying to include 20 or 30 functions.  This is much harder to read and understand.  You could easily lose sight of what parameter goes to what REPLACE function which might cause an error.

    Ridium
    Monday, July 9, 2007 11:50 PM
  •  Ridium wrote:
    I meant use a conditional expression.  It seems pretty obvious that something like this:

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,B,C),D,E),F,G),H,I),J,K)

    Is a lot more complicated than my solution.  Imagine trying to include 20 or 30 functions.  This is much harder to read and understand.  You could easily lose sight of what parameter goes to what REPLACE function which might cause an error.

    Ridium

     

    OK fair enough, can't argue with that. Note what I said about parentheses though!

     

    -Jamie

    Monday, July 9, 2007 11:54 PM
    Moderator
  • How would that OR functionality work when you have multiple if-then statements?  Do I need to go back to the REPLACE function?  For example, I've tried the following and I receive errors

    ([member_gender]=="M" ? "Male" :  [member_gender]) || ([member_gender]=="F" ? "Female" :  [member_gender])

    Thanks,
    Joseph
    Friday, September 28, 2007 2:10 PM
  • Use this instead:

     

    [member_gender]=="M" ? "Male" : [member_gender]=="F" ? "Female" : [member_gender]

     

    The ? indicates a conditional. So you have [condition] ? [true_value] : [false_value]. You have to nest any additional conditions into the true or false part of the first conditional.

     

    Saturday, September 29, 2007 12:53 AM
    Moderator
  • Thank you for taking the time to clearly explain the syntax usage.  Worked wonders...
    Monday, October 1, 2007 8:51 PM