Regex expression help

  • Question

  • I need help writing a regular expression. I want to find every 3rd comma. Here is the string I want to run the expression on:

    Day,Amount,Direction,3,100,West,5,43,North,5,44,NorthWest,7,54,South,2,5995,East,,,,,,,,,,54,35345,,,543,North,6,,West

    Now, I want a regex to find every third comma, and then I will replace it with a semicolon ";" in my own code. Eventually I want the above string to look like this, with every 3rd comma turned into a semicolon:

    Day,Amount,Direction;3,100,West;5,43,North;5,44,NorthWest;7,54,South;2,5995,East;,;,;,;,;,54;35345,;,543,North;6,,West;


    Thursday, July 28, 2005 5:05 PM

All replies

  • Looking for a pattern based on previous patterns (backreferences) in a regex is rather complex.* Regexes are great and I often use them, but they are definitely more opaque than your normal programming constructs. Troubleshooting is also incredibly difficult if something goes wrong. (Read: no debugger support.) I would recommend writing a normal loop with a counter:

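    int commaCount = 0;
    string output = "";
    foreach (char c in str) {
       if (c == ',') {
          commaCount++;
       }
       if (commaCount == 3) {
          // Third comma since the last reset: write ';' instead.
          output += ';';
          commaCount = 0;
       } else {
          // Any other character is copied through unchanged.
          output += c;
       }
    }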

    WARNING: I haven't tested the above code. Also, if you're going to be building up really long strings, you may get better performance from StringBuilder.

    If you really want to go down the regex route, I would recommend checking out some tutorials on backreferences at http://www.regular-expressions.info and downloading Chris Sells' Regex Designer.NET (http://www.sellsbrothers.com/tools/#regexd).

    * See "Remove Duplicate Items from a String" at http://www.regular-expressions.info/duplicatelines.html for an example of how to remove duplicate values from a comma-separated list and you'll see how complex it can get. You want to do something similar, except changing the commas rather than the items.
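
    For comparison, here is a minimal sketch of the regex route. It is not the approach recommended above, and it does not need backreferences: it assumes it is enough to match two "field plus comma" pairs, one more field, and then the third comma, and to write the captured fields back followed by a semicolon. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class ThirdCommaRegex
{
    // ((?:[^,]*,){2}[^,]*)  captures three fields and the two commas between them
    // ,                     is the third comma, which gets replaced
    private static readonly Regex EveryThirdComma =
        new Regex(@"((?:[^,]*,){2}[^,]*),");

    public static string Replace(string input)
    {
        // "$1" writes the captured fields back unchanged; ';' takes the comma's place.
        return EveryThirdComma.Replace(input, "$1;");
    }
}
```

    For example, ThirdCommaRegex.Replace("a,b,c,d,e,f,g") returns "a,b,c;d,e,f;g". Empty fields are handled too, because [^,]* is allowed to match nothing.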

    Thursday, July 28, 2005 5:40 PM
  • Hey, do you think you can change this to use a for loop instead of a foreach?
    Thursday, July 28, 2005 6:11 PM
  • You could easily use a for loop instead.

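    int commaCount = 0;
    string output = "";
    for (int j = 0; j < str.Length; j++) {
       char c = str[j];
       if (c == ',') {
          commaCount++;
       }
       if (commaCount == 3) {
          // Third comma since the last reset: write ';' instead.
          output += ';';
          commaCount = 0;
       } else {
          // Any other character is copied through unchanged.
          output += c;
       }
    }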

    Note that strings are immutable in .NET, so you can't do the modification in place. For instance, you can't do:

    str[j] = ';';

    That won't compile, because the string indexer is read-only. You could create a StringBuilder and modify that in place, but then you're paying the cost of creating the StringBuilder. From what I remember, StringBuilders start making sense perf-wise if you need to do more than 10 concatenations, but you'll want to evaluate this yourself. If this code isn't inside a tight, frequently called loop in your application, it likely won't matter which implementation you choose.
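
    As a rough sketch of the in-place StringBuilder idea: the input is copied into a StringBuilder once, and each third comma is overwritten through the StringBuilder's writable indexer. The class and method names are hypothetical, and no performance claim is intended.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class ThirdCommaInPlace
{
    public static string Replace(string str)
    {
        // Copy the immutable string into a mutable buffer once.
        StringBuilder buffer = new StringBuilder(str);
        int commaCount = 0;
        for (int j = 0; j < buffer.Length; j++)
        {
            if (buffer[j] == ',')
            {
                commaCount++;
                if (commaCount == 3)
                {
                    // Unlike string, StringBuilder lets you assign through the indexer.
                    buffer[j] = ';';
                    commaCount = 0;
                }
            }
        }
        return buffer.ToString();
    }
}
```

    Because a semicolon replaces a comma one for one, the length never changes, so nothing has to be appended or shifted.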
    Thursday, July 28, 2005 6:16 PM
  • For some reason a lightbulb is covering something on line 4. What was there?
    Thursday, July 28, 2005 6:18 PM
  • Emoticons were changing my [ i ] to Idea. Changed my index to j to avoid the problem.
    Thursday, July 28, 2005 6:23 PM
  • thanks dude!
    Thursday, July 28, 2005 6:27 PM
  • Guess who I am. eggie5
    Saturday, May 27, 2006 10:12 PM