Remove whitespace between quotes

  • Question

  • Hello. I'm having some difficulty putting together a Regex.Replace call to solve my problem and was hoping for a little help. I'm working on a project that parses space-delimited files line by line. I need to remove all whitespace between double quotes. Here are two samples of the data I'm working with:

     

    Here is how the data looks before the replace:

    184523 GET http://www.web1.com/css/test.php?126 - "text/css;charset: UTF-8" - "ps" "-""

    179660 GET http://www.web2.org/Acoustic+SRS.pdf - "application/pdf;name=Acoustic SRS.pdf" - "ed" "-""

    ***Note the whitespace inside the first quoted field of each line (after "charset:" and between "Acoustic" and "SRS.pdf")

     

    Here is how I need the data to look after the replace:

     

    184523 GET http://www.web1.com/css/test.php?126 - "text/css;charset:UTF-8" - "ps" "-""

    179660 GET http://www.web2.org/Acoustic+SRS.pdf - "application/pdf;name=AcousticSRS.pdf" - "ed" "-""

    ***There is no longer any whitespace inside the quoted fields

     

    Any assistance that you could provide would be greatly appreciated.

    Wednesday, March 12, 2008 3:21 PM

Answers

  • For every line in the file ("line" is a string variable containing the line read from the file):

    line = Regex.Replace(line, "\"[^\"]*?\"", delegate(Match m) { return m.Value.Replace(" ",""); });

    or, if the whitespace can include tabs and other whitespace characters:

    line = Regex.Replace(line, "\"[^\"]*?\"", delegate(Match m) { return Regex.Replace(m.Value,"\\s+",""); });



    -Philippe

    Wednesday, March 12, 2008 9:54 PM
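
For anyone who wants to try the approach above on the sample data, here is a minimal, self-contained sketch. The class name QuoteWhitespace and the helper StripInsideQuotes are hypothetical names chosen for illustration, and the input is one of the sample lines from the question. It assumes quotes pair up from left to right, so the stray trailing quote at the end of the line is left untouched.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteWhitespace
{
    // One double-quoted run: opening quote, any non-quote characters, closing quote.
    private static readonly Regex Quoted = new Regex("\"[^\"]*\"");

    // Hypothetical helper: removes all whitespace inside each quoted run.
    // Text outside the quotes is returned unchanged.
    public static string StripInsideQuotes(string line)
    {
        return Quoted.Replace(line, m => Regex.Replace(m.Value, @"\s+", ""));
    }

    public static void Main()
    {
        string line = "179660 GET http://www.web2.org/Acoustic+SRS.pdf - " +
                      "\"application/pdf;name=Acoustic SRS.pdf\" - \"ed\" \"-\"\"";

        // Regex.Replace returns a new string, so the result has to be kept.
        Console.WriteLine(StripInsideQuotes(line));
    }
}
```

Because strings are immutable in .NET, the replaced text only exists in the return value; the original variable is not changed in place.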

All replies

  • Philippe - Thank you very much, this was extremely helpful! I made a couple of modifications, since I wrote the parser in VB, and replaced some of the quotes with Chr(34) in my declaration:

     

    Function ParseFile 'Excerpt from my parsing function

    Dim myRegex As String

    myRegex = "\" & Chr(34) & "[^\" & Chr(34) & "]*?\" & Chr(34)

    Dim Reg As Regex = New Regex(myRegex)
    Dim myEvaluator As MatchEvaluator = New MatchEvaluator(AddressOf ReplaceSpace)

     

    For Each line As String In lines
     row = dt.NewRow()
     line = RTrim(line)
     line = Reg.Replace(line, myEvaluator)
     row.ItemArray = line.Split(" "c)
     dt.Rows.Add(row)
    Next line

     

    End Function

     

    Public Function ReplaceSpace(ByVal m As Match) As String
     ' Remove the spaces from the quoted text that was matched
     Return m.Value.Replace(" ", "")
    End Function

     

    Thursday, March 13, 2008 3:45 PM
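
To illustrate why the replace has to happen before the split on spaces, here is a rough C# sketch of the same sequence (trim, replace, split). The class name FieldSplit and the helper SplitFields are hypothetical, and the sketch assumes fields are separated by exactly one space, as in the sample data.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldSplit
{
    // Hypothetical helper: strips whitespace inside quoted runs, then splits on spaces.
    public static string[] SplitFields(string line)
    {
        string cleaned = Regex.Replace(
            line.TrimEnd(),
            "\"[^\"]*\"",
            m => Regex.Replace(m.Value, @"\s+", ""));
        return cleaned.Split(' ');
    }

    public static void Main()
    {
        string line = "184523 GET http://www.web1.com/css/test.php?126 - " +
                      "\"text/css;charset: UTF-8\" - \"ps\" \"-\"\"";

        // Splitting the raw line breaks the quoted content type into two fields.
        Console.WriteLine(line.Split(' ').Length);

        // After the replace, the quoted content type stays in a single field.
        Console.WriteLine(SplitFields(line).Length);
    }
}
```

With the sample line, the raw split yields one more field than the cleaned split, which is the extra column the replace is meant to prevent.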