How do I delete unwanted whitespaces between words in C#?


  • Wednesday, April 11, 2012 21:44
     
     

    Hi,

    I have whitespaces between words in a text file.  I need to replace them with '|' (pipe delimiter).  How do I do this?

    Thanks.


    Marilyn Gambone

All Replies

  • Wednesday, April 11, 2012 22:08

     Proposed Answer

    Hi,

    You could use the Replace method:

    http://msdn.microsoft.com/en-us/library/system.string.replace(v=vs.100).aspx

    Regards,

      Thorsten


  • Thursday, April 12, 2012 01:18

      Contains Code

    You could loop through and use replace, or have a bit of LINQ do it for you as well :)

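    string str = "Hi!! this is a bunch of text with spaces";
    // Requires using System.Linq; note that this filters the spaces out rather than replacing them with '|'.
    MessageBox.Show(new String(str.Where(c => c != ' ').ToArray()));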

    Cheers :)
    ~Ace


    If a post helps you in any way or solves your particular issue, please remember to use the Propose As Answer option or Vote As Helpful
    Visit the Forum: TechLifeForum


  • Thursday, April 12, 2012 01:34

     Proposed Answer Contains Code

    In the same way, you could use the Regex class:

    string str = "a b c d";
    str = Regex.Replace(str, " ", "|");

    Regex becomes available once you add the namespace: using System.Text.RegularExpressions;


    Mitja

    • Proposed as Answer by Cor Ligthert (MVP) Thursday, April 12, 2012 07:07
  • Thursday, April 12, 2012 04:24

      Contains Code

    Or, more simply, use the Replace method of the string class:

    string str = "a b c d";
    str = str.Replace(' ', '|');


    Mitja

    Works, that method was already posted though so I tried creating a different solution :)

    Edit: Your Regex Replace method is a new solution for OP though :)

    Cheers
    ~Ace




  • Thursday, April 12, 2012 04:42



    Oops, you're right, I see now that Thorsten already posted this kind of solution.

    I will delete it now.

    By the way, your solution does not work correctly.


    Mitja

  • Thursday, April 12, 2012 05:35

      Contains Code

    Hello,

    You can use the Replace method to delete unwanted spaces. Refer to these links:

    http://social.msdn.microsoft.com/Forums/en/csharpgeneral/thread/f88af5ea-8334-4439-9aa7-2932c3c2fbb9

    http://www.dotnetperls.com/regex-replace

    Or use this code:

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Windows.Forms;
    using System.Text.RegularExpressions;
    
    namespace ntier1
    {
        public partial class Form3 : Form
        {
            public Form3()
            {
                InitializeComponent();
            }
    
            private void Form3_Load(object sender, EventArgs e)
            {
                String word = "Mich  el  le Acc  uso-Ste  ve  ns";
                word = Regex.Replace(word, " ", "!");
                label1.Text = word.ToString();
            }
        }
    }

    Regards,

    Tarun singh
    Disclaimer: This posting is provided "AS IS" with no warranties or guarantees, and confers no rights.


    • Edited by Tarun00007 Thursday, April 12, 2012 05:36
  • Thursday, April 12, 2012 05:54
     
     

    Hi,

    Try this:

              using System.IO;

              // Read the whole text file
              string strread;
              using (StreamReader sr = new StreamReader(@"D:\textFile.txt"))
              {
                  strread = sr.ReadToEnd();
              }

              // Replace every space with a pipe and write the file back
              strread = strread.Replace(' ', '|');
              using (StreamWriter sw = new StreamWriter(@"D:\textFile.txt"))
              {
                  sw.Write(strread); // Write, not WriteLine, so no extra newline is appended
              }


    PS.Shakeer Hussain

  • Thursday, April 12, 2012 07:27

     Proposed Answer

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;

    namespace ConsoleApplication1
    {
        public class Program
        {
            static void Main(string[] args)
            {
                // Provide the path of your text file
                StreamReader so = new StreamReader(@"C:\GkText.txt");
                string gk = so.ReadToEnd();
                so.Close();

                // Replace every space with a pipe and write the file back
                gk = gk.Replace(' ', '|');
                StreamWriter dw = new StreamWriter(@"C:\GkText.txt");
                dw.WriteLine(gk);
                dw.Close();
            }
        }
    }

    • Proposed as Answer by GrtSanGo Thursday, April 12, 2012 10:21
  • Thursday, April 12, 2012 08:06
     
     

    string.Replace() is definitely the right solution here.

    Other solutions - including regex - are going to be much less efficient.

  • Thursday, April 12, 2012 10:03
     
     
    Do you want to replace each whitespace with a pipe, or blocks of whitespaces with one pipe?
  • Thursday, April 12, 2012 10:57



    What do you mean by whitespace? Does it include tabs, newlines, and so on? This excerpt from Wikipedia explains the problem:

    Definition and ambiguity
    As is common in technical literature, the two words "white space" have found widespread usage as the single term "whitespace", especially when used as an adjective, as in "whitespace character". Some specifications refer to "white space" while others refer to "whitespace"; there is no difference between the terms, although exactly which characters are being referred to does vary from context to context. For example, the form feed character is "whitespace" in HTML, but is not "white space" in XML.

    The most common whitespace characters may be typed via the space bar or the Tab key. Depending on context, a line-break generated by the Return key (Enter key) may be considered whitespace as well.

    I think all the proposed solutions assume just the space character, not tab. And as Louis asks, is each whitespace character replaced by a pipe, or are successive whitespace characters replaced by a single pipe?
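
    To make the distinction concrete, here is a minimal sketch that counts plain spaces separately from everything .NET treats as white space. The class and method names are made up for illustration; char.IsWhiteSpace is the framework's own classification and may differ from the HTML or XML definitions quoted above.

```csharp
using System;
using System.Linq;

static class WhitespaceKinds
{
    // Counts only the plain space character (U+0020).
    public static int CountSpaces(string text)
    {
        return text.Count(c => c == ' ');
    }

    // Counts every character .NET classifies as white space
    // (space, tab, carriage return, line feed, and others).
    public static int CountWhitespace(string text)
    {
        return text.Count(c => char.IsWhiteSpace(c));
    }
}
```

    For the input "a b\tc\r\nd", CountSpaces returns 1 and CountWhitespace returns 4, so a plain Replace(' ', '|') would leave the tab and the line break untouched.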


    Regards David R
    ---------------------------------------------------------------
    The great thing about Object Oriented code is that it can make small, simple problems look like large, complex ones.
    Object-oriented programming offers a sustainable way to write spaghetti code. - Paul Graham.
    Every program eventually becomes rococo, and then rubble. - Alan Perlis
    The only valid measurement of code quality: WTFs/minute.

  • Thursday, April 12, 2012 11:56

     Answered Contains Code

    If the whitespace consists only of regular spaces, and you want to replace every single space with a pipe character, you can use string.Replace:

    string str = "a b  c   d";
    string result = str.Replace(' ', '|');
    
    // result = "a|b||c|||d";
    

    If your whitespace consists of spaces and other characters, and you want to replace every single whitespace character with a pipe character, you can use a regular expression:

    string str = "a b\t c   d";
    string result = Regex.Replace(str, @"\s", "|");
    
    // result = "a|b||c|||d";

    If you want to replace blocks of consecutive whitespace characters with a single pipe character, you can use a regular expression:

    string str = "a b\t c   d";
    string result = Regex.Replace(str, @"\s+", "|");
    
    // result = "a|b|c|d";
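
    Since the question is about a text file, the third pattern could be applied line by line. This is only a sketch: the class name PipeDelimiter and its methods are hypothetical, each line is trimmed first so that leading or trailing whitespace does not turn into a pipe, and the source and target are assumed to be different files because File.ReadLines reads lazily.

```csharp
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class PipeDelimiter
{
    // One or more consecutive whitespace characters.
    private static readonly Regex WhitespaceRun = new Regex(@"\s+");

    // Trims the line, then replaces each run of whitespace with a single pipe.
    public static string ConvertLine(string line)
    {
        return WhitespaceRun.Replace(line.Trim(), "|");
    }

    // Converts every line of sourcePath and writes the result to targetPath.
    // Assumption: sourcePath and targetPath are different files.
    public static void ConvertFile(string sourcePath, string targetPath)
    {
        File.WriteAllLines(targetPath,
            File.ReadLines(sourcePath).Select(line => ConvertLine(line)));
    }
}
```

    With this sketch, ConvertLine("  a b\t c   d  ") returns "a|b|c|d".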

    • Marked as Answer by deskcheck1 Friday, April 13, 2012 15:05
  • Friday, April 13, 2012 14:55



    Hi,

    This works if there's only ONE space between words.  The problem is that the number of spaces between words varies from 2 to 13 per row.

    I need to combine all the whitespace between words into just one space, so I can then replace each one with '|'.


    Marilyn Gambone

  • Friday, April 13, 2012 15:00

      Contains Code

    To reduce runs of several spaces to a single space, you can use this function (VB.NET):

        Private Function CleanMyString(ByVal value As String) As String
            While value.IndexOf("  ") > -1
                value = value.Replace("  ", " ")
            End While
            Return value
        End Function


    Hannes

    If you have got questions about this, just ask.

    In a perfect world,
    users would never enter data in the wrong form,
    files they choose to open would always exist
    and code would never have bugs.

    C# to VB.NET: http://www.developerfusion.com/tools/convert/csharp-to-vb/

  • Friday, April 13, 2012 15:00



    Hi,

    Replacing a space (" ") with "|" simply replaces every single space with a pipe delimiter, so I end up with multiple pipes wherever there are multiple spaces.

    I need only one pipe to replace all the spaces between two words, but I don't know how to merge the spaces into one.  The number of spaces is not consistent (it varies from 1 to 13).  This is output from a Fortran program, so I have no control over how the text files are generated.


    Marilyn Gambone

  • Friday, April 13, 2012 15:02
     
     

    Hi,

    I need to replace blocks of whitespace with one pipe.  But the number of whitespaces is not consistent (varies from 1 to 13 or more).

    Thanks.


    Marilyn Gambone

  • Friday, April 13, 2012 15:03



    Hi,

    Whitespace in my case means the spaces between words.  I've already removed the leading and trailing ones from each row.  Now the spaces between words in each row are the problem because they are not consistent: they vary from one to 13 or more.

    Marilyn


    Marilyn Gambone

  • Friday, April 13, 2012 15:05
     
     

    Hi,

    Thank you!  This is the solution to my problem. 


    Marilyn Gambone

  • Friday, April 13, 2012 15:07

      Contains Code

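    It's not the most efficient thing, but you could split the string on spaces and then use a StringBuilder to concatenate the pieces back together with a pipe between them.

    // Requires: using System; using System.Text;
    string initialText = "a b  c   d"; // sample input; replace with however you get the text
    // RemoveEmptyEntries drops the empty tokens produced by consecutive spaces
    string[] tokens = initialText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    StringBuilder sb = new StringBuilder();
    foreach (string token in tokens)
    {
      if (sb.Length > 0)
        sb.Append('|'); // one pipe between consecutive words
      sb.Append(token);
    }
    string result = sb.ToString(); // "a|b|c|d"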
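
    The same split-and-concatenate idea can be written more compactly with string.Join. A minimal sketch, assuming the separators are plain spaces only; the class and method names are made up for illustration:

```csharp
using System;

static class SplitJoin
{
    // Splits on spaces, drops the empty tokens that consecutive spaces produce,
    // and joins the remaining words with a single pipe.
    public static string JoinWithPipes(string text)
    {
        string[] tokens = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join("|", tokens);
    }
}
```

    JoinWithPipes("a b  c   d") returns "a|b|c|d". Leading and trailing spaces disappear as well, because they only produce empty tokens.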

  • Friday, April 13, 2012 15:16



    Isn't that what Thorsten and I (and surely someone else too) showed you?

    It is the same code as the one you marked as the answer!

    And by the way, you did not mention any duplicate whitespace in your first post.


    Mitja


  • Friday, April 13, 2012 15:23

     Proposed Answer Contains Code

    > I have whitespaces between words in a text file.  I need to replace them with '|' (pipe delimiter).  How do I do this?


    The answer proposed/marked above does not work properly with text such as "a b  c   d      ";
    the correct version is below:

    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;
    
    static void Transform(string sourceFile, string resultFile)
    {
        var re = new Regex(@"(?<=\b)\s+(?=\b)", RegexOptions.Compiled | RegexOptions.Singleline);
        var lines = File.ReadLines(sourceFile).Select(line => re.Replace(line, "|"));
        File.AppendAllLines(resultFile, lines);
    }
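
    To see how the two patterns in this thread differ on edge cases, here is a small comparison sketch. The class and method names are hypothetical; the patterns themselves are the ones posted above.

```csharp
using System.Text.RegularExpressions;

static class PatternComparison
{
    // Every run of whitespace becomes one pipe, including leading and trailing runs.
    public static string ReplaceRuns(string line)
    {
        return Regex.Replace(line, @"\s+", "|");
    }

    // Only whitespace that has a word boundary on both sides is replaced,
    // so whitespace next to punctuation or at the end of the line stays as it is.
    public static string ReplaceBetweenWordChars(string line)
    {
        return Regex.Replace(line, @"(?<=\b)\s+(?=\b)", "|");
    }
}
```

    For "a b  c   d      ", ReplaceRuns returns "a|b|c|d|" (note the trailing pipe), while ReplaceBetweenWordChars returns "a|b|c|d      " with the trailing spaces left in place. For "Hi!! this is", ReplaceBetweenWordChars returns "Hi!! this|is", because the space after "!!" is not next to a word boundary. Calling Trim() on the line before using the plain \s+ pattern is another way to avoid the trailing pipe.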
        
      
    • Edited by Malobukv Friday, April 13, 2012 15:34
    • Proposed as Answer by Malobukv Friday, April 13, 2012 15:36
  • Saturday, April 14, 2012 11:26

      Contains Code

    Hi,

    A simple way to collapse runs of two or more spaces into one is this:

    while (value.Contains("  "))
    {
        value = value.Replace("  ", " ");
    }
    

    Or 

    public static string ConvertWhitespacesToSingleSpaces(string strValue)
    {
        // Requires: using System.Text.RegularExpressions;
        strValue = Regex.Replace(strValue, @"\s+", " ");
        return strValue;
    }
    
    
    

    In the code above, value represents the input string.
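
    If neither the loop nor a regular expression is wanted, the collapsing can also be done in a single pass over the characters. This is a sketch under the assumption that any character for which char.IsWhiteSpace returns true counts as whitespace; the class and method names are made up for illustration.

```csharp
using System.Text;

static class WhitespaceCollapser
{
    // Replaces each run of whitespace with a single '|' in one pass.
    public static string CollapseToPipe(string input)
    {
        var sb = new StringBuilder(input.Length);
        bool inRun = false; // true while we are inside a run of whitespace

        foreach (char c in input)
        {
            if (char.IsWhiteSpace(c))
            {
                // Emit the pipe only for the first whitespace character of a run.
                if (!inRun)
                {
                    sb.Append('|');
                }
                inRun = true;
            }
            else
            {
                sb.Append(c);
                inRun = false;
            }
        }

        return sb.ToString();
    }
}
```

    CollapseToPipe("a b\t c   d") returns "a|b|c|d". Leading or trailing whitespace still produces a pipe at that end, so trim the input first if that is not wanted.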