How to take a text file containing IP addresses and server names and make it look like a host file using regular expressions

    Question

  • I have a text file that contains IP addresses, the server name, and a comment section, similar to this:

    192.168.1.1 MNHFCS100 # Router IP
    192.168.1.2 MNHFAQ100 # Comment

    How would I use regular expressions to find the three individual fields, then parse and trim them to look like the below?

    192.168.1.1 MNHFCS100 # Router IP
    192.168.1.2 MNHFAQ100 # Comment

    Monday, September 23, 2013 8:36 PM

All replies

  • Hi,

    This should work:

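    List<string[]> Results = new List<string[]>();
    foreach (string s in System.IO.File.ReadAllLines("Path"))
    {
        string[] Parts = s.Split(' ');
        // Skip blank or malformed lines that lack an IP and a name.
        if (Parts.Length < 2) continue;
        // Rejoin the rest with spaces so a multi-word comment keeps its spacing.
        Results.Add(new string[] { Parts[0], Parts[1], String.Join(" ", Parts, 2, Parts.Length - 2) });
    }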

    Results should be a list of all the lines, each an array with three strings in it: the IP, the name, and the Comment. After that, you can do whatever you want with it. Just remember to replace the stand-in path with the path to your file.

    EDIT:

    As a note, I don't see any difference between your two examples, so I am only going off of your description.
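
    A variant of the Split approach, in case the fields turn out to be separated by tabs or runs of spaces rather than single spaces. This is only a sketch: the HostLineSplitter class and SplitLine method are hypothetical names, not part of the code above.

    ```csharp
    using System;

    static class HostLineSplitter
    {
        // Hypothetical helper: splits one line into IP, name, and comment.
        // Returns null when the line has fewer than two fields.
        public static string[] SplitLine(string line)
        {
            // A null separator array means "split on any whitespace";
            // the count of 3 keeps the remainder of the line in the last element.
            string[] parts = line.Split((char[])null, 3, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length < 2) return null;
            string comment = parts.Length == 3 ? parts[2].Trim() : "";
            return new string[] { parts[0], parts[1], comment };
        }

        static void Main()
        {
            string[] fields = SplitLine("192.168.1.1\tMNHFCS100\t# Router IP");
            Console.WriteLine("{0} | {1} | {2}", fields[0], fields[1], fields[2]);
        }
    }
    ```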


    Wasabi Fan


    • Edited by Wasabi Fan Monday, September 23, 2013 11:40 PM
    Monday, September 23, 2013 11:39 PM
  • What's the difference between the input text file and the desired output?  They look identical in your post.

    Paul Linton

    Tuesday, September 24, 2013 12:05 AM
  • My apologies, I am a novice programmer.

    This is what I currently have:

    int counter = 0;
                string line;
    
                System.IO.StreamReader file = new System.IO.StreamReader("TextFile1.txt");
                while ((line = file.ReadLine()) != null)
                {
                    Console.WriteLine(line);
                    counter++;
                }
    
                string sPattern = "(\\d{1,3}\\.){3}\\d{1,3}";
    
    
                foreach (string s in file)
                {
                    System.Console.Write("{0,18}", s);
    
                    if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern))
                    {
    
                        System.Console.WriteLine(s);
                    }
                    else
                    {
                        System.Console.WriteLine(" - invalid");
                    }
    
                    file.Close();
    
                    Console.ReadLine();

    I get an error at the foreach statement.

    The text file looks similar to a hosts file.

    I just need to parse out the IP address, server name, and comment section.

    Tuesday, September 24, 2013 8:05 PM
  • You get an error? What is the error? Is it a runtime exception or a compile-time error? What line is it happening on? Please include the FULL error when you post it.
    Tuesday, September 24, 2013 8:22 PM
  • This is the error I receive:

    Error 1 foreach statement cannot operate on variables of type 'System.IO.StreamReader' because 'System.IO.StreamReader' does not contain a public definition for 'GetEnumerator' C:\Users\skochkr\Documents\Visual Studio 2010\Projects\regexConsole\regexConsole\Program.cs 37 13 regexConsole

    With this program, I am trying to read a text file and find the IP address, server name, and the comment following the #. If there is a match, I just need the matched part written to the console, without any other information.

    • Edited by rskochko Tuesday, September 24, 2013 8:35 PM
    Tuesday, September 24, 2013 8:29 PM
  • OK, we're going to switch up your work a little bit. The error explains itself. Are you confused about why you are getting that error?

    I'm getting rid of your first while loop, moving its lines into the foreach, and changing your foreach to a while statement. Try this:

    int counter = 0;
                string line;
    
                System.IO.StreamReader file = new System.IO.StreamReader("TextFile1.txt");            
    
                string sPattern = "(\\d{1,3}\\.){3}\\d{1,3}";
    
    
                while ((line = file.ReadLine()) != null)
                {
                    Console.WriteLine(line);
                    counter++;
                    System.Console.Write("{0,18}", line);
    
                    if (System.Text.RegularExpressions.Regex.IsMatch(line, sPattern))
                    {
    
                        System.Console.WriteLine(line);
                    }
                    else
                    {
                        System.Console.WriteLine(" - invalid");
                    }
    
                    file.Close();
    
                    Console.ReadLine();


    • Edited by tnw Tuesday, September 24, 2013 8:40 PM
    Tuesday, September 24, 2013 8:36 PM
  • Yeah, I'm still confused about why I got that error. Can you briefly explain?

    I tried the new code and I do not get an error, but the output is not what I expected.

    The console only gives me these 2 lines:

    # Copyright (c) 1993-2009 Microsoft Corp.
    # Copyright (c) 1993-2009 Microsoft Corp. - invalid

    It does not actually output the IP address from the regex pattern: string sPattern = "(\\d{1,3}\\.){3}\\d{1,3}";

    Also, when I press Enter to close the console, it throws this error on the first while statement:

    System.ObjectDisposedException {"Cannot read from a closed TextReader"}

    • Edited by rskochko Tuesday, September 24, 2013 8:55 PM
    Tuesday, September 24, 2013 8:47 PM
  • Because a StreamReader is not an enumerable collection of strings (enumerable meaning something you can iterate through with foreach). It is a StreamReader, and foreach only works on types that provide a GetEnumerator method, which is what the compiler error is telling you.

    As for your output, you call "file.Close()" while still iterating through the file. I think you want that line OUTSIDE of the loop, after it finishes. This is better practice, however:

    using (System.IO.StreamReader file = new System.IO.StreamReader("TextFile1.txt")) 
    {
       
        // .... do all the loopy stuff ....
    
    }

    This will close/dispose of the stream on its own and you do not have to call "file.Close()" at all.

    Your regex for an IP address looks fine, you just didn't get far enough into the file to read and output any IPs (because you were closing it too early).
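
    Put together, the pattern described above might look like the following sketch. The IpLineFilter class and MatchingLines method are hypothetical names, and a StringReader stands in for the file so the example runs without TextFile1.txt; swapping in new StreamReader("TextFile1.txt") should behave the same way.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text.RegularExpressions;

    static class IpLineFilter
    {
        // Same IP pattern as in the question.
        const string Pattern = @"(\d{1,3}\.){3}\d{1,3}";

        // Hypothetical helper: returns every line that contains an IP address.
        public static List<string> MatchingLines(TextReader reader)
        {
            List<string> hits = new List<string>();
            string line;
            // ReadLine returns null at the end of the input.
            while ((line = reader.ReadLine()) != null)
            {
                if (Regex.IsMatch(line, Pattern))
                {
                    hits.Add(line);
                }
            }
            return hits;
        }

        static void Main()
        {
            string sample = "# a comment line\n192.168.1.1 MNHFCS100 # Router IP\n";
            // The using block disposes the reader when it ends; no Close() call is needed.
            using (TextReader file = new StringReader(sample))
            {
                foreach (string hit in MatchingLines(file))
                {
                    Console.WriteLine(hit);
                }
            }
        }
    }
    ```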

    Don't forget to mark posts as helpful!


    Tuesday, September 24, 2013 9:00 PM
  • Perfect,

    Thank you!

    This was very informative and helpful!

    • Edited by rskochko Tuesday, September 24, 2013 9:07 PM
    Tuesday, September 24, 2013 9:07 PM
  • Excellent. Don't forget to read documentation. It seems a lot of problems here were caused by not understanding the code at hand. Documentation is your friend. 
    Tuesday, September 24, 2013 9:09 PM
  • I have an additional question, if you don't mind me asking.

    How do I get only the REGEX matches displayed? Right now, the else statement causes all the other lines to be displayed as well.

    Tuesday, September 24, 2013 9:10 PM
  • If you don't want the else statement to display all the other lines, get rid of it. Don't overthink it man, it's really that simple :D
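
    If the goal is to show only the text the regex matched rather than the whole line, Regex.Match exposes it through Match.Value. A rough sketch, where IpExtractor and ExtractIp are hypothetical names and not something from the code posted so far:

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    static class IpExtractor
    {
        // Hypothetical helper: returns the first IP-like substring, or null if none.
        public static string ExtractIp(string line)
        {
            Match m = Regex.Match(line, @"(\d{1,3}\.){3}\d{1,3}");
            return m.Success ? m.Value : null;
        }

        static void Main()
        {
            string[] lines = {
                "# Copyright (c) 1993-2009 Microsoft Corp.",
                "10.209.83.49\tBCKXNSP101\t# comment"
            };
            foreach (string line in lines)
            {
                string ip = ExtractIp(line);
                // With no else branch, non-matching lines print nothing.
                if (ip != null)
                {
                    Console.WriteLine(ip);
                }
            }
        }
    }
    ```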
    Tuesday, September 24, 2013 9:15 PM
  • There is always more than one Regex method that works. Take your pick. I'd like to put in my 2 cents on this one.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using System.Text.RegularExpressions;
    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                List<List<string>> results = new List<List<string>>();
                string input;
                // Read the whole file; "using" disposes the reader even if an exception is thrown.
                using (StreamReader reader = new StreamReader(@"c:\temp\Regex.txt"))
                {
                    input = reader.ReadToEnd();
                }
                Regex expr = new Regex(@"(?'IP'[\d\.]{7,15})\s+(?'Server'\w+)\s*#\s*(?'Comment'.*)$",
                   RegexOptions.Multiline);
                MatchCollection matches = expr.Matches(input);
                foreach (Match match in matches)
                {
                    Console.WriteLine("IP : {0};    Server : {1};   Comment : {2}",
                        match.Groups["IP"].Value,
                        match.Groups["Server"].Value,
                        match.Groups["Comment"].Value.Trim());
                    results.Add(new List<string>() {
                        match.Groups["IP"].Value,
                        match.Groups["Server"].Value,
                        match.Groups["Comment"].Value.Trim()
                    });
     
                }
            }
        }
    }

    Tuesday, September 24, 2013 9:23 PM
  • Wow, this works really well.

    The only issue is that when I run it, I only get one line of output. It is as if it stops at that point and does not search for other matches.

    I am using this hosts file as an example:

    # Copyright (c) 1993-2009 Microsoft Corp.
    #
    # This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
    #
    # This file contains the mappings of IP addresses to host names. Each
    # entry should be kept on an individual line. The IP address should
    # be placed in the first column followed by the corresponding host name.
    # The IP address and the host name should be separated by at least one
    # space.
    #
    # Additionally, comments (such as these) may be inserted on individual
    # lines or following the machine name denoted by a '#' symbol.
    #
    # For example:
    #
    #      102.54.94.97     rhino.acme.com          # source server
    #       38.25.63.10     x.acme.com              # x client host
    
    # localhost name resolution is handled within DNS itself.
    #	127.0.0.1       localhost
    #	::1             localhost
    
    

    Tuesday, September 24, 2013 9:33 PM
  • You don't have a valid host file.  There are no entries since lines starting with a pound sign are comments.


    jdweng

    Wednesday, September 25, 2013 1:11 AM
  • My apologies, I copied the wrong file.

    Here is approximately what it looks like:

    # Copyright (c) 1993-2009 Microsoft Corp.
    #
    # This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
    #
    # This file contains the mappings of IP addresses to host names. Each
    # entry should be kept on an individual line. The IP address should
    # be placed in the first column followed by the corresponding host name.
    # The IP address and the host name should be separated by at least one
    # space.
    #
    # Additionally, comments (such as these) may be inserted on individual
    # lines or following the machine name denoted by a '#' symbol.
    #
    # For example:
    #
    #      102.54.94.97     rhino.acme.com          # source server
    #       38.25.63.10     x.acme.com              # x client host
    
    # localhost name resolution is handled within DNS itself.
    #	127.0.0.1       localhost
    #	::1             localhost
    ### BEGIN LAB MANAGER CHANGES ###
    # BA 5.5.5 PL
    10.209.83.49	BCKXNSP101	# Some comment, comment
    10.210.82.50	BCKXNSP100	# some comment
    10.208.82.56	BCKXNSP103	# comment
    10.254.82.49	BCKXNSP105	# comment
    10.200.84.55	BCKXNSP110	# sd s& commment 
    10.202.84.56	BCKXNSP101	# comment
    10.200.85.140	BCKXNSP105	# commemn Server
    11.206.85.139	BCKXNSP107	# some comment
    ### END LAB MANAGER CHANGES ###

    Wednesday, September 25, 2013 3:14 AM
  • I made the match pattern more robust to handle cases not in your test data, such as a line without a comment or an IP that does not start in column 1.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using System.Text.RegularExpressions;
    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                List<List<string>> results = new List<List<string>>();
                Regex expr = new Regex(
                     @"^[^0-9\#][^\d]*(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*#\s*(?'Comment'.*)$" +
                     @"|^(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*#\s*(?'Comment'.*)$" +
                     @"|^[^0-9\#][^\d]*(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*$" +
                     @"|^(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*$");
                //Regex expr = new Regex(@"^[^#].*");
                StreamReader reader = new StreamReader(@"c:\temp\Regex.txt");
                while (!reader.EndOfStream)
                {
                    string input = reader.ReadLine();
                    Match match = expr.Match(input);
                    if (match.Success)
                    {
                        Console.WriteLine("IP : {0};    Server : {1};   Comment : {2}",
                            match.Groups["IP"].Value,
                            match.Groups["Server"].Value,
                            match.Groups["Comment"].Value.Trim());
                        results.Add(new List<string>() {
                            match.Groups["IP"].Value,
                            match.Groups["Server"].Value,
                            match.Groups["Comment"].Value.Trim()
                        });
                    }
                }
                reader.Close();
            }
        }
    }
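
    The four alternatives in the pattern above can probably be folded into a single expression with an optional comment group. The following is a sketch under that assumption, not jdweng's pattern: HostEntryParser and Parse are hypothetical names, and unlike the original it only tolerates leading whitespace (not arbitrary non-digit characters) before the IP.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    static class HostEntryParser
    {
        // Optional leading whitespace, an IPv4-looking address, a name,
        // and an optional "# comment" tail.
        static readonly Regex Entry = new Regex(
            @"^\s*(?<IP>(?:\d{1,3}\.){3}\d{1,3})\s+(?<Server>[^\s#]+)\s*(?:#\s*(?<Comment>.*))?$");

        // Hypothetical helper: returns { IP, Server, Comment } or null for
        // comment lines, blank lines, and anything else that does not match.
        public static string[] Parse(string line)
        {
            Match m = Entry.Match(line);
            if (!m.Success) return null;
            return new string[] {
                m.Groups["IP"].Value,
                m.Groups["Server"].Value,
                m.Groups["Comment"].Value.Trim()   // empty when there is no comment
            };
        }

        static void Main()
        {
            string[] lines = {
                "# BA 5.5.5 PL",
                "10.209.83.49\tBCKXNSP101\t# Some comment, comment",
                "  10.210.82.50 BCKXNSP100"
            };
            foreach (string line in lines)
            {
                string[] f = Parse(line);
                if (f != null)
                {
                    Console.WriteLine("IP : {0};    Server : {1};   Comment : {2}", f[0], f[1], f[2]);
                }
            }
        }
    }
    ```

    Lines beginning with # never match here because the pattern requires a digit right after any leading whitespace.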


    jdweng

    Wednesday, September 25, 2013 8:53 AM
  • Thank you for all the details that you have included.

    I figured out some other issues that I was getting.

    Thank you again, very helpful!

    • Edited by rskochko Wednesday, September 25, 2013 7:49 PM
    Wednesday, September 25, 2013 7:16 PM
  • This probably sounds like a dumb question, but I'll ask it anyway.

    Instead of writing the data to the console, how would you write it to an external file in a location such as c:\temp\Regex2.txt?

    Wednesday, September 25, 2013 8:06 PM
  • Something like this might work for you ...

                    // Match an IP, a server name, and the rest of the line as the comment.
                    Regex nonHashLines = new Regex("^(?<IP>(\\d+\\.){3}\\d+)\\s*(?<serverNom>[A-Z0-9]+)\\s*(?<comment>.*)$");
                    Match zzz;
                    string dataLine = "10.209.83.49	BCKXNSP101	# Some comment, comment ";
    
                    // for each line that comes in acquire the data if any
                    zzz = nonHashLines.Match(dataLine);
    
                    // if data was acquired, this is one way it might be accessed
                    if (zzz.Success) {
    //                    string IP = zzz.Groups["IP"].Value;
    //                    string serverName = zzz.Groups["serverNom"].Value;
    //                    string comment = zzz.Groups["comment"].Value;
                        string output = zzz.Groups["IP"].Value + " " + zzz.Groups["serverNom"].Value + " " + zzz.Groups["comment"].Value;
    
                        // append (second argument true) so earlier lines are kept
                        using (StreamWriter wrtr = new StreamWriter("c:\\temp\\Regex2.txt", true)){
                            wrtr.WriteLine(output);
                        }
                    }


    • Edited by Lincoln_MA Wednesday, September 25, 2013 8:50 PM
    Wednesday, September 25, 2013 8:49 PM
  • I made the output file CSV so the fields are separated by commas.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using System.Text.RegularExpressions;
    namespace ConsoleApplication1
    {
        class Program
        {
            const string inputFilename = @"c:\temp\Regex.txt";
            const string outputFilename = @"c:\temp\Regex.csv";
            static void Main(string[] args)
            {
                List<List<string>> results = new List<List<string>>();
                Regex expr = new Regex(
                     @"^[^0-9\#][^\d]*(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*#\s*(?'Comment'.*)$" +
                     @"|^(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*#\s*(?'Comment'.*)$" +
                     @"|^[^0-9\#][^\d]*(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*$" +
                     @"|^(?'IP'[\d\.]*)\s+(?'Server'\w+)\s*$");
                //Regex expr = new Regex(@"^[^#].*");
                StreamReader reader = new StreamReader(inputFilename);
                StreamWriter writer = new StreamWriter(outputFilename);
                while (!reader.EndOfStream)
                {
                    string input = reader.ReadLine();
                    Match match = expr.Match(input);
                    if (match.Success)
                    {
                        writer.WriteLine("{0},{1},\"{2}\"",
                            match.Groups["IP"].Value,
                            match.Groups["Server"].Value,
                            match.Groups["Comment"].Value.Trim());
                        results.Add(new List<string>() {
                            match.Groups["IP"].Value,
                            match.Groups["Server"].Value,
                            match.Groups["Comment"].Value.Trim()
                        });
                    }
                }
                reader.Close();
                writer.Flush();
                writer.Close();
            }
        }
    }
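
    One caveat with the CSV line written above: a comment that itself contains a double quote would break the quoting. A possible way to handle that, following the common CSV convention of doubling embedded quotes, is sketched here. CsvRow, CsvField, and CsvLine are hypothetical names used only for illustration.

    ```csharp
    using System;

    static class CsvRow
    {
        // Hypothetical helper: wraps a value in quotes and doubles any embedded quotes.
        public static string CsvField(string value)
        {
            return "\"" + value.Replace("\"", "\"\"") + "\"";
        }

        // Builds one CSV row from the three captured fields.
        public static string CsvLine(string ip, string server, string comment)
        {
            return ip + "," + server + "," + CsvField(comment);
        }

        static void Main()
        {
            Console.WriteLine(CsvLine("10.209.83.49", "BCKXNSP101", "Some comment, comment"));
            Console.WriteLine(CsvLine("10.200.84.55", "BCKXNSP110", "the \"lab\" box"));
        }
    }
    ```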


    jdweng

    Wednesday, September 25, 2013 9:22 PM
  • I have one last question, if you are willing. As a bonus, is there a way to make everything line up uniformly?
    For example:

    In my host file, the IP addresses vary in length, as do the server names and comments. Is there a way to space everything out evenly so that when I run the program, the output looks neat? That is, if the IP address does not have all twelve digits, the program pads it with, let's say, 0's to make everything even.

    This is not necessary. I was just wondering because it seems like it would be difficult to do, but then again I don't know much... yet. The desired output would look like so:

    IP : 010.209.083.049;    Server : ABCDE100;      Comment : comment_here 
    IP : 010.189.008.023;    Server : ABCDE100;	 Comment : comment_here
    and so on...
    
    instead of 
    
    IP : 10.209.83.49;       Server : ABCDE100;     Comment : comment_here
    IP : 10.189.8.23;       Server : ABCDE100;     Comment : comment_here
    and so on... (The spacing is off in the second one, the current set up)
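
    The padding described above could be sketched roughly as follows, assuming every address is a dotted IPv4 string. HostFormatter, PadIp, and FormatRow are hypothetical names, and the column widths (15 and 12) are arbitrary choices for illustration.

    ```csharp
    using System;

    static class HostFormatter
    {
        // Hypothetical helper: pads each octet to three digits,
        // e.g. 10.189.8.23 becomes 010.189.008.023.
        public static string PadIp(string ip)
        {
            string[] octets = ip.Split('.');
            for (int i = 0; i < octets.Length; i++)
            {
                // "D3" formats an integer with at least three digits, adding leading zeros.
                octets[i] = int.Parse(octets[i]).ToString("D3");
            }
            return string.Join(".", octets);
        }

        // A negative alignment such as {1,-12} left-justifies the value in a 12-character column.
        public static string FormatRow(string ip, string server, string comment)
        {
            return string.Format("IP : {0,-15};    Server : {1,-12};   Comment : {2}",
                PadIp(ip), server, comment);
        }

        static void Main()
        {
            Console.WriteLine(FormatRow("10.209.83.49", "ABCDE100", "comment_here"));
            Console.WriteLine(FormatRow("10.189.8.23", "ABCDE100", "comment_here"));
        }
    }
    ```

    Some tools parse octets with leading zeros as octal, so padded addresses are probably best kept for display only and not written back to a real hosts file.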



    • Edited by rskochko Thursday, September 26, 2013 7:36 PM
    Wednesday, September 25, 2013 11:52 PM
  • Hi,

    I suggest that you start a new thread, since this question is different from your original one.

    I think more people would help you there.

    Thanks and good luck!


    Caillen

    Tuesday, October 01, 2013 1:30 AM